Sunday, 26 June 2016

Sqooping from/to MySQL and HDFS

Sqoop Import
From MySQL table to HDFS

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --target-dir '/user/horton/exported' -m 1
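
These examples assume a small MySQL table with the row_key, value and ts columns referenced later; the DDL below is a hypothetical reconstruction of that schema. With a single mapper, the import lands as one part-m-* text file, which you can inspect directly:

mysql -u root export -e "CREATE TABLE exported (row_key INT PRIMARY KEY, value VARCHAR(255), ts TIMESTAMP);"
hdfs dfs -cat /user/horton/exported/part-m-00000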

From MySQL table to Hive

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported3 -m 1
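
--hive-import creates the Hive table if it does not already exist. A quick sanity check from the Hive CLI (assuming it is on the path):

hive -e 'SELECT * FROM test.exported3 LIMIT 5;'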

From MySQL table to Hive with the column types explicitly defined

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported3 -m 1 --map-column-hive 'row_key=int,value=string,ts=timestamp'
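
--map-column-hive overrides Sqoop's default SQL-to-Hive type mapping (MySQL TIMESTAMP columns otherwise land as STRING). To confirm the resulting column types:

hive -e 'DESCRIBE test.exported3;'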

From MySQL table to Hive with custom field delimiters

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported2 -m 1 --fields-terminated-by ','
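
Without --fields-terminated-by, a Sqoop Hive import defaults to Hive's \001 (Ctrl-A) field delimiter; a comma makes the underlying files readable as CSV. Assuming the default HDP warehouse location:

hdfs dfs -cat /apps/hive/warehouse/test.db/exported2/part-m-00000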

Free-form query import

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --query 'select * from exported where $CONDITIONS' --target-dir '/user/horton/test_query' -m 1
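
A free-form query can also run with more than one mapper, but then Sqoop requires --split-by, and if the query is wrapped in double quotes the $CONDITIONS token must be escaped as \$CONDITIONS. For example (the target directory here is illustrative):

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --query "select * from exported where \$CONDITIONS" --split-by row_key --target-dir '/user/horton/test_query_parallel' -m 4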

Incremental import into HDFS

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --target-dir '/user/horton/test_incremental' -m 1 --check-column ts --incremental lastmodified --last-value '2017-01-24 23:11:16.0' --merge-key row_key
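
Rather than tracking --last-value by hand, a saved Sqoop job stores it in the Sqoop metastore and updates it after every run. A sketch with a made-up job name:

sqoop job --create incr_exported -- import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --target-dir '/user/horton/test_incremental' -m 1 --check-column ts --incremental lastmodified --merge-key row_key
sqoop job --exec incr_exported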

Incremental import into Hive

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username hive --password hive --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported1 -m 1 --check-column ts --incremental lastmodified --last-value '2017-01-24 23:11:10.0' --merge-key row_key
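
When running this manually, the next --last-value can be read back from the Hive side, for example:

hive -e 'SELECT max(ts) FROM test.exported1;'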

Incremental import into HDFS from Query

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --query 'select * from exported where $CONDITIONS' --target-dir '/user/horton/test_incremental/t2' -m 1 --check-column ts --incremental lastmodified --last-value '2017-01-24 23:11:16.0' --merge-key row_key

Incremental import into Hive from a query is not supported.

Sqoop Export
Insert/export from HDFS

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --export-dir /apps/hive/warehouse/test.db/exported1  -m 1 --input-fields-terminated-by '\001'
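
Sqoop export requires the target MySQL table to exist beforehand, and '\001' matches the default delimiter of a Hive text table. If the connection manager supports staging, --staging-table writes into a pre-created staging table with an identical schema (exported_stage here is hypothetical) and moves the rows to the target only if the whole job succeeds:

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --staging-table exported_stage --clear-staging-table --export-dir /apps/hive/warehouse/test.db/exported1 -m 1 --input-fields-terminated-by '\001'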

Insert/export from Hive

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hcatalog-database test --hcatalog-table exported3 --input-fields-terminated-by '\001' -m 1

Update from HDFS

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --export-dir /apps/hive/warehouse/test.db/exported1  -m 1 --update-key row_key --input-fields-terminated-by '\001'
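
By default --update-key only updates rows that already exist. With --update-mode allowinsert, Sqoop upserts instead; this needs connector support, so the sketch below drops --driver (which forces the generic JDBC manager) and lets Sqoop pick its built-in MySQL connection manager:

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --table exported --export-dir /apps/hive/warehouse/test.db/exported1 -m 1 --update-key row_key --update-mode allowinsert --input-fields-terminated-by '\001'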

Using HCatalog

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hcatalog-database test --hcatalog-table exported -m 1
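
With --hcatalog-table, Sqoop reads through the table's own storage format and SerDe, so no delimiter options are needed. The exported rows can then be verified on the MySQL side:

mysql -u root export -e "SELECT * FROM exported LIMIT 5;"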

Update from Hive using HCatalog is not supported yet; it fails with a Java exception, and there are open tickets for it.

