Thursday 30 June 2016

Handling skewed data in Hive

Method 1:
Identify the skewed value and run the job for that value separately.
Ex: if cust_id=100 is causing the skew, split the records into cust_id=100 and cust_id!=100 and run the two jobs individually, as in the sketch below.
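
A minimal HiveQL sketch of this split, assuming illustrative tables orders and customers joined on cust_id and a hypothetical target table order_report:

-- Job 1: everything except the skewed key; this distributes normally across reducers.
INSERT INTO TABLE order_report
SELECT o.cust_id, o.order_id, c.cust_name
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id
WHERE o.cust_id != 100;

-- Job 2: only the skewed key; a much smaller job that can be tuned separately
-- (for example, forced into a map-side join).
INSERT INTO TABLE order_report
SELECT o.cust_id, o.order_id, c.cust_name
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id
WHERE o.cust_id = 100;
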
Method 2:
Identify the column causing the skew. If it is used in a join, try spreading the skew by joining on multiple columns instead of that column alone (see the sketch below).
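
For example (a sketch with illustrative table and column names), if both sides of the join also carry an order_date column, joining on (cust_id, order_date) instead of cust_id alone spreads a hot cust_id across more distinct reducer keys:

SELECT o.cust_id, o.order_id, p.payment_id
FROM orders o
JOIN payments p
  ON o.cust_id = p.cust_id
 AND o.order_date = p.order_date;

This only helps when the extra column is genuinely part of the join semantics; otherwise the result of the join changes.
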
Method 3:
Modify the join key (salting). A simple approach I sometimes follow is appending a small salt such as (key % 3) to the key, so key1 becomes multiple values like key1_1, key1_2, etc. This helps distribute the keys across reducers; a sketch follows.
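
A rough HiveQL sketch of the salting idea (table and column names are illustrative): on the big, skewed side the salt is derived from a second column so a single hot cust_id is spread across three buckets, and the small side is replicated once per salt value so every salted key still finds its match.

SELECT b.cust_id, b.order_id, s.cust_name
FROM (
  SELECT o.*,
         concat(cast(o.cust_id AS string), '_', cast(pmod(hash(o.order_id), 3) AS string)) AS salted_key
  FROM orders o          -- large, skewed side
) b
JOIN (
  SELECT c.*,
         concat(cast(c.cust_id AS string), '_', cast(t.salt AS string)) AS salted_key
  FROM customers c       -- small side, replicated 3x
  LATERAL VIEW explode(array(0, 1, 2)) t AS salt
) s
ON b.salted_key = s.salted_key;
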
Method 4:
Divide the data into smaller chunks (for example, by date range) and run the job on each chunk, as in the sketch below.
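
For instance (a sketch reusing the illustrative tables above), the same join can be run once per date range instead of over the whole table:

-- Chunk 1: January only; repeat (or script) the statement for the remaining ranges.
INSERT INTO TABLE order_report
SELECT o.cust_id, o.order_id, c.cust_name
FROM orders o
JOIN customers c ON o.cust_id = c.cust_id
WHERE o.order_date >= '2016-01-01' AND o.order_date < '2016-02-01';
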

Sunday 26 June 2016

Sqooping from/to MySQL, HDFS and Hive
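
For context, the commands below use a MySQL database named export with a table called exported. The real schema is not shown in these notes; the sketch below is only inferred from the column mappings and merge keys used later and may differ from the actual table.

CREATE TABLE exported (
  row_key INT PRIMARY KEY,
  value   VARCHAR(255),
  ts      TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);
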

Sqoop Import
From a MySQL table to HDFS

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --target-dir '/user/horton/exported' -m 1

From a MySQL table to Hive

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported3 -m 1

From a MySQL table to Hive with the Hive column types explicitly mapped

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported3 -m 1 --map-column-hive 'row_key=int,value=string,ts=timestamp'

From a MySQL table to Hive with custom field delimiters

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported2 -m 1 --fields-terminated-by ','

Free-form query import

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --query 'select * from exported where $CONDITIONS' --target-dir '/user/horton/test_query' -m 1

Incremental import into HDFS

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --target-dir '/user/horton/test_incremental' -m 1 --check-column ts --incremental lastmodified --last-value '2017-01-24 23:11:16.0' --merge-key row_key

Incremental import into Hive

sqoop import --hive-import --connect jdbc:mysql://127.0.0.1/export --username hive --password hive --driver com.mysql.jdbc.Driver --table exported --hive-table test.exported1 -m 1 --check-column ts --incremental lastmodified --last-value '2017-01-24 23:11:10.0' --merge-key row_key

Incremental import into HDFS from a query

sqoop import --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --query 'select * from exported where $CONDITIONS' --target-dir '/user/horton/test_incremental/t2' -m 1 --check-column ts --incremental lastmodified --last-value '2017-01-24 23:11:16.0' --merge-key row_key

Incremental import into Hive from a query is not supported.
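
One possible workaround (a sketch, assuming the incremental query import above lands comma-delimited text in /user/horton/test_incremental/t2 with the three-column layout used earlier) is to point an external Hive table at the HDFS directory instead of using --hive-import:

CREATE EXTERNAL TABLE test.exported_query (
  row_key INT,
  value STRING,
  ts TIMESTAMP
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/horton/test_incremental/t2';
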

Sqoop Export
Insert/export from HDFS

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --export-dir /apps/hive/warehouse/test.db/exported1  -m 1 --input-fields-terminated-by '\001'

Insert/export from Hive

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hcatalog-database test --hcatalog-table exported3 --input-fields-terminated-by '\001' -m 1

Update from HDFS

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --export-dir /apps/hive/warehouse/test.db/exported1  -m 1 --update-key row_key --input-fields-terminated-by '\001'

Using HCatalog

sqoop export --connect jdbc:mysql://127.0.0.1/export --username root --driver com.mysql.jdbc.Driver --table exported --hcatalog-database test --hcatalog-table exported -m 1

Update from Hive using HCatalog is not supported yet; it throws a Java exception, and there are open tickets for it.

A good reference for shell scripting: https://linuxcommand.org/lc3_writing_shell_scripts.php