I am trying to export a partitioned Hive table into MySQL using Sqoop.
At first I tried:
$ sqoop export --connect jdbc:mysql://<server addr>/<db name> --username <user name> -P --table source_edge_daily --export-dir /path/to/table/<table name> --input-fields-terminated-by '\t' --verbose
The command fails with:
Open failed for file /path/to/table/<table name>/<partition name>, attempt to open a directory
When I point it at the partition directory instead:
$ sqoop export --connect jdbc:mysql://<server addr>/<db name> --username <user name> -P --table source_edge_daily --export-dir /path/to/table/<table name>/<partition name> --input-fields-terminated-by '\t' --verbose
the command fails with this stack trace:
at com.cloudera.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:100)
at com.cloudera.sqoop.mapreduce.CombineShimRecordReader.getCurrentKey(CombineShimRecordReader.java:43)
at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.getCurrentKey(CombineFileRecordReader.java:75)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.getCurrentKey(MapTask.java:452)
at org.apache.hadoop.mapreduce.MapContext.getCurrentKey(MapContext.java:57)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:189)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:668)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
I also tried re-creating the MySQL table without the partition keys, and creating a partitioned MySQL table. Everything fails with the same error message.
Sqoop currently doesn’t support exporting partitioned Hive tables; that support is still a work in progress. So your first approach (pointing --export-dir at the table's root directory) won’t work until this is solved.
The problem with specifying your partition directory directly is that you lose the partition information: Hive keeps the partition column values in the directory name, not in the data files. So you need to create a temporary MySQL table that doesn’t contain the partition columns and export the data into that table. Finally, insert into your real table from this temporary table, supplying the partition values yourself.
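As a rough sketch of that workaround, the three steps could look like the script below. It only prints the SQL statements and the sqoop command rather than running them. The staging table name, the partition column `dt` and its value are made up for illustration, the `<...>` placeholders are the ones from the question, and the INSERT assumes the partition column is the last column of the real MySQL table.

```shell
#!/bin/sh
# Sketch only: prints the statements and the command instead of running them.
# Placeholders in <...> come from the question and must be filled in by hand.

TARGET=source_edge_daily
STAGING=source_edge_daily_staging   # hypothetical staging table name
PART_COL=dt                         # hypothetical partition column
PART_VAL=2012-01-01                 # hypothetical value of the exported partition

# Step 1: staging table = real table minus the partition column.
create_staging_sql() {
  printf 'CREATE TABLE %s LIKE %s;\n' "$STAGING" "$TARGET"
  printf 'ALTER TABLE %s DROP COLUMN %s;\n' "$STAGING" "$PART_COL"
}

# Step 2: export a single partition directory into the staging table.
# Only flags already used in the question appear here.
export_cmd() {
  cat <<EOF
sqoop export --connect 'jdbc:mysql://<server addr>/<db name>' --username '<user name>' -P --table $STAGING --export-dir '/path/to/table/<table name>/<partition name>' --input-fields-terminated-by '\t'
EOF
}

# Step 3: copy into the real table, adding the partition value as a constant.
# Assumes the partition column is the LAST column of the real table.
copy_sql() {
  printf "INSERT INTO %s SELECT s.*, '%s' FROM %s s;\n" "$TARGET" "$PART_VAL" "$STAGING"
  printf 'DROP TABLE %s;\n' "$STAGING"
}

create_staging_sql
export_cmd
copy_sql
```

You would feed the output of steps 1 and 3 to the mysql client and run the step 2 command in between, repeating the whole thing once per partition with the matching directory and value.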