I’m trying to copy some final result files from HDFS to S3. I want to use DistCp, but it seems to copy only entire folders, and I want only some of the files in a folder.
So I figure I need to move the files I want into a folder of their own, then upload that folder with DistCp. I understand that I should use FileSystem.rename(path1, path2) to do that.
So I’m trying this little test with a single file from Java:
Path itemsTable = new Path("hdfs://localhost/process-changes/itemstable-*");
itemsTable.getFileSystem(getConf()).mkdirs(new Path("hdfs://localhost/output"));
// Simple test: move just one file around HDFS via the Java API
boolean success = itemsTable.getFileSystem(getConf()).rename(
    new Path("hdfs://localhost/process-changes/itemtable-r-00001"),
    new Path("hdfs://localhost/output/itemtable-r-00001"));
But I always get false back from the rename(…) method.
Is this even the right way to do this? If so, any guess as to what I’m doing wrong?
Well, in the end this did work. I’m not quite sure why I was getting errors; I believe it was a series of small mistakes. The code above should work in general (if you’re reading this with the same problem as me). Have faith and just work through the minor issues.
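For anyone who wants to move every matching file rather than just one, here is a minimal sketch of how that might look. It assumes the Hadoop client libraries are on the classpath. The class name MoveMatchingFiles, the helper moveMatching, and the glob itemtable-* are my own illustrative choices and not anything from the code above. Since rename(…) can return false without saying why (for example, when the source path does not exist), the sketch checks each step and throws an exception naming the path involved.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MoveMatchingFiles {

    /**
     * Moves every path matching srcGlob into destDir and returns how many were moved.
     * Hypothetical helper for illustration; not part of the Hadoop API.
     */
    public static int moveMatching(FileSystem fs, Path srcGlob, Path destDir)
            throws IOException {
        // Make sure the destination directory exists before renaming into it.
        if (!fs.mkdirs(destDir)) {
            throw new IOException("Could not create " + destDir);
        }

        // globStatus expands the wildcard; it returns null or an empty array
        // when nothing matches.
        FileStatus[] matches = fs.globStatus(srcGlob);
        if (matches == null || matches.length == 0) {
            throw new IOException("No files match " + srcGlob);
        }

        int moved = 0;
        for (FileStatus status : matches) {
            Path src = status.getPath();
            Path dst = new Path(destDir, src.getName());
            // rename signals some failures by returning false, so turn that
            // into an exception that names the paths involved.
            if (!fs.rename(src, dst)) {
                throw new IOException("rename returned false for " + src + " -> " + dst);
            }
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        // Assumed paths, modeled on the ones in the question.
        Path srcGlob = new Path("hdfs://localhost/process-changes/itemtable-*");
        Path destDir = new Path("hdfs://localhost/output");
        FileSystem fs = srcGlob.getFileSystem(new Configuration());
        System.out.println("Moved " + moveMatching(fs, srcGlob, destDir) + " file(s)");
    }
}
```

If this throws "No files match", the first thing to compare is the spelling of the glob against the real file names, since a missing source is one of the cases where rename(…) just returns false.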