I’m converting a script to work with HDFS (Hadoop), and it contains this command:
tail -n+$indexedPlus1 $seedsDir/*url* | head -n$it_size > $it_seedsDir/urls
With HDFS I first need to fetch the file using -get, and this works:
bin/hadoop dfs -get $seedsDir/*url* .
However, I don’t know what the downloaded file will be called, and what I actually want is to store it in $local_seedsDir/url.
Is there a way to find out the name?
KISS tells me:
bin/hadoop dfs -get $seedsDir/*url* $local_seedsDir/urls
i.e. just name the local copy urls.
Then use tail and head to extract the actual file name from the url and store it in $urls.
But otherwise, just KISS.
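
Putting the two steps together, a rough sketch might look like the following. The function names slice_urls and fetch_and_slice are made up for illustration. It assumes the glob matches exactly one file in HDFS (with several matches, -get would expect a directory as the destination), and that -get will not overwrite an existing local file, hence the rm -f.

```shell
# Hypothetical helper: slice_urls SRC START COUNT DEST
# Writes COUNT lines of SRC, starting at 1-based line START, into DEST.
slice_urls() {
    tail -n "+$2" "$1" | head -n "$3" > "$4"
}

# Hypothetical wrapper: fetch_and_slice HDFS_GLOB LOCAL_DIR START COUNT DEST
# Fetches the seed file from HDFS under the fixed local name "urls",
# then runs the original tail | head pipeline on that local copy.
fetch_and_slice() {
    # Assumption: -get fails if the local target already exists.
    rm -f "$2/urls"
    # The glob is quoted so that hadoop, not the local shell, expands it.
    bin/hadoop dfs -get "$1" "$2/urls" || return 1
    slice_urls "$2/urls" "$3" "$4" "$5"
}
```

It would be called as fetch_and_slice "$seedsDir/*url*" "$local_seedsDir" "$indexedPlus1" "$it_size" "$it_seedsDir/urls". Because the local name is fixed, the script never needs to know what the file is called in HDFS.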