I have a Java application that needs to read and write files on HDFS. I use
FileSystem fs = FileSystem.get(configuration);
And it works well.
Now the question is: should I keep this reference and share it as a singleton, or should I use each instance only once and get a new one every time?
If it matters, the application is expected to handle fairly high traffic.
Thanks
I think the answer depends on how two numbers compare: the network bandwidth between the HDFS client and the HDFS cluster, and the amount of data per second that you can feed to the HDFS client. If the first is higher, then having a few connections open at the same time makes sense.
Usually 2-3 concurrent connections are optimal.
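To make the "a few connections at the same time" idea concrete, here is a rough sketch of capping the number of writes in flight with a fixed-size thread pool. It uses only the JDK: WriteTask is a hypothetical stand-in for whatever your application does against HDFS (for example, opening a stream on your FileSystem reference and copying bytes), and the names BoundedWriters and runAll are made up for illustration, not part of Hadoop. Treat the pool size as something to measure, not a fixed rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedWriters {

    /** Hypothetical stand-in for one HDFS write; not a Hadoop type. */
    interface WriteTask {
        void write() throws Exception;
    }

    /**
     * Runs every task with at most maxConcurrent writes in flight at once
     * and returns how many of them finished without throwing.
     */
    static int runAll(List<WriteTask> tasks, int maxConcurrent) {
        // The pool size is the cap on concurrent writes (e.g. 2 or 3).
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        List<Future<Object>> results = new ArrayList<>();
        for (WriteTask task : tasks) {
            Callable<Object> job = () -> {
                task.write();
                return null;
            };
            results.add(pool.submit(job));
        }
        pool.shutdown(); // accept no new work; queued tasks still run

        int succeeded = 0;
        for (Future<Object> result : results) {
            try {
                result.get(); // blocks until this write has finished
                succeeded++;
            } catch (ExecutionException e) {
                // This write threw; a real application would log or retry here.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                pool.shutdownNow();
                break;
            }
        }
        return succeeded;
    }
}
```

You would call something like BoundedWriters.runAll(tasks, 3) and compare throughput against 1, 2 and 4 to see where your own bandwidth limit sits.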