I’m trying to process a certain FTP directory which holds several directories and in turn those directories have an arbitrary number of files. So what I’m trying to do is to have 1 thread for each of the subdirectories and each thread to be concerned with the respective sub-dir here is what I came up:
private void fetchFilesFromFTP() {
try {
client.connect("ftp.ncbi.nih.gov");
client.login("anonymous", "anonymous");
client.changeWorkingDirectory("genomes/Fungi");
FTPFile dirs[] = client.listDirectories();
dirsToDl.addAndGet(dirs.length);
for (final FTPFile ftpFile : dirs) {
exec.execute(new Runnable() {
//process each FTP directory in a new thread
@Override
public void run() {
processFTPdir(ftpFile.getName());
}
});
}
} catch (SocketException ex) {
Logger.getLogger(FungiProcessor.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(FungiProcessor.class.getName()).log(Level.SEVERE, null, ex);
}
}
private void processFTPdir(String dir) {
File f = new File(destination + File.separator + dir);
if (!f.mkdirs()) {
System.out.println("Error creating dir for " + dir);
return;
}
FTPFile files[];
try {
//we are already in the correct directory
files = client.listFiles(dir, new FTPFileFilter() {
@Override
public boolean accept(FTPFile ftpf) {
return ftpf.getName().endsWith(".gbk");
}
});
for (FTPFile fTPFile : files) {
FileOutputStream fout = new FileOutputStream(destination + File.separator + dir + File.separator + fTPFile.getName());
if (client.retrieveFile(dir + "/" + fTPFile.getName(), fout)) {
System.out.println("successfully downloaded");
fout.flush();
fout.close();
}
System.out.println(client.getReplyString());
}
} catch (IOException ex) {
Logger.getLogger(FungiProcessor.class.getName()).log(Level.SEVERE, null, ex);
} finally {
if(dirsToDl.decrementAndGet() == 0) latch.countDown();
}
}
The sequential version of the code works – I can see files ending with .gbk actually being downloaded whereas with the multithreaded version only the respective subdirectories are created but no files are downloaded. I don’t even get any errors. Is is possible that FTP doesn’t support multiple file downloads at once?
A better way is to have one tread with its own client connect to the server and create a list of paths of the files you want to download. Then you start up a number of threads, each with its own client, that grabs the first from the list and starts downloading the file.
The number of concurrent connections that you can have to one ftp-server is limited by the settings of that server.