I am writing code to solve the following problem.
There are about 50 to 200 small files, and their total size is not very large. My goal is to load them concurrently into one table.
The code seems to work well, and it really is faster than using a single thread.
However, I am not sure about one thing: is it the best choice to open as many threads as there are files?
Also, I am using the simplest possible way to do multithreading.
Is that the best choice too?
Any suggestions or advice are appreciated!
    for (int i = 0; i < threads.length; ++i) {
        synchronized (threads[i]) {
            threads[i].run();
        }
    }
    for (int i = 0; i < threads.length; ++i) {
        synchronized (threads[i]) {
            try {
                threads[i].join();
            } catch (InterruptedException e1) {
                return;
            }
        }
    }
First thing: Threads are started with start(), not run(). Calling run() will make them run serially on the main thread.
Second thing: It's pointless to start 200 threads. You are only wasting time and system resources without getting any speedup (for CPU-bound work you cannot be faster by a factor larger than the number of processor cores). Just use an ExecutorService and queue up tasks using submit. A managed thread pool, for example a fixed-size pool sized to the number of available processors, runs all the queued tasks on a small set of reused threads.
Third thing: If you are just processing some files independently, you do not need any locks. In your example in particular, you are using locks to start and join threads, which has no (positive) effect.
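
The ExecutorService suggestion could look roughly like the sketch below. It is only an illustration under assumptions: the class name PooledLoader and the method loadFile are hypothetical stand-ins for whatever actually loads one file into the table, and sizing the pool to availableProcessors() is one reasonable default, not the only one (I/O-heavy work may benefit from a somewhat larger pool).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledLoader {

    // Hypothetical stand-in for "load one file into the table".
    // Here it just returns the length of the name as a fake row count.
    static int loadFile(String fileName) {
        return fileName.length();
    }

    // Submits one task per file to a fixed-size pool and sums the results.
    static int loadAll(List<String> fileNames)
            throws InterruptedException, ExecutionException {
        // Assumption: one thread per core is a sensible starting point.
        int poolSize = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (String name : fileNames) {
                Callable<Integer> task = () -> loadFile(name);
                results.add(pool.submit(task)); // queued, run by a pool thread
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } finally {
            pool.shutdown(); // no new tasks; lets the pool threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            files.add("file" + i + ".csv");
        }
        System.out.println("rows loaded: " + loadAll(files));
    }
}
```

Note that there is no synchronized block anywhere: each task works on its own file, and Future.get() takes over the role that join() had in the original loop.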
Edit: Probably the best solution in your case is to just implement a single-producer (to read from disk), multiple-consumer (to process the files) system.
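
A single-producer, multiple-consumer setup can be sketched with a BlockingQueue. This is a minimal sketch, not a definitive implementation: the class name ProducerConsumerLoader is hypothetical, the "files" are plain in-memory strings instead of real disk reads, and "processing" is just counting characters. The calling thread acts as the producer, and a sentinel value tells each consumer when to stop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class ProducerConsumerLoader {

    // Sentinel object telling a consumer to stop; compared by identity.
    private static final String POISON = new String("__DONE__");

    // Feeds the given contents through a bounded queue to `consumers` worker
    // threads and returns the total number of characters they processed.
    static int process(List<String> fileContents, int consumers)
            throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
        AtomicInteger processedChars = new AtomicInteger();

        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < consumers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String item = queue.take(); // blocks until data arrives
                        if (item == POISON) {
                            break;
                        }
                        // Stand-in for the real per-file processing.
                        processedChars.addAndGet(item.length());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start(); // start(), not run()
            workers.add(t);
        }

        // Producer: in real code this loop would read each file from disk.
        for (String content : fileContents) {
            queue.put(content); // blocks while the queue is full
        }
        // One sentinel per consumer so every worker terminates.
        for (int i = 0; i < consumers; i++) {
            queue.put(POISON);
        }
        for (Thread t : workers) {
            t.join();
        }
        return processedChars.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> contents = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            contents.add("row data for file " + i);
        }
        System.out.println("characters processed: " + process(contents, 4));
    }
}
```

The bounded queue keeps the reader from running far ahead of the workers, so at most a handful of file contents are held in memory at once.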