I have the following situation:
- about 10 threads which crawl the web for images
- all found images must somehow be returned to 10 other threads (for analyzing)
In other words, I want the found images to be processed concurrently by those 10 other threads.
Currently I have this singleton implementation of a custom list:
    public class ImageList extends Observable implements Iterable<Image> {
        private final BlockingQueue<Image> images = new LinkedBlockingQueue<Image>();

        private static class InstanceHolder {
            public static ImageList instance = new ImageList();
        }

        public static ImageList getInstance() {
            return InstanceHolder.instance;
        }

        private ImageList() {
        }

        public synchronized void execute(Image job) throws InterruptedException {
            images.put(job);
            new Thread(job).start();
            System.out.println("notify observers");
            this.setChanged();
            this.notifyObservers();
            System.out.println(this.countObservers());
        }

        @Override
        public Iterator<Image> iterator() {
            return images.iterator();
        }
    }
As soon as an image is found, I call ImageList.getInstance().execute(image). I do not like this solution, because every call starts a new thread, so there is no upper bound on the number of parallel threads (it might become thousands).
Another idea I had:
- pass an additional list imagesFound to all my crawlers, let them add all images into that list
- start 5 threads in the Main class which constantly check for new elements in imagesFound and process them
However, I do not like this solution either, because passing a list which is not really needed by the thread (but is just used to pass back found data) seems wrong to me. It might become 20 different lists if I want to search for 20 different kinds of information on a website.
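For concreteness, the second idea could be sketched roughly as follows. This is only a minimal sketch: the class name SharedQueueSketch, the method run, the use of plain strings instead of Image objects, and the stop marker are all assumptions made for illustration. The workers block on take() instead of constantly polling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class SharedQueueSketch {
    // Hypothetical stop marker ("poison pill"); not part of the original code.
    private static final String STOP = "__STOP__";

    /** Starts `workers` consumer threads, feeds them `items`, returns the number processed. */
    static int run(List<String> items, int workers) {
        BlockingQueue<String> imagesFound = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String image = imagesFound.take(); // blocks until an element arrives
                        if (STOP.equals(image)) {
                            return;                        // this worker is done
                        }
                        processed.incrementAndGet();       // stand-in for the real analysis
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }

        try {
            for (String item : items) {
                imagesFound.put(item);                     // what a crawler would do
            }
            for (int i = 0; i < workers; i++) {
                imagesFound.put(STOP);                     // one stop marker per worker
            }
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a.png", "b.png", "c.png"), 2));
    }
}
```

The number of analyzing threads is fixed here, but every crawler still needs a reference to the shared queue, which is exactly the part I dislike.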
So, how do you usually implement the return of data from threads (in my case, especially when this data itself has to be processed by other threads)?
Perhaps a thread pool? Check out ExecutorService.
Example:
…
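Independently of the example above, here is a minimal sketch of how a fixed-size pool could bound the number of analyzer threads. The names AnalyzerPoolSketch and analyzeAll are hypothetical, image names stand in for the Image objects, and taking the length of the name is only a placeholder for the real analysis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AnalyzerPoolSketch {
    /** Analyzes each image name on at most `poolSize` threads; results keep submission order. */
    static List<Integer> analyzeAll(List<String> imageNames, int poolSize) {
        ExecutorService analyzers = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String name : imageNames) {
                // A crawler would submit a task like this as soon as it finds an image.
                Callable<Integer> task = () -> name.length(); // placeholder analysis
                futures.add(analyzers.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // waits for that task to finish
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            analyzers.shutdown(); // no new tasks; pool threads exit once idle
        }
    }

    public static void main(String[] args) {
        System.out.println(analyzeAll(Arrays.asList("a.png", "bb.png"), 2));
    }
}
```

With this approach the crawlers only need a reference to the pool, and the pool's internal queue takes the place of a hand-written shared list. Tasks that arrive while all threads are busy simply wait in that queue.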