I’m totally new to the jsr166y library, and I’ve written a routine using its fork/join framework that splits up a query and runs the pieces against database replicas concurrently. I’ve put a snippet below. The SelectTask extends RecursiveTask.
ForkJoinExecutor fjPool;
Future queryResultsFut = null;
for (int i = 1; i <= lastBatchNum; i++) {
    …
    SelectTask selectMatchesRecursiveTask = new SelectMatchesTask(loadBalancer.getDao(), thisRuleBatch, queryResults);
    queryResultsFut = fjPool.submit(selectMatchesRecursiveTask);
}
queryResultsFut.get();
The call to the get method is intended to block the parent thread until all query results are returned so that processing can commence on the aggregated results.
After some time running in a CI environment, I’ve discovered that this does not always happen. When a database is slow, the parent thread continues even though tasks are still running. This seems to me to contradict the documentation I read.
Perhaps I am going about this the wrong way? Should I extend ForkJoinTask instead of RecursiveTask?
You probably shouldn’t be using ForkJoin for this at all. The FJ framework is designed for CPU-intensive, non-blocking task parallelism, but you are using it for blocking tasks (external DB queries). I would suggest you use the normal executor framework for what you are trying to do.
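As a rough sketch of what that could look like with a plain ExecutorService: the class name, the queryFor helper, and the sleep that stands in for a slow database call are all made up for illustration, and your real DAO call would go in their place. The point is that invokeAll blocks until every submitted task has finished, rather than only the last one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ReplicaQueryRunner {

    // Hypothetical stand-in for one blocking query against a replica.
    // The sleep simulates a slow database; replace it with the real DAO call.
    static Callable<List<String>> queryFor(final int batchNum) {
        return () -> {
            Thread.sleep(20L * batchNum);
            return Collections.singletonList("batch-" + batchNum);
        };
    }

    static List<String> runAll(int lastBatchNum, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (int i = 1; i <= lastBatchNum; i++) {
                tasks.add(queryFor(i));
            }
            // invokeAll returns only after all tasks have completed,
            // and the futures come back in the same order as the tasks.
            List<Future<List<String>>> futures = pool.invokeAll(tasks);

            List<String> aggregated = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                // get() rethrows a task failure as ExecutionException.
                aggregated.addAll(f.get());
            }
            return aggregated;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(4, 2));
    }
}
```

A fixed-size pool is used here on the assumption that you want to cap the number of concurrent connections per replica; the pool size is something you would tune for your setup.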
The only aspect of FJ that matches your problem is the task decomposition. That, though, isn’t too difficult to roll by hand, either with a simple n-way division or with a more sophisticated recursive strategy.
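For the simple n-way division, something along these lines might do. BatchSplitter and split are hypothetical names, and the element type is left generic since I don't know what your rule batches actually contain.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSplitter {

    // Splits items into n contiguous parts whose sizes differ by at most one.
    // If there are fewer items than parts, the trailing parts are empty.
    static <T> List<List<T>> split(List<T> items, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<T>> parts = new ArrayList<>();
        int base = items.size() / n;   // minimum size of every part
        int extra = items.size() % n;  // the first 'extra' parts get one more
        int start = 0;
        for (int i = 0; i < n; i++) {
            int len = base + (i < extra ? 1 : 0);
            // Copy the slice so each part is independent of the source list.
            parts.add(new ArrayList<>(items.subList(start, start + len)));
            start += len;
        }
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> rules = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            rules.add(i);
        }
        System.out.println(split(rules, 3));
    }
}
```

Each part can then be handed to its own query task, one per replica or per pool thread.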