Whenever questions on SO and other sites ask “How do I write concurrent code?”, the answers always involve pretty vague explanations such as “check for data dependencies” or “interdependencies” inside the code. I’m wondering what these mysterious dependencies actually look like as Java code!?!
- What’s a concrete example of a section of code that can be easily parallelized because one part doesn’t have a dependency on another part?
- What’s a concrete example of a section of code that must be serial because of the existence of these dependencies?
- How does the existence of these dependencies weigh in on the decision whether to use a thread pool or not?
I guess I’m just not seeing the “forest for the trees” here. Thanks in advance!
A concurrent application is built on the execution of tasks. You can identify a task as any discrete unit of work. If you can identify parts of your code that stand out as discrete units of work with explicit task boundaries, you can create a `Runnable` for each task and run each one in a separate `Thread`, achieving parallelism. In fact, a `Runnable` represents the task abstraction.
An example would be a process that needs to load multiple files for processing. Loading each file is a discrete unit of work that does not need the result of any other load, so it can be done in the background by threads without blocking the main code flow.
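To make the file-loading case concrete, here is a minimal sketch. The class name `IndependentLoads` and the method `loadAll` are made up for illustration, and it assumes the files are small UTF-8 text files. The point is only that no load reads anything another load produces, so each one can get its own `Runnable` and `Thread`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class: one independent task per file.
final class IndependentLoads {

    // Starts one thread per file and waits for all of them to finish.
    static Map<Path, String> loadAll(List<Path> paths) throws InterruptedException {
        // Thread-safe map, because several threads write to it at once.
        Map<Path, String> contents = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();

        for (Path path : paths) {
            // The task only touches its own file: no dependency on other tasks.
            Runnable task = () -> {
                try {
                    byte[] bytes = Files.readAllBytes(path);
                    contents.put(path, new String(bytes, StandardCharsets.UTF_8));
                } catch (IOException e) {
                    // A failed load ends this thread only; the file is simply
                    // missing from the result map.
                    throw new UncheckedIOException(e);
                }
            };
            Thread thread = new Thread(task);
            threads.add(thread);
            thread.start();
        }

        // Wait until every load has finished before handing back the results.
        for (Thread thread : threads) {
            thread.join();
        }
        return contents;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Path> paths = new ArrayList<>();
        for (String arg : args) {
            paths.add(Paths.get(arg));
        }
        loadAll(paths).forEach((path, text) ->
                System.out.println(path + ": " + text.length() + " characters"));
    }
}
```

Running it with a few file names as arguments prints one line per file; the order of the lines is not guaranteed, which is exactly what you would expect from tasks that do not depend on each other.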
If Task 2 depends on the calculation/processing of Task 1, then Task 2 must wait for Task 1 to complete, thereby serializing the sequence.
An example is loading a file in the background and then searching for a key in that file: the search cannot start until the load has finished.
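A rough sketch of that load-then-search case follows. The names `DependentTasks` and `loadThenSearch` are invented for the example, and it again assumes a small UTF-8 text file. The dependency shows up in the code as a value: the search needs the `String` that the load produces, so it has to block on `get()` until the load is done, no matter how many threads are available.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical example class: Task 2 consumes the output of Task 1.
final class DependentTasks {

    static boolean loadThenSearch(Path file, String key)
            throws InterruptedException, ExecutionException {
        // Task 1: load the file in a background thread.
        Callable<String> load =
                () -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        FutureTask<String> loadTask = new FutureTask<>(load);
        new Thread(loadTask).start();

        // Task 2 needs Task 1's result. get() blocks until the load has
        // completed, so the two steps run one after the other.
        String text = loadTask.get();
        return text.contains(key);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java DependentTasks <file> <key>
        System.out.println(loadThenSearch(Paths.get(args[0]), args[1]));
    }
}
```

The background thread still buys something here (the caller could do unrelated work between `start()` and `get()`), but the search itself can never overlap with the load.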
They don’t. Creating a `Thread` is an expensive operation, and thread pools are a construct designed to reuse threads so as to avoid the overhead of creating new ones. With a thread pool you just pass tasks as `Runnable`s, and the pool is responsible for assigning each task to an existing thread or creating a new thread if needed. You can also configure policies through the pool, such as how many threads it may use and how excess tasks are queued or rejected. But thread pools are a tool for concurrent programs: you are expected to have already identified the tasks to submit to the pool for execution.
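As an illustration of "identify the tasks first, then hand them to a pool", here is a small sketch using the standard `ExecutorService` API. The class `PoolOfTasks`, the method `runAll`, and the squaring tasks are placeholders made up for the example. It uses `Callable` rather than `Runnable` only because `Callable` can return a value, which makes the result easy to print.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: the pool only runs tasks, it does not find them.
final class PoolOfTasks {

    // Runs already-identified, independent tasks on a fixed number of threads.
    static List<Integer> runAll(List<Callable<Integer>> tasks, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Integer> results = new ArrayList<>();
            // invokeAll returns the futures in the same order as the tasks
            // and waits until all of them have completed.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } finally {
            // Let the worker threads exit once the work is done.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            final int n = i;
            tasks.add(() -> n * n); // each task is independent of the others
        }
        // Five tasks share two reusable threads.
        System.out.println(runAll(tasks, 2)); // prints [1, 4, 9, 16, 25]
    }
}
```

Note that the pool size is a resource decision (how many threads to reuse), while whether the tasks may run side by side at all was decided earlier, when the tasks and their dependencies were worked out.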