Jakob Østergaard presented this challenge:
Write a program that reads text from standard-input, and returns (prints) the total number of distinct words found in the text.
How can we meet this challenge with parallel programming (preferably in Go, but a description in English will suffice)?
There are several possibilities, but I guess you mean “efficiently”?
The general idea would be to split the text into manageable chunks, pile those chunks into a queue, and have multiple consumers handle the chunks.
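This looks like a typical Map/Reduce application to me: a Splitter feeds chunks to the Workers through queues, and the Workers send their results to an Aggregator.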
Ideally, the “multiple” queues would be a single queue with multiple consumers, so that if one worker slows down, the whole process does not slow down as much.
I would also use a signal from the Splitter to the Workers to let them know that the input has been fully consumed and that they can start sending their results to the Aggregator.