I am basically trying to write a multicore version of mapreduce just to see

Question

Asked: June 1, 2026

I am trying to write a multicore version of MapReduce, just to see whether I have understood the concept. I also want to learn threading in Python.

Let's say I have two chunks of text.

How do I process them simultaneously (say, tokenize them into words) using multiple threads?
I thought I understood the docs, but multithreaded code is the one part where one has to be very careful if it is to be efficient.
Any suggestions?

1 Answer

Editorial Team · Answer 1 · 2026-06-01T16:57:58+00:00

I suggest you try the multiprocessing module and the map() method of its Pool class. This will let you use multiple cores efficiently.
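As a rough illustration of that suggestion, here is a minimal word-count sketch. The function names (tokenize, merge_counts, word_count) and the sample chunks are made up for this example; only Pool and Counter come from the standard library. The map step runs in worker processes and the reduce step runs in the parent.

```python
from collections import Counter
from multiprocessing import Pool


def tokenize(chunk):
    """Map step: split one chunk into lowercase words and count them."""
    return Counter(chunk.lower().split())


def merge_counts(partial_counts):
    """Reduce step: merge the per-chunk counters into one total."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def word_count(chunks, processes=2):
    """Run the map step in worker processes, then reduce in the parent."""
    with Pool(processes=processes) as pool:
        partial_counts = pool.map(tokenize, chunks)
    return merge_counts(partial_counts)


if __name__ == "__main__":
    # Hypothetical input: two chunks of text, as in the question.
    chunks = ["the quick brown fox", "the lazy dog and the fox"]
    print(word_count(chunks).most_common(2))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start workers by spawning a fresh interpreter, the module is re-imported in each worker, and unguarded pool creation would fail.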

Python threading is not as efficient as it could be for CPU-bound work: in CPython, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads do not spread such work across cores. There is a threading module, but you are probably better off with the multiprocessing module for map/reduce sorts of problems.
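If you still want to see what the threaded version looks like, a sketch along these lines should work. It uses concurrent.futures.ThreadPoolExecutor, a higher-level standard-library wrapper over threading, rather than the threading module directly; tokenize and tokenize_in_threads are names invented for this example. It is mainly useful for learning the API, since the tokenizing itself is CPU-bound and gains little from threads.

```python
from concurrent.futures import ThreadPoolExecutor


def tokenize(chunk):
    """Split one chunk of text into words."""
    return chunk.split()


def tokenize_in_threads(chunks, max_workers=2):
    """Tokenize each chunk in a worker thread; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(tokenize, chunks))


if __name__ == "__main__":
    # Hypothetical input: two chunks of text, as in the question.
    print(tokenize_in_threads(["first chunk of text", "second chunk of text"]))
```

Threads tend to pay off when the work is I/O-bound (reading files, network calls), because a thread waiting on I/O does not hold up the others.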

Also, if you want to make sure you understand map/reduce, why not play with a real map/reduce system? Hadoop is a free-software map/reduce system, and it is possible to use Python with Hadoop:

http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
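To give a feel for the shape of such a program, here is a local sketch in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated text records and the framework sorts records by key in between. This is an assumption-laden simulation, not Hadoop itself: map_lines and reduce_lines are hypothetical names, and the built-in sorted() stands in for Hadoop's shuffle-and-sort phase.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(sorted_records):
    """Reducer: sum the counts of consecutive records that share a key."""
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    # Local simulation of map -> sort -> reduce on a made-up sample.
    sample = ["the quick fox", "the lazy dog"]
    for record in reduce_lines(sorted(map_lines(sample))):
        print(record)
```

The reducer relies on its input being sorted by key, which is why it only needs to compare each record with its neighbours instead of holding every word in memory.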
