I am working on a research project in big data mining. I have written


Asked: June 7, 2026


I am working on a research project in big data mining. My current code organizes the data into a dictionary. However, the amount of data is so large that my computer runs out of memory while the dictionary is being built. I need to periodically write the dictionary to disk and start a new one, which leaves me with multiple dictionaries. I then need to compare those dictionaries, update the keys and values accordingly, and store the whole thing as one big dictionary on disk. Any idea how I can do this in Python? I need an API that can quickly write a dict to disk and then compare two dicts and update keys. I can write the code to compare two dicts myself, that's not a problem, but I need to do it without running out of memory.

My dict looks like this:
"orange": ["It is a fruit", "It is very tasty", ...]


1 Answer

Editorial Team · Answer 1 · 2026-06-07T07:44:15+00:00

I agree with Hoffman: go for a relational database. Data processing is a somewhat unusual task for a relational engine, but believe me, it is a good compromise between ease of use/deployment and speed for large datasets.
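As a rough illustration of what this could look like for the dict in the question, here is a minimal sketch that stores one row per (key, value) pair and flushes each partial dict to disk before it grows too large. The table name `entries` and the helpers `open_store`, `flush` and `lookup` are made up for this example, and the UNIQUE constraint assumes that duplicate sentences under the same key should be collapsed.

```python
import sqlite3


def open_store(path):
    """Open (or create) the on-disk store: one row per (key, value) pair."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        " key TEXT NOT NULL,"
        " value TEXT NOT NULL,"
        " UNIQUE (key, value))"  # assumption: repeated values are not wanted
    )
    return conn


def flush(conn, partial):
    """Write a partial dict {key: [values]} to disk, then empty it."""
    rows = ((k, v) for k, values in partial.items() for v in values)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO entries (key, value) VALUES (?, ?)", rows
        )
    partial.clear()  # free the memory held by this batch


def lookup(conn, key):
    """Return all values stored for a key, in insertion order."""
    cur = conn.execute(
        "SELECT value FROM entries WHERE key = ? ORDER BY rowid", (key,)
    )
    return [row[0] for row in cur]
```

With this layout the "compare and update" step largely disappears: every flush merges into the same table, so keys seen in earlier batches simply gain new rows.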

I usually use sqlite3, which ships with Python, though most often I access it through apsw. The advantage of a relational engine like sqlite3 is that you can have it do a lot of the processing on your data through joins and updates, and it takes care of all the required memory/disk swapping in quite a sensible manner. You can also use in-memory databases to hold small data that needs to interact with your big data, and link them through "ATTACH" statements. I have processed gigabytes this way.
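To make the ATTACH idea concrete, here is a minimal sketch using the standard sqlite3 module. It assumes a hypothetical table `entries(key, value)` in the main database; the schema name `scratch`, the table `watched` and the function `count_values_for` are likewise invented for illustration. A small key list is loaded into an attached in-memory database and joined against the large table, so the heavy lifting stays inside the engine.

```python
import sqlite3


def count_values_for(conn, wanted_keys):
    """Count stored values per key, for a small set of keys of interest.

    Assumes the main database has a table entries(key, value).
    """
    # Commit first: ATTACH/DETACH may be refused inside an open transaction.
    conn.commit()
    conn.execute("ATTACH DATABASE ':memory:' AS scratch")
    try:
        conn.execute("CREATE TABLE scratch.watched (key TEXT PRIMARY KEY)")
        conn.executemany(
            "INSERT OR IGNORE INTO scratch.watched (key) VALUES (?)",
            ((k,) for k in wanted_keys),
        )
        # Join the small in-memory table against the large on-disk one.
        cur = conn.execute(
            "SELECT e.key, COUNT(*) FROM entries AS e"
            " JOIN scratch.watched AS w ON w.key = e.key"
            " GROUP BY e.key ORDER BY e.key"
        )
        return cur.fetchall()
    finally:
        conn.commit()  # close the implicit transaction before detaching
        conn.execute("DETACH DATABASE scratch")
```

Calling `count_values_for(conn, ["orange", "apple"])` returns a list of `(key, count)` tuples for the keys that actually occur in `entries`. The in-memory table is discarded on DETACH, so nothing from the small dataset is written to disk.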
