I’m loading a lot of objects with dicts containing large string values. In total, the program’s memory use exceeds 2 GB and it crashes. It doesn’t exceed the limit by much, but I may well have even more data later.
It seems that 32-bit Python can’t address more memory than that. I suppose that in the future I’ll need some object database system that can handle large data without being too slow (i.e. store it in a DB or on the hard drive, but keep some of it in memory for speed). For performance reasons I don’t want to keep the data only in MySQL; I’d rather have some transparent mechanism that keeps as much as possible in memory.
Can you think of a good way to deal with so much data in objects?
Depending on how complex your data structure is, take a look at these:
memcached
Key-value store, damn fast (‘O(1) everything’), scales to many machines, intended for caching (not persistent). There are add-on solutions for persisting and reloading the data, and there is even memcachedb, a persistent variant.
mongoDB
JSON-like document store, can have indexes on fields other than the primary key, scales to many machines, has auto-sharding and auto-failover, persistent. Supports very fast inserts, atomic ops, and a sort of built-in map-reduce for complex queries.
redis
Key-value store, values can be structured. Has many advanced operations, atomic ops, pub/sub, master-slave replication. Operates entirely in RAM but has limited persistence mechanisms.
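To make the "keep as much as possible in memory, the rest on disk" idea concrete, here is a minimal sketch using only the Python standard library. It is not how any of the stores above work internally, and it is not a replacement for them: `shelve` stands in for the persistent backend, and the `CachedStore` class, its `max_items` parameter and the `demo` function are hypothetical names made up for this illustration. Keys must be strings because `shelve` requires that.

```python
import os
import shelve
import tempfile
from collections import OrderedDict


class CachedStore:
    """Dict-like store: hot items stay in RAM (LRU), everything is persisted on disk.

    Hypothetical illustration; shelve is a stand-in for a real key-value store.
    """

    def __init__(self, path, max_items=1000):
        self._db = shelve.open(path)   # on-disk key-value store (stdlib)
        self._cache = OrderedDict()    # in-memory LRU cache
        self._max_items = max_items    # assumed cap on objects kept in RAM

    def __setitem__(self, key, value):
        self._db[key] = value          # write through to disk
        self._remember(key, value)

    def __getitem__(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)   # mark as most recently used
            return self._cache[key]
        value = self._db[key]          # cache miss: read from disk (KeyError if absent)
        self._remember(key, value)
        return value

    def _remember(self, key, value):
        self._cache[key] = value
        self._cache.move_to_end(key)
        while len(self._cache) > self._max_items:
            self._cache.popitem(last=False)   # evict the least recently used item

    def cached_keys(self):
        return list(self._cache)

    def close(self):
        self._db.close()


def demo():
    tmpdir = tempfile.mkdtemp()
    store = CachedStore(os.path.join(tmpdir, "objects"), max_items=2)
    store["a"] = {"text": "x" * 10}
    store["b"] = {"text": "y" * 10}
    store["c"] = {"text": "z" * 10}   # cache is full, so "a" is dropped from RAM
    in_ram = store.cached_keys()
    value_a = store["a"]              # transparently reloaded from disk
    store.close()
    return in_ram, value_a
```

The calling code only sees `store[key]`; whether the value came from RAM or from disk is hidden. With one of the servers above, the `shelve` calls would be replaced by that store's client library, and the data would also live outside the 32-bit Python process.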
Consider rephrasing your question’s title; something like “What in-memory database to choose” would be more informative.