I read in an article called “Hands-on Cassandra” that Tokyo Cabinet is not good for big data. Why? How many bytes can TC store before it starts to perform badly? Is it possible to determine an approximate value?
Based on this article, there’s a confirmed performance degradation past 500 GB.
Based on this wide comparison of NoSQL databases, the problems in TC start above 20 million rows.
One possible cause of the size dependency is that TC seems to be implemented as a hash table: once the number of records grows well past the size of the bucket array, you run into hash key collisions, which of course ruins performance. By default, the key space is not as large as it can be, so you need to tune the “bnum” parameter (the number of elements of the bucket array) to increase performance.
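To get a rough feel for why bnum matters, here is a minimal sketch that estimates how many records land in each bucket on average. It is only a back-of-the-envelope model, not TC's actual internals. The function names are made up for illustration, and two figures are assumptions taken from my reading of the Tokyo Cabinet documentation rather than from the articles above: a default bnum of 131071 for the hash database, and a suggested bnum of roughly 0.5 to 4 times the number of records.

```python
# Rough model only: average records per bucket = records / bnum.
# DEFAULT_BNUM and the 0.5x-4x guideline are assumptions (see text above).
DEFAULT_BNUM = 131071


def records_per_bucket(num_records, bnum=DEFAULT_BNUM):
    """Average number of records that share one bucket (the load factor)."""
    return num_records / bnum


def suggested_bnum_range(num_records):
    """Assumed guideline: bnum between 0.5x and 4x the record count."""
    return (num_records // 2, num_records * 4)


if __name__ == "__main__":
    rows = 20_000_000  # the row count mentioned in the comparison above
    print("records per bucket, default bnum:", records_per_bucket(rows))
    low, high = suggested_bnum_range(rows)
    print("suggested bnum range:", low, "to", high)
    print("records per bucket, bnum = 2x rows:", records_per_bucket(rows, 2 * rows))
```

With an untuned bucket array, every lookup at 20 million rows has to sort through well over a hundred colliding records on average in this model, while a bnum sized to the data keeps that number below one. The exact point where things slow down still depends on record size, disk, and memory, so treat this as an estimate.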
Based on various comparisons, MongoDB seems to be the recommended approach for large datasets.