What is the best approach for creating MongoDB collection(s) that scale well and give the best read performance? These are our assumptions:
- A user has 100 entries/day. Entries are private to the user.
- We may have 200,000 users. So almost 100 * 200,000 = 20M entries a day.
- Users want to view an entry as soon as it is inserted.
- Users want to search their own entries even when the data is 3 months old. In 3 months, 20M * 90 = 1.8B entries.
- There are no updates. Only inserts and deletes.
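To make the scale concrete, here is the arithmetic from the assumptions above as a small Python sketch. It assumes "3 months" means 90 days; the constant names are only illustrative.

```python
# Back-of-the-envelope sizing from the stated assumptions.
ENTRIES_PER_USER_PER_DAY = 100
USERS = 200_000
RETENTION_DAYS = 90  # assumption: "3 months" taken as 90 days

entries_per_day = ENTRIES_PER_USER_PER_DAY * USERS
entries_retained = entries_per_day * RETENTION_DAYS
entries_per_user = ENTRIES_PER_USER_PER_DAY * RETENTION_DAYS

print(f"{entries_per_day:,} entries inserted per day")
print(f"{entries_retained:,} entries kept over the retention window")
print(f"{entries_per_user:,} entries kept per user")
```

So any single user's query only ever needs to touch about 9,000 of the retained documents, which is why keeping one user's entries together matters for read performance.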
Options we have in mind:
- Sharding based on user name: A .. D in one shard, etc. But it will still be very difficult to scale.
- Create one collection for each user. We know it is a drastic approach, but why not? We are not doing aggregation across users' data. Is there any limit on the number of collections in MongoDB?
Any suggestions will be appreciated.
Thanks.
One collection per user will not work, unfortunately, due to the limits on the number of namespaces you can have (24,000).
I think there are a few good directions to go. You are certainly going to want to use a shard key that distributes uniformly – username would be good. What are your concerns about its scalability?
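As a rough sketch of what sharding on username could look like, the snippet below builds the two admin commands involved (`enableSharding` and `shardCollection`). The database name `app`, the collection name `entries`, and the field name `username` are assumptions made for illustration, not something from the question. The commands are built as plain dictionaries so they can be inspected first; `apply_sharding` expects a pymongo `MongoClient` connected to a mongos.

```python
def shard_commands(db_name="app", coll_name="entries", key_field="username"):
    """Build the admin commands that shard one collection on the username field.

    All three default names are hypothetical placeholders.
    """
    namespace = f"{db_name}.{coll_name}"
    return [
        # Sharding has to be enabled on the database first.
        {"enableSharding": db_name},
        # Ranged shard key on the username; {key_field: "hashed"} is the
        # hashed alternative if usernames cluster on a few prefixes.
        {"shardCollection": namespace, "key": {key_field: 1}},
    ]


def apply_sharding(client, **kwargs):
    """Send the commands through a pymongo MongoClient connected to a mongos."""
    for command in shard_commands(**kwargs):
        client.admin.command(command)


if __name__ == "__main__":
    for command in shard_commands():
        print(command)
```

With a key like this, MongoDB splits and moves chunks between shards itself, so you do not have to assign fixed ranges such as A .. D to a shard by hand.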
You may want to check out TTL (Time to Live) collections, as well as Read preference to let your application read from secondaries. This can speed up query times by distributing workload.
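A minimal sketch of both ideas, under some assumptions: each entry has a date field called `created_at`, the collection is called `entries`, and entries should expire after 90 days (all three are placeholders, adjust them to your schema). The first function builds a `createIndexes` command for a TTL index; the second builds a connection string that sets `readPreference=secondaryPreferred`.

```python
from urllib.parse import urlencode

# Assumption: keep entries for 90 days, matching the "3 months" requirement.
RETENTION_SECONDS = 90 * 24 * 60 * 60


def ttl_index_command(coll_name="entries", date_field="created_at"):
    """Build a createIndexes command for a TTL index on a date field.

    MongoDB deletes a document once the value in date_field is older than
    expireAfterSeconds. The field must hold a BSON date for this to work.
    """
    return {
        "createIndexes": coll_name,
        "indexes": [
            {
                "key": {date_field: 1},
                "name": f"{date_field}_ttl",
                "expireAfterSeconds": RETENTION_SECONDS,
            }
        ],
    }


def secondary_read_uri(hosts):
    """Build a connection string that prefers secondaries for reads."""
    options = urlencode({"readPreference": "secondaryPreferred"})
    return f"mongodb://{','.join(hosts)}/?{options}"


if __name__ == "__main__":
    print(ttl_index_command())
    print(secondary_read_uri(["host1:27017", "host2:27017"]))
```

With pymongo, the index command can be sent with `db.command(ttl_index_command())` and the URI passed to `MongoClient`. One caveat: secondaries can lag behind the primary, so if users must see an entry the moment it is inserted, you may want to read the most recent entries from the primary and use secondaries only for searches over older data.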