I’m just starting to build a social site on DynamoDB.
I will have a fair amount of data that relates to each user, and I’m planning on putting it all into one table, e.g.:
- userid
- date of birth
- hair
- photo URLs
- specifics
etc – there could potentially be a few hundred attributes.
Questions:
- is there anything wrong with putting this amount of data into one table?
- how can I query that data? Could I run a query like “all members in this age range, with this hair color, in this location, who logged on at this time”, assuming all this data is contained in the table?
- if the table grows large and I’m running queries like the one above, would the read I/O cost be high? There might be a lot of entries in the table in the long run…
Thanks
No. You can’t query DynamoDB this way. You can only query by the primary key (the hash key, plus optionally a condition on a single range key). Scanning a table in DynamoDB is slow and costly because it reads every item, and it uses up read capacity, which will cause your other queries to hang.
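To make the cost difference concrete, here is a minimal in-memory sketch in plain Python. It is not the DynamoDB API; the `users` data and the `get_by_key` and `scan` helpers are made up for illustration, and the "items read" counter only stands in for consumed read capacity.

```python
# In-memory stand-in for a table keyed by userid (hypothetical data).
users = {
    "u1": {"hair": "brown", "age": 31},
    "u2": {"hair": "black", "age": 27},
    "u3": {"hair": "brown", "age": 45},
}

def get_by_key(table, user_id):
    """Key lookup: touches exactly one item."""
    return table.get(user_id), 1

def scan(table, predicate):
    """Scan: touches every item in the table, then filters."""
    items_read = 0
    matches = []
    for user_id, item in table.items():
        items_read += 1
        if predicate(item):
            matches.append(user_id)
    return matches, items_read

item, reads_for_get = get_by_key(users, "u2")
matches, reads_for_scan = scan(
    users, lambda it: it["hair"] == "brown" and it["age"] < 40
)
print(reads_for_get, reads_for_scan, matches)
```

The key lookup reads one item regardless of table size, while the filtered scan reads all of them even though only one matches, so its cost grows with the table.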
If you have a small number of attributes, you can easily create index tables for these attributes. But if you have more than a few, it becomes too complex.
Main Table:
Index Table for “hair”:
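As a rough illustration of the index-table idea, the sketch below models both tables as plain Python dictionaries. This is an assumption about the layout, not the real DynamoDB API: the main table is keyed by `userid`, the index table maps a hair value to the set of matching user IDs, and the names `main_table`, `hair_index`, `put_user` and `users_with_hair` are hypothetical.

```python
from collections import defaultdict

main_table = {}                 # userid -> attributes
hair_index = defaultdict(set)   # hair value -> set of userids

def put_user(user_id, attrs):
    """Write to the main table and keep the hair index in sync."""
    old = main_table.get(user_id)
    if old is not None:
        # Remove the stale index entry before writing the new one.
        hair_index[old["hair"]].discard(user_id)
    main_table[user_id] = attrs
    hair_index[attrs["hair"]].add(user_id)

def users_with_hair(color):
    """Look up user IDs in the index, then fetch each user by key."""
    return [
        {"userid": uid, **main_table[uid]}
        for uid in sorted(hair_index.get(color, ()))
    ]

put_user("u1", {"hair": "brown"})
put_user("u2", {"hair": "black"})
put_user("u1", {"hair": "black"})   # u1 changes hair color
print(users_with_hair("black"))
```

Every write now has to update two tables, and each additional indexed attribute adds another table to keep in sync, which is why this approach gets complex beyond a few attributes.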
You can check out Amazon SimpleDB, which indexes the other attributes as well and therefore allows the kind of queries you want. But it is limited in its scale and in its ability to support low latency.
You might also consider a combination of several data stores and tables as your requirements are different between your real time and reporting: