Since the Google App Engine Datastore is based on Bigtable and we know that’s not a relational database, how do you design a database schema/data model for applications that use this type of database system?
Share
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
No related questions found
Designing a bigtable schema is an open process, and basically requires you to think about:
GAE’s datastore automatically denormalizes your data. That is, each index contains a (mostly) complete copy of the data, and thus every index adds significantly to time taken to perform a write, and the storage space used.
If this were not the case, designing a Datastore schema would be a lot more work: You would have to think carefully about the primary key for each type, and consider the effect of your decision on the locality of data. For example, when rendering a blog post you would probably need to display the comments to go along with it, so each comment’s key would probably begin with the associated post’s key.
With Datastore, this is not such a big deal: The query you use will look something like ‘Select * FROM Comment WHERE post_id = N.’ (If you want to page the comments, you would also have a limit clause, and a possible suffix of ‘ AND comment_id > last_comment_id’.) Once you add such a query, Datastore will build the index for you, and your reads will be magically fast.
Something to keep in mind is that each additional index creates some additional cost: it is best if you can use as few access patterns as possible, since it will reduce the number of indices GAE will construct, and thus the total storage required by your data.
Reading over this answer, I find it a little vague. Maybe a hands-on design question would help to scope this down? 🙂