I’ve been reading a lot lately about how joins in DB queries slow things down. Evidently Google App Engine doesn’t even allow them.
I’m wondering how people design an app with no joins, though. For example, I’m working on an app that has contacts and organizations. A contact can be in many organizations and an organization can have many contacts. How would it be possible to have that relationship without a third table that connects the two entities…
contacts --< contacts_organizations >-- organizations
Does it mean that in GAE you can’t have a many-to-many relationship? You just leave out functionality that would require a join?
I guess you could have a TEXT organizations column in the contacts table containing a space-separated list of the organization IDs for each contact. That seems a little weird though.
When people talk about databases that don’t allow joins, they usually mean very large databases that don’t necessarily fit on one server. Recent examples are cloud databases such as Amazon’s SimpleDB, Microsoft’s SQL Data Services, and Google’s App Engine Datastore. Some offer limited join capability, but the big difficulty is doing joins across ‘partitions’. In large databases like these, you partition your data so that it doesn’t all have to reside on the same server, and you have to decide the right way to partition it.
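To make the partitioning idea concrete, here is a minimal sketch in plain Python. It is not the API of any of the services named above. The partition count, the `partition_for`, `put` and `get` helpers, and the choice of hashing the key are all assumptions made purely for illustration.

```python
import hashlib

NUM_PARTITIONS = 4  # hypothetical number of servers

# Each dict stands in for the data held by one server.
partitions = [dict() for _ in range(NUM_PARTITIONS)]

def partition_for(key: str) -> int:
    """Pick a partition from a stable hash of the record key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

def put(key: str, record: dict) -> None:
    """Store the record on whichever partition owns its key."""
    partitions[partition_for(key)][key] = record

def get(key: str):
    """A lookup by key only ever touches one partition."""
    return partitions[partition_for(key)].get(key)

put("contact:1", {"name": "Alice"})
put("organization:7", {"name": "Acme"})
```

A lookup by key goes straight to one partition, which is cheap. A join would have to match rows that may live on different partitions, and that is the part that gets expensive.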
In your example, I would store a list of organization keys in a field in the contacts table, and vice versa. The design of these databases is different from that of your typical normalized database. The tables are usually ‘sparse tables’, which means each record can have any number of fields, each essentially a name/value pair. Think of a products table on Amazon, and how many different fields there could be for different types of products. Books have a number of pages, but MP3s have a duration. In a sparse table, these records would be stored in the same table.
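Here is a rough sketch of that key-list approach, again in plain Python with dicts standing in for the two tables. It does not use the real App Engine Datastore API. The field names (`organization_keys`, `contact_keys`) and the helper functions are made up for the example.

```python
# Two "sparse tables": each record is just a dict of name/value pairs.
contacts = {}
organizations = {}

def add_contact(key, name, **extra_fields):
    # extra_fields lets different records carry different fields.
    contacts[key] = {"name": name, "organization_keys": [], **extra_fields}

def add_organization(key, name, **extra_fields):
    organizations[key] = {"name": name, "contact_keys": [], **extra_fields}

def link(contact_key, org_key):
    """Record the relationship on both sides, with no third table."""
    contact = contacts[contact_key]
    org = organizations[org_key]
    if org_key not in contact["organization_keys"]:
        contact["organization_keys"].append(org_key)
    if contact_key not in org["contact_keys"]:
        org["contact_keys"].append(contact_key)

def organizations_for(contact_key):
    """Fetch each organization by key instead of joining."""
    return [organizations[k] for k in contacts[contact_key]["organization_keys"]]

def contacts_for(org_key):
    return [contacts[k] for k in organizations[org_key]["contact_keys"]]

add_contact("c1", "Alice", phone="555-0100")
add_contact("c2", "Bob")
add_organization("o1", "Acme")
add_organization("o2", "Globex", website="example.com")
link("c1", "o1")
link("c1", "o2")
link("c2", "o1")
```

The trade-off is that every link is written twice, so the application has to keep the two lists in sync itself. In exchange, reading a contact's organizations is a handful of lookups by key rather than a join.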