I’m trying to figure out how to link/reference a document to another document but I’m not finding much information or examples in the docs or other sources. When linking documents, do I have to link by ObjectID or can I use any field? Do I need to pull the field value directly from the originating document or can I pass the same value from anywhere? For example, given a hex string of a UUID object, I want to link 2 documents via a field ‘GUID’ which contains the uuid1 object:
# Which is more efficient/correct, option 1 or 2?
import uuid
# Option 1
hexString = '5d78ad35ea5f11e1a183705681b29c47'
newLinkField = { 'linkToSong' : uuid.UUID( hexString ) }
db.artists.update( { 'name' : 'Bob Dylan' }, { '$set' : newLinkField }, upsert = False)
# Option 2
hexString = '5d78ad35ea5f11e1a183705681b29c47'
songGUID = db.songs.find_one({ 'GUID' : uuid.UUID( hexString ) }, {'GUID': 1 })
newLinkField = { 'linkToSong' : songGUID }
db.artists.update( { 'name' : 'Bob Dylan' }, { '$set' : newLinkField }, upsert = False)
Also, is this storing actual links or just duplicates of the UUID object?
I highly recommend this 10gen video for understanding linking and embedding, how to use them, and what the tradeoffs are:
http://www.10gen.com/presentations/mongosv-2011/schema-design-principles-and-practice
To answer your questions: “linking” a document A to a document B simply means putting some information in A that allows your application to query for B. Typically, A stores the value of a field that identifies B, most often B’s _id.
Suppose document A, my comment, is linked to B, my user profile, through a ‘user’ field holding the profile’s ObjectId. Your application can query for A however it wants to (by _id, by user, by text, …), then examine its ‘user’ field and find my user profile by querying the users collection for that ObjectId.
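The two-step lookup described above can be sketched roughly as follows. This assumes PyMongo-style collection objects; the helper name `find_comment_author` and the `comments`/`users` arguments are hypothetical, chosen only for illustration.

```python
def find_comment_author(comments, users, comment_filter):
    """Follow the 'user' link from a comment (A) to its user profile (B).

    `comments` and `users` are assumed to be PyMongo-style collections;
    anything with a compatible find_one method will do.
    """
    # Step 1: query for A however the application likes.
    comment = comments.find_one(comment_filter)
    if comment is None:
        return None
    # Step 2: the link is only a stored value, so resolving it
    # takes a second query against B's collection.
    return users.find_one({'_id': comment['user']})
```

With a real database this might be called as `find_comment_author(db.comments, db.users, {'text': 'Nice song'})`. A plain `find_one` does not follow the link for you: the ‘user’ field holds a copy of the profile’s _id, and the application issues the second query itself.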
The field you use to link A to B should probably be unique in B’s collection, and it should definitely be indexed in B’s collection.
Both requirements are always satisfied by any collection’s _id field, but they could be satisfied by some other field as well.
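If you link on a field other than _id, such as the ‘GUID’ field from the question, you have to create that index yourself. A minimal sketch, assuming `songs` is a PyMongo collection (the helper name `ensure_link_index` is hypothetical):

```python
def ensure_link_index(songs, field='GUID'):
    """Create a unique index on the field other documents will link to.

    `songs` is assumed to be a PyMongo collection. create_index returns
    the name of the index.
    """
    return songs.create_index(field, unique=True)
```

Note that building a unique index fails if the collection already contains duplicate values for that field.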
Your example 1 is fine, although you probably have it backwards: songs should have artist ids in them, rather than vice-versa, unless Bob Dylan has only ever written one song.
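A sketch of the reversed link, where each song stores its artist’s _id. It assumes PyMongo-style collections and uses `update_one`, the current PyMongo method, instead of the older `update`; the ‘artist_id’ field and both helper names are hypothetical.

```python
import uuid

def link_song_to_artist(artists, songs, song_guid_hex, artist_name):
    """Store the artist's _id on the song, so one artist can have many songs."""
    artist = artists.find_one({'name': artist_name}, {'_id': 1})
    if artist is None:
        raise LookupError('no artist named %r' % artist_name)
    # 'artist_id' is a hypothetical field name for the link.
    return songs.update_one(
        {'GUID': uuid.UUID(song_guid_hex)},
        {'$set': {'artist_id': artist['_id']}},
        upsert=False,
    )

def songs_by_artist(songs, artist_id):
    """All songs linked to one artist: a single query on the link field."""
    return list(songs.find({'artist_id': artist_id}))
```

If you query songs by ‘artist_id’ often, that field is worth indexing too. As far as I know, recent PyMongo versions also need a `uuidRepresentation` configured on the client before they will store `uuid.UUID` values.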
In example 2, finding the song first is unnecessary and costly: you already have the UUID value, and you already know how you’re going to query for the artist document.
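To illustrate why the extra lookup buys nothing, here is a small sketch that builds the update for option 1 straight from the hex string. The helper name `build_link_update` is hypothetical; only the standard-library `uuid` module is used.

```python
import uuid

def build_link_update(hex_string):
    """Build the $set document for option 1 directly from the hex string.

    No query against the songs collection is needed: uuid.UUID(hex_string)
    is equal to the value the question stores in the song's GUID field.
    """
    return {'$set': {'linkToSong': uuid.UUID(hex_string)}}

update = build_link_update('5d78ad35ea5f11e1a183705681b29c47')
print(update['$set']['linkToSong'])  # 5d78ad35-ea5f-11e1-a183-705681b29c47
```

With PyMongo this could be passed as `db.artists.update_one({'name': 'Bob Dylan'}, build_link_update(hexString))`. Note also that `find_one` returns a whole document (a dict containing _id and GUID), so option 2 as written would store that dict in ‘linkToSong’ rather than the bare UUID.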