This question is intended to be software / platform agnostic. I am just looking for generic SQL code.
Consider the following (very simple for example’s sake) tables:
Table: Authors
id | name
1 | Tyson
2 | Gordon
3 | Tony
etc

Table: Books
id | author | title
1 | 1 | Tyson's First Book
2 | 2 | Gordon's Book
3 | 1 | Tyson's Second Book
4 | 3 | Tony's Book
etc

Table: Stores
id | name
1 | Books Overflow
2 | Books Exchange
etc

Table: Stores_Books
id | store | book
1 | 1 | 1
2 | 2 | 4
3 | 1 | 3
4 | 2 | 2
As you can see, there is a one-to-many relationship between Books and Authors, and a many-to-many relationship between Books and Stores.
Question one: What is the best query to eager load one author and their books (and where the books are sold) into an object-oriented program where each row is representative of an object instance?
Question two: What is the best query to eager load the entire object tree into an object-oriented program where each row is representative of an object instance?
Both of these situations are easy to imagine with lazy loading. In either situation you would fetch the author with one query and then as soon as you need their books (and what stores the books are sold at) you would use another query to get that information.
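To make the lazy-loading pattern described above concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database purely for demonstration; the `make_db` helper and the `Author` class are hypothetical names, not part of any library, and the SQL itself is plain enough to port to other databases.

```python
import sqlite3


def make_db():
    """Build an in-memory copy of the example tables (demo assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Books (id INTEGER PRIMARY KEY, author INTEGER, title TEXT);
        INSERT INTO Authors VALUES (1, 'Tyson'), (2, 'Gordon'), (3, 'Tony');
        INSERT INTO Books VALUES
            (1, 1, 'Tyson''s First Book'), (2, 2, 'Gordon''s Book'),
            (3, 1, 'Tyson''s Second Book'), (4, 3, 'Tony''s Book');
    """)
    return conn


class Author:
    """Hypothetical domain object that loads its books on first access."""

    def __init__(self, conn, author_id):
        self._conn = conn
        # First query: fetch only the author row.
        self.id, self.name = conn.execute(
            "SELECT id, name FROM Authors WHERE id = ?", (author_id,)
        ).fetchone()
        self._books = None  # books are not loaded yet

    @property
    def books(self):
        # Second query: runs only the first time the books are needed.
        if self._books is None:
            self._books = self._conn.execute(
                "SELECT id, title FROM Books WHERE author = ? ORDER BY id",
                (self.id,),
            ).fetchall()
        return self._books


author = Author(make_db(), 1)
print(author.name)   # only the Authors query has run so far
print(author.books)  # the Books query runs here
```

The cost of this approach is one extra query per author whose books are touched, which is the overhead the eager-loading alternatives try to avoid.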
Is lazy loading the best way to do this or should I use a join and parse the result when creating the object tree (in an attempt to eager load the data)? In this situation what would be the optimal join / target output from the database in order to make parsing as simple as possible?
As far as I can tell, with eager loading, I would need to manage a dictionary or index of some sort of all the objects while I am parsing the data. Is this actually the case or is there a better way?
That’s a tough question to answer. I’ve done this before by writing a query that returns everything as a flat table and then looping through the results, creating a new object or structure each time the most-significant columns change (this relies on the result being ordered by those columns). I think that works better than multiple database calls because each call carries a lot of overhead. It might not be the best choice when each big entity has many smaller entities, though, because the big entity’s columns are then repeated on every row.
The following might apply to both your questions 1 and 2.
(pseudocode, in your program that makes the db call)
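A minimal Python sketch of that kind of loop, under some assumptions: an in-memory SQLite database stands in for the real one, objects are plain dictionaries, and `make_db` and `load_authors` are hypothetical names chosen for illustration. The key points are the `ORDER BY` on the author column and the check for a changed author id.

```python
import sqlite3


def make_db():
    """Build an in-memory copy of the example tables (demo assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Books (id INTEGER PRIMARY KEY, author INTEGER, title TEXT);
        INSERT INTO Authors VALUES (1, 'Tyson'), (2, 'Gordon'), (3, 'Tony');
        INSERT INTO Books VALUES
            (1, 1, 'Tyson''s First Book'), (2, 2, 'Gordon''s Book'),
            (3, 1, 'Tyson''s Second Book'), (4, 3, 'Tony''s Book');
    """)
    return conn


# One flat result: each row repeats the author columns next to one book.
# LEFT JOIN keeps authors that have no books (their book columns are NULL).
QUERY = """
    SELECT a.id, a.name, b.id, b.title
    FROM Authors a
    LEFT JOIN Books b ON b.author = a.id
    ORDER BY a.id, b.id
"""


def load_authors(conn):
    authors = []
    current = None
    for author_id, name, book_id, title in conn.execute(QUERY):
        # The most-significant column changed: start a new author object.
        if current is None or current["id"] != author_id:
            current = {"id": author_id, "name": name, "books": []}
            authors.append(current)
        if book_id is not None:
            current["books"].append({"id": book_id, "title": title})
    return authors


for author in load_authors(make_db()):
    print(author["name"], [book["title"] for book in author["books"]])
```

Because the rows arrive grouped by author, only the current author has to be remembered and no dictionary of all objects is needed. For a single author, adding `WHERE a.id = ?` to the same query is enough.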
[edit] I just realized I didn’t answer the second part of your question. Depending on the size of the stores data and what I intended to do with it, I would either
Before looping through books/authors as above, slurp the whole stores table into a structure in my program, much like the book/author structure above but indexed by the store id. Then, every time I read a book record, look up its store in that structure and keep a reference to the store object on the book
or, if there are many stores,
Join the stores onto the books and add a nested loop, inside the part of the code that adds a book, that creates the store objects for that book.
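Both options can be sketched in Python as follows. This is only an illustration under assumptions: SQLite in memory, dictionaries as objects, and hypothetical function names (`make_db`, `load_books_with_store_lookup`, `load_tree_with_join`).

```python
import sqlite3


def make_db():
    """Build an in-memory copy of the example tables (demo assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Books (id INTEGER PRIMARY KEY, author INTEGER, title TEXT);
        CREATE TABLE Stores (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Stores_Books (id INTEGER PRIMARY KEY, store INTEGER, book INTEGER);
        INSERT INTO Authors VALUES (1, 'Tyson'), (2, 'Gordon'), (3, 'Tony');
        INSERT INTO Books VALUES
            (1, 1, 'Tyson''s First Book'), (2, 2, 'Gordon''s Book'),
            (3, 1, 'Tyson''s Second Book'), (4, 3, 'Tony''s Book');
        INSERT INTO Stores VALUES (1, 'Books Overflow'), (2, 'Books Exchange');
        INSERT INTO Stores_Books VALUES (1, 1, 1), (2, 2, 4), (3, 1, 3), (4, 2, 2);
    """)
    return conn


def load_books_with_store_lookup(conn):
    """Option 1: read Stores once, then attach shared references by id."""
    stores = {sid: {"id": sid, "name": name}
              for sid, name in conn.execute("SELECT id, name FROM Stores")}
    books = {}
    rows = conn.execute("""
        SELECT b.id, b.title, sb.store
        FROM Books b
        LEFT JOIN Stores_Books sb ON sb.book = b.id
        ORDER BY b.id, sb.store
    """)
    for book_id, title, store_id in rows:
        book = books.setdefault(
            book_id, {"id": book_id, "title": title, "stores": []})
        if store_id is not None:
            book["stores"].append(stores[store_id])  # reference, not a copy
    return books


def load_tree_with_join(conn):
    """Option 2: one flat join, nested change checks for author and book."""
    rows = conn.execute("""
        SELECT a.id, a.name, b.id, b.title, s.id, s.name
        FROM Authors a
        LEFT JOIN Books b ON b.author = a.id
        LEFT JOIN Stores_Books sb ON sb.book = b.id
        LEFT JOIN Stores s ON s.id = sb.store
        ORDER BY a.id, b.id, s.id
    """)
    authors, author, book = [], None, None
    for a_id, a_name, b_id, b_title, s_id, s_name in rows:
        if author is None or author["id"] != a_id:
            author = {"id": a_id, "name": a_name, "books": []}
            authors.append(author)
            book = None  # a new author always starts a new book
        if b_id is not None and (book is None or book["id"] != b_id):
            book = {"id": b_id, "title": b_title, "stores": []}
            author["books"].append(book)
        if s_id is not None:
            # Each row builds its own store object here, so a store sold
            # under several books is duplicated unless it is also indexed.
            book["stores"].append({"id": s_id, "name": s_name})
    return authors


conn = make_db()
print(load_books_with_store_lookup(conn)[1]["stores"])
print(load_tree_with_join(conn)[0]["name"])
```

The first option gives every book a reference to the same store object, at the cost of reading the whole Stores table up front. The second stays within a single query but widens every row with the store columns.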
Here’s a relevant Wikipedia article: http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch
I hope that helps!