Bill Karwin has a blog post called “Why Should You Use An ORM?” that is being discussed on Reddit, and I was confused about a couple of points.
In it he says in the comments section:
OODBMS and ORM works only on objects that we’ve instantiated in the application layer. I.e. there’s no way to do a query like this:
UPDATE Bugs SET status = 'CLOSED'
WHERE status = 'OPEN';
To do this in an ORM or an OODBMS, you’d have to fetch all bugs that match the criteria and instantiate objects for them. Then you could set the attribute and save the objects back to the database one by one. This is expensive and certainly requires more code than the equivalent SQL operation shown above.
This illustrates an advantage of a language like SQL that treats sets as a first-class data type. The OO paradigm cannot substitute for the relational paradigm in all cases. There are some ordinary operations that SQL can do much better.
I bolded the part where he says you have to instantiate objects for these bugs when you use an ORM because that’s the part I’m confused about.
My question is why do you have to? Okay, object-oriented is one thing and relational is another. But is it really true that they are so different that there is no way to represent an object so that it can be understood by the relational database? For example, I’m thinking about how you can serialize an object and then it gets written into a file-storable format. Couldn’t you use a format like that to transfer the object to a relational database?
You’ve missed the point of my statements. I didn’t mean that one couldn’t store an object in a relational database. I meant that the OO paradigm assumes you have an instance of that object in application space. That is, you can call methods and access properties of an object:
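As a minimal sketch of that idea, assuming a hypothetical Bug class that is not taken from any particular ORM: the method call and the attribute access below only make sense once an instance exists in the application.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    """Hypothetical in-memory object; not tied to any real ORM."""
    bug_id: int
    status: str = "OPEN"

    def close(self) -> None:
        # A method acts on this one instance only.
        self.status = "CLOSED"


bug = Bug(bug_id=1234)  # the instance has to exist in application space first
bug.close()             # call a method on it
print(bug.status)       # access a property of it
```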
But in any ORM I’ve seen*, you can’t operate on an object instance without first fetching it from the database. Nor can you operate on whole sets of rows at a time, as you can with SQL.
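To make the contrast concrete, here is a rough sketch using Python's built-in sqlite3 module. The Bug class and both helper functions are hypothetical stand-ins for what an ORM does behind the scenes, not the API of any real package. The first function fetches every matching row, builds an object for each and saves them one by one; the second issues the single UPDATE from the quoted comment.

```python
import sqlite3


def make_db(n_open=3):
    """Create an in-memory Bugs table with n_open OPEN rows plus one FIXED row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Bugs (bug_id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO Bugs (status) VALUES (?)",
        [("OPEN",)] * n_open + [("FIXED",)],
    )
    return conn


class Bug:
    """Hypothetical one-object-per-row mapping."""

    def __init__(self, conn, bug_id, status):
        self.conn = conn
        self.bug_id = bug_id
        self.status = status

    def save(self):
        # One UPDATE statement per object.
        self.conn.execute(
            "UPDATE Bugs SET status = ? WHERE bug_id = ?",
            (self.status, self.bug_id),
        )


def close_open_bugs_row_by_row(conn):
    """Fetch, instantiate, modify and save each bug individually."""
    rows = conn.execute(
        "SELECT bug_id, status FROM Bugs WHERE status = 'OPEN'"
    ).fetchall()
    bugs = [Bug(conn, bug_id, status) for bug_id, status in rows]
    for bug in bugs:
        bug.status = "CLOSED"
        bug.save()
    return len(bugs)


def close_open_bugs_set_based(conn):
    """Let the database apply the change to the whole set in one statement."""
    cur = conn.execute("UPDATE Bugs SET status = 'CLOSED' WHERE status = 'OPEN'")
    return cur.rowcount
```

Both functions leave the table in the same state, but the first one runs one SELECT plus one UPDATE per matching row, while the second runs a single statement regardless of how many rows match.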
It would be interesting to see an ORM package that had an object type mapping to a set of data. Then when you change an attribute, it applies to all rows in that set. I haven’t seen any ORM attempt to do this.
It would be very challenging, because of concurrency issues. Does the set include rows that were in that set when you instantiated the object, or when you execute the change, or when you save the changes? If it supports all these permutations as options, then it starts to get so complex to use that one might rightly think that it represents no actual improvement over using SQL directly.
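A small sketch of that ambiguity, again with sqlite3 and a purely hypothetical BugSet class (I am not aware of an ORM that offers this; the names are made up for illustration). With snapshot=True the set is fixed to the rows that matched when the object was created; with snapshot=False the criteria are re-evaluated when the change is executed. A row inserted in between is treated differently by the two.

```python
import sqlite3


def make_db():
    """In-memory Bugs table that starts with two OPEN rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Bugs (bug_id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany("INSERT INTO Bugs (status) VALUES (?)", [("OPEN",), ("OPEN",)])
    return conn


class BugSet:
    """Hypothetical object that maps to a set of rows instead of one row."""

    def __init__(self, conn, status, snapshot):
        self.conn = conn
        self.status = status
        self.ids = None
        if snapshot:
            # Membership is decided now, at instantiation time.
            rows = conn.execute(
                "SELECT bug_id FROM Bugs WHERE status = ?", (status,)
            ).fetchall()
            self.ids = [row[0] for row in rows]

    def set_status(self, new_status):
        """Apply the change to the set and return the number of rows changed."""
        if self.ids is None:
            # Membership is decided now, when the change is executed.
            cur = self.conn.execute(
                "UPDATE Bugs SET status = ? WHERE status = ?",
                (new_status, self.status),
            )
            return cur.rowcount
        if not self.ids:
            return 0
        marks = ",".join("?" * len(self.ids))
        cur = self.conn.execute(
            f"UPDATE Bugs SET status = ? WHERE bug_id IN ({marks})",
            (new_status, *self.ids),
        )
        return cur.rowcount


def demo(snapshot):
    conn = make_db()
    bugs = BugSet(conn, "OPEN", snapshot)
    # Stands in for another client adding a matching row in the meantime.
    conn.execute("INSERT INTO Bugs (status) VALUES ('OPEN')")
    return bugs.set_status("CLOSED")
```

Here demo(True) changes only the two rows that existed when the set was created and leaves the new row OPEN, while demo(False) changes all three. Neither answer is obviously the right one, which is the difficulty.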
Re your comment: It’s not that sets and objects are incompatible. A set can be an object (Java even has classes for Collection and Set). But the OO paradigm assumes operations apply to one object instance, whereas relational operators always apply to sets (a set of one row is still a set). And in reality, ORM packages that exist today make the same assumption, that one can change only one instance of a row at a time, and you must have fetched that row before you can change it.
It’s possible in theory to expand the capabilities of an ORM to work on sets, but AFAIK no one has tried to do this. My claim is that an ORM that could do all of what relational operators can do would be much worse to use than SQL.
* I am excluding SQL-like pseudolanguages such as HQL, which happen to be part of an ORM package (Hibernate in the case of HQL); the pseudolanguage itself doesn’t qualify as an ORM.