The ever so popular discussion on designing proper DAOs always concludes with something along the lines of “DAOs should only perform simple CRUD operations”.
So what’s the best place to perform things like aggregations and such? And should DAOs return complex object graphs resembling your data source’s schema?
Assume I have the following DAO interface:
public interface UserDao {
    User getByName(String name);
}
And here are the objects it returns:
import java.util.Date;

public class Transaction {
    public int amount;
    public Date transactionDate;
}

public class User {
    public String name;
    public Transaction[] transactions;
}
First of all, I consider the DAO to be returning a standard Value Object if all it does is CRUD operations.
So now I have modeled my DAO to return something based on a data store relationship. Is this correct? What if I have a more complex object graph?
Update: I guess what I am asking in this part is, should the return value of a DAO, be it VO, DTO, or whatever you want to call it, be modeled after the data store’s representation of the data? Or should I, say, introduce a new DAO to get a user’s transactions and, for each user pulled by the UserDAO, call the TransactionDAO to get them?
Secondly, let’s say I want to perform an aggregation over all of a user’s transactions. Using this DAO, I can simply get a user, and in my Service loop through the transactions array and perform the aggregation myself. After all, it’s perfectly reasonable to say that such an aggregation is a business rule that belongs in the Service.
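A minimal sketch of that in-Service aggregation, reusing the User and Transaction classes from above. TransactionTotalService and totalAmount are hypothetical names chosen for illustration only:

```java
import java.util.Date;

class Transaction {
    public int amount;
    public Date transactionDate;
}

class User {
    public String name;
    public Transaction[] transactions;
}

// Hypothetical service; the class and method names are illustrative only.
class TransactionTotalService {
    // Adds up every transaction amount in memory.
    // The total is a long so that many int amounts cannot overflow it.
    long totalAmount(User user) {
        long total = 0;
        for (Transaction t : user.transactions) {
            total += t.amount;
        }
        return total;
    }
}
```

Every transaction has to be loaded into the array before the loop can start, which is where the cost comes from once the array gets large.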
But what if a user’s transactions number in the tens of thousands? Loading all of them just to add them up would hurt application performance. Would it be incorrect to introduce a new method on the DAO that does said aggregation?
Of course, this assumes that the DAO is backed by a database where I can write a simple SELECT SUM() query. And if the DAO implementation changes to, say, a flat file or something, I would need to do the aggregation in memory anyway.
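One way to picture the trade-off: if the aggregate is a method on the interface, each implementation can compute it however its store allows. The sketch below is only an illustration. TransactionTotalDao, sumAmounts, and the in-memory class are all hypothetical names, and the in-memory class stands in for the flat-file case:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: callers ask for the total and never see how it is computed.
interface TransactionTotalDao {
    long sumAmounts(String userName);
}

// Stand-in for a flat-file style implementation that holds its records in memory
// and has to add them up itself. A database-backed implementation could satisfy
// the same method with a SUM query instead.
class InMemoryTransactionTotalDao implements TransactionTotalDao {
    private static final class Row {
        final String userName;
        final int amount;

        Row(String userName, int amount) {
            this.userName = userName;
            this.amount = amount;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    void add(String userName, int amount) {
        rows.add(new Row(userName, amount));
    }

    @Override
    public long sumAmounts(String userName) {
        long total = 0;
        for (Row row : rows) {
            if (row.userName.equals(userName)) {
                total += row.amount;
            }
        }
        return total;
    }
}
```

The Service calls sumAmounts either way, so swapping the store changes where the work happens but not the calling code.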
So what’s the best practice here?
I use the DAO as the translation layer: read the DB objects, create the Java-side business objects, and vice versa. Sometimes a couple of calls might be needed to store or create a business object. For the provided example, I would make two calls: one for the user info, one for the list of the user’s transactions. The cost is an extra database call. I’m not afraid to make an extra call if I’m using connection pooling and I’m not repeating calculations. Separate calls are simpler to use (unpacking an array of composite types from a JDBC call is not simple and typically requires the proprietary connection object) and provide reusable components. Let’s say you wanted the user object for a login screen: you can use the same user DAO and not have to pull in the transaction stuff.
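A rough sketch of that two-call arrangement, with in-memory maps standing in for the database so it runs on its own. TransactionDao, AccountService, and the method names are hypothetical, and User is trimmed here so that it no longer carries the transactions array:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// All names below are illustrative; the maps stand in for real database calls.
class User {
    final String name;
    User(String name) { this.name = name; }
}

class Transaction {
    final int amount;
    Transaction(int amount) { this.amount = amount; }
}

interface UserDao {
    User getByName(String name);
}

interface TransactionDao {
    List<Transaction> findByUserName(String userName);
}

class InMemoryUserDao implements UserDao {
    private final Map<String, User> users = new HashMap<>();

    void save(User user) { users.put(user.name, user); }

    @Override
    public User getByName(String name) { return users.get(name); }
}

class InMemoryTransactionDao implements TransactionDao {
    private final Map<String, List<Transaction>> byUser = new HashMap<>();

    void save(String userName, Transaction tx) {
        byUser.computeIfAbsent(userName, k -> new ArrayList<>()).add(tx);
    }

    @Override
    public List<Transaction> findByUserName(String userName) {
        List<Transaction> found =
            byUser.getOrDefault(userName, Collections.<Transaction>emptyList());
        return Collections.unmodifiableList(found);
    }
}

// The service decides which calls a given screen needs.
class AccountService {
    private final UserDao userDao;
    private final TransactionDao transactionDao;

    AccountService(UserDao userDao, TransactionDao transactionDao) {
        this.userDao = userDao;
        this.transactionDao = transactionDao;
    }

    // Login screen: one call, no transaction data loaded.
    boolean canLogIn(String name) {
        return userDao.getByName(name) != null;
    }

    // Statement screen: a second call, made only when the transactions are wanted.
    List<Transaction> statementFor(String name) {
        User user = userDao.getByName(name);
        if (user == null) {
            return Collections.<Transaction>emptyList();
        }
        return transactionDao.findByUserName(user.name);
    }
}
```

The login path never touches TransactionDao, which is the reuse I mean.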
If you didn’t actually want the transaction details but were just interested in the aggregate, I would do the aggregate work on the database side and expose it via a view or a stored procedure. Relational databases are built for and excel at these kinds of set operations. You are unlikely to perform these operations better in application code. Also, there is no point sending all the data over the wire if the result will do. So sure, add another DAO for the aggregate if there are times you are only interested in that.
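As a hedged sketch of what such an aggregate DAO could look like with plain JDBC: this uses an inline SUM query rather than a view or stored procedure, and it assumes a table transactions(user_name, amount). The table, the columns, and the class name are assumptions for illustration, not something from the question:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical DAO that returns only the aggregate, never the individual rows.
class JdbcTransactionSummaryDao {
    // Assumed schema: transactions(user_name, amount).
    // COALESCE turns the NULL that SUM yields for zero rows into 0.
    static final String SUM_SQL =
        "SELECT COALESCE(SUM(amount), 0) FROM transactions WHERE user_name = ?";

    private final DataSource dataSource;

    JdbcTransactionSummaryDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    long totalAmountFor(String userName) throws SQLException {
        // try-with-resources closes the result set, statement, and connection in that order.
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(SUM_SQL)) {
            ps.setString(1, userName);
            try (ResultSet rs = ps.executeQuery()) {
                // An aggregate without GROUP BY normally yields exactly one row.
                return rs.next() ? rs.getLong(1) : 0L;
            }
        }
    }
}
```

Only a single number crosses the wire, however many transactions the user has.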
Is it safe to assume the DAO maps to a relational DB? If that is how you are starting, I would wager that the backing datastore will remain a relational DB. Sometimes there is a lot of fuss and worry about keeping it generic, and if you can, great. But it seems to me that just changing the type of relational DB in the back is further than most apps would go (let alone changing to a non-relational store like a flat file).