I’m using DDD for a service-oriented application intended to transmit a high volume of messages between a high volume of web clients (i.e., browsers).
Because in the context of required functionality, the need for transmission outweighs the need for storage, I love the idea of relying on RAM primarily and minimizing use of the database.
However I’m unclear on how to architect this from a scalability point of view. A web farm creates high availability of service endpoints and domain logic processing. But no matter how many servers I have, it seems they must all share a common repository so that their data is consistent.
How do I build this repository so that it’s as scalable as possible? How can it be splashed across an array of physical machines in a manner such that all machines are consistent and each couldn’t care less if another goes down?
Also since touching the database will be required occasionally (e.g., when a client goes missing and messages intended for it must be stored until it returns), how should I organize my memory-based code and data access layer? Are they both considered “the repository”?
There are several ways to solve this issue. No single answer can really cover it all…
One method to ensure your scalability is to simply scale the hardware. Write your web services to be stateless so that you can run a web farm (all running the same identical services, pointing to the same DB) and turn your DB into a cluster. Clustered databases run over multiple servers and work on the same storage. However, this scenario can get complicated and expensive quite quickly.
Some interesting links:
Another method is to look at architecture. CQRS is a common architectural model that ensures scalability. For instance, this architecture model — its name stands for Command/Query Responsibility Segregation — builds different databases for reading and writing. This seems contradictory, but if you study it, it becomes natural and you wonder why you’ve never thought of it before. Simply put, most apps do a lot more reading than writing, and writing tends to be a lot more complicated than reading (requiring business rule validation etc.), so why not separate the two? You can use your expensive transactional database for writing and then your cheap, maybe Non-SQL based or open source, database over multiple reading servers. Your read model is then optimized for the screens of your application(s), whereas the write model is optimized solely for writing and is in fact a DDD-based set of repositories.
There’s just not enough room here to cover this option in detail, but CQRS is a good way of achieving scalability and even ease of development, once you have a CQRS framework in place. There are many other advantages to CQRS, such as ease of auditing (if you combine it with the “event sourcing” technique, which is common in CQRS-based environments).
Some interesting links: