I think I understand sharding to be putting your sliced-up data (the shards) back together into an easy-to-work-with aggregate that makes sense in the given context. Is this correct?
Update: I guess I am still struggling here. In my opinion the application tier should have no business determining where data is stored; at best it should act as a shard client of some sort. Both responses answered the "what" but not the "why is it important" aspect. What implications does sharding have beyond the obvious performance gains? Are these gains sufficient to offset the MVC violation? Is sharding mostly relevant to very large-scale applications, or does it apply to smaller ones as well?
Sharding is just another name for "horizontal partitioning" of a database. You might want to search for that term to get a clearer picture.
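To make the idea concrete: with horizontal partitioning every shard has the same schema, but each database holds only a subset of the rows, selected by some key. Here is a minimal sketch in plain JDBC, assuming a users table split by user-ID range; the connection URLs, credentials and class name are made up purely for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class RangeShardRouter {

    // Hypothetical connection URLs; in a real setup these would come from configuration.
    private static final String SHARD_A_URL = "jdbc:mysql://db-a.example.com/app";
    private static final String SHARD_B_URL = "jdbc:mysql://db-b.example.com/app";

    // Horizontal partitioning by key range: both databases contain the same
    // "users" table, but shard A holds ids 1..1,000,000 and shard B the rest.
    public Connection connectionForUser(long userId) throws SQLException {
        String url = (userId <= 1_000_000L) ? SHARD_A_URL : SHARD_B_URL;
        return DriverManager.getConnection(url, "app_user", "app_password");
    }
}
```

Contrast this with vertical partitioning, where you would split columns or whole tables across databases instead of rows.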
Update: You won't break MVC. Determining the correct shard for storing or loading data is done transparently by your data access layer. That layer picks the shard based on the criteria you used to partition your database in the first place (you have to shard the database manually along some concrete aspect of your application, for example the user or customer the data belongs to). You then have to take care, when loading and storing data, that the data access layer routes each call to the correct shard.
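To illustrate the MVC point, here is a rough sketch of a data access object that resolves the shard internally, so nothing above this layer ever sees the shards. The hash-modulo criterion, the DataSource list, the table layout and all names are assumptions made for the example; a real application would route by whatever criterion it was actually sharded on:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;
import javax.sql.DataSource;

public class ShardedUserDao {

    private final List<DataSource> shards; // one DataSource per physical database

    public ShardedUserDao(List<DataSource> shards) {
        this.shards = shards;
    }

    // The rest of the application only calls this; it never sees shards.
    public String findUserName(long userId) throws SQLException {
        DataSource shard = shardFor(userId);
        try (Connection con = shard.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT name FROM users WHERE id = ?")) {
            ps.setLong(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }

    // Shard resolution lives in the data access layer, not in the MVC layers above it.
    private DataSource shardFor(long userId) {
        int index = (int) Math.floorMod(userId, (long) shards.size());
        return shards.get(index);
    }
}
```

Controllers and models simply call findUserName(); how and where the row is physically stored stays an implementation detail of this layer.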
Maybe this example with Java code (it is about the Hibernate Shards project) makes it somewhat clearer how this would work in a real-world scenario.
To address the "why sharding": it is mainly relevant for very large-scale applications with lots of data. First, it helps minimize response times for database queries, since each query only has to touch the (much smaller) shard that holds the relevant data. Second, you can host your data on several cheaper, lower-end machines instead of one big server, which at some point may no longer suffice.