I am building a MongoDB/Node.js-based CMS and am using GridFS to store all uploaded documents. My question is this:
Do MongoDB replica sets increase the total amount of database storage, or
do they simply duplicate the database? For instance, if I have 5 servers
with 1TB of storage each and I replicate MongoDB across all of them, would
my GridFS system theoretically have 5TB of storage (minus caching and
padding), or 1TB of storage duplicated several times for better read
performance?
Thanks!
Informal description:
Replication = every node holds a full copy of the same data, i.e., 5 nodes with 1TB each provide 1TB overall.
Sharding / Partitioning = each node holds only a fraction of the data, i.e., 5 nodes with 1TB each provide 5TB overall.
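To make the arithmetic concrete, here is a rough sketch in Python. The function name `usable_capacity_tb` and its parameters are made up for illustration, and the calculation ignores indexes, padding, the oplog, and any other overhead, so treat the numbers as upper bounds rather than what MongoDB will actually give you.

```python
def usable_capacity_tb(nodes: int, per_node_tb: float, copies: int) -> float:
    """Rough usable capacity when every document is stored `copies` times.

    Hypothetical helper for illustration only; it ignores all storage overhead.
    """
    if copies < 1 or copies > nodes:
        raise ValueError("copies must be between 1 and the number of nodes")
    # Raw disk space divided by the number of copies kept of each document.
    return nodes * per_node_tb / copies


# Pure replication: all 5 nodes hold the same data.
replicated = usable_capacity_tb(nodes=5, per_node_tb=1.0, copies=5)

# Pure sharding: each node holds a distinct fifth of the data.
sharded = usable_capacity_tb(nodes=5, per_node_tb=1.0, copies=1)

# Mixed layout (assumed example): 6 nodes arranged as 2 shards,
# each shard kept on 3 nodes.
mixed = usable_capacity_tb(nodes=6, per_node_tb=1.0, copies=3)

print(replicated, sharded, mixed)
```

The mixed case is there because the two approaches are not mutually exclusive: you can split the data into shards and still keep several copies of each shard.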
Each approach has its own advantages and disadvantages. Replication can help with read throughput and provides redundancy if a node fails, but it slows down inserts (depending on the commit level, which MongoDB calls the write concern). Partitioning can help with insert throughput and lets lookups be distributed across nodes.
The details depend on how the particular storage system implements each approach.