I’d like to set up an experiment to evaluate how MongoDB performs on various technologies capable of taking snapshots:
- R1Soft HotCopy on ext3
- R1Soft HotCopy on xfs
- LVM with ext3
- LVM with xfs
- btrfs
The test needs to be disk I/O bound, so I need to ensure all my writes are synchronous. Otherwise I would need to create a dataset that exceeds RAM and swap. I believe that forcing a journal commit to disk on every insert will ensure that each operation reaches disk before the next one is issued:
> db.runCommand({getlasterror:1,j:true})
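For illustration, here is a minimal sketch of how such an insert loop might look in the mongo shell, assuming the legacy shell helpers `db.getCollection()` and `insert()`. The function name `journaledInsert`, the collection name, and the document shape are made up for this example. Each insert is followed by the `getlasterror` call above, so the next write is not issued until the journal commit has been acknowledged.

```javascript
// Insert `count` documents one at a time, waiting for a journal commit after each.
// `db` is the mongo shell database handle; `collName` is a hypothetical collection name.
function journaledInsert(db, collName, count) {
  var coll = db.getCollection(collName);
  var latenciesMs = [];
  for (var i = 0; i < count; i++) {
    var start = Date.now();
    coll.insert({ seq: i, payload: "x" });
    // Block until this write has been committed to the journal.
    var res = db.runCommand({ getlasterror: 1, j: true });
    if (res.err) {
      throw new Error("insert " + i + " failed: " + res.err);
    }
    // Record how long the insert plus journal acknowledgement took.
    latenciesMs.push(Date.now() - start);
  }
  return latenciesMs;
}
```

In the shell this could be called as `journaledInsert(db, "bench", 10000)`; the returned latencies can then be compared across the snapshot periods.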
What else could I do to really stress the disk I/O of a MongoDB process?
- I could interleave reads and writes.
I will test something like a constant insertion rate and observe how the process behaves during the following periods:
- No snapshot-related activity or presence.
- When the snapshot is being taken and committed.
- When the snapshot is being read by a backup script.
- When the snapshot is redundant but active.
- When the snapshot is being decommissioned.
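One way to compare the periods listed above is to record a timestamp and latency for every insert, then summarise the samples per period. The sketch below is only an illustration: the function name `summariseByPhase` and the shape of the `samples` and `phases` arrays are assumptions, not part of any MongoDB tooling.

```javascript
// Summarise insert latencies per snapshot phase.
// `samples`: [{ t: <ms since test start>, latencyMs: <number> }]
// `phases`:  [{ name: <string>, from: <ms>, to: <ms> }]  (from inclusive, to exclusive)
function summariseByPhase(samples, phases) {
  return phases.map(function (phase) {
    // Collect the latencies that fall inside this phase, sorted ascending.
    var lat = samples
      .filter(function (s) { return s.t >= phase.from && s.t < phase.to; })
      .map(function (s) { return s.latencyMs; })
      .sort(function (a, b) { return a - b; });
    var total = lat.reduce(function (sum, v) { return sum + v; }, 0);
    var seconds = (phase.to - phase.from) / 1000;
    return {
      name: phase.name,
      count: lat.length,
      insertsPerSec: lat.length / seconds,
      meanMs: lat.length ? total / lat.length : null,
      maxMs: lat.length ? lat[lat.length - 1] : null
    };
  });
}
```

Dividing each phase's `insertsPerSec` by the figure for the no-snapshot phase would give a relative number that is easier to compare across filesystems than raw throughput.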
I’m looking to ensure that, while the workload and hardware are kept constant, I get a relative benchmark of performance across these periods.
Thanks for any tips.
Rob, this is great. The results of this should benefit everyone.
I wanted to note some things about typical snapshot operations in production deployments of MongoDB that might be helpful during your testing.
Taking Snapshots
The main problem with taking a snapshot on a live server, as you have pointed out, is IO contention. To avoid this, most people deploy a replica set with three or more members.
Often, in this scenario, one of the secondaries is either fsynced and locked during the snapshot, configured as a hidden member, or simply taken offline. This allows a snapshot to be taken while a hot backup (the other secondary) is still available for automatic failover.
This also ensures two other things. First, that a snapshot can be done in a timely fashion (production load not affecting backup time) and second, that the load required to take the snapshot does not affect production reads (in the case where reading from secondaries is allowed — slaveOk).
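As a rough sketch of the fsync-and-lock option, the mongo shell helpers `db.fsyncLock()` and `db.fsyncUnlock()` can bracket the snapshot. The wrapper name `snapshotWithFsyncLock` and the `takeSnapshot` callback are hypothetical; in practice the snapshot itself (LVM, btrfs, HotCopy) is usually triggered from outside the shell while the lock is held.

```javascript
// Run `takeSnapshot` while the given secondary is flushed and locked against writes.
// `db` is a mongo shell database handle connected to that secondary;
// `takeSnapshot` is a hypothetical callback standing in for the snapshot step.
function snapshotWithFsyncLock(db, takeSnapshot) {
  db.fsyncLock();        // flush pending writes to disk and block further writes
  try {
    return takeSnapshot();
  } finally {
    db.fsyncUnlock();    // always release the lock, even if the snapshot fails
  }
}
```

The `finally` block matters here: a secondary left locked will stop applying oplog entries and fall behind the primary.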
Backup Mechanics
The above point about snapshot strategy is important because most people overlook the fact that secondaries have the same write load as a primary.
MongoDB does not have multi-master replication. Only one server (in the set/shard) is active for writes (the primary, or master) at a given time.
However, while the primary is receiving writes and reads, the secondaries tail the oplog (a capped collection) of that primary. The secondaries issue periodic requests to see if there is more data waiting to be read from this tailable cursor. When there is, those oplog entries are read from the primary and written to the secondary's own oplog (yes, this takes a write lock).
The secondaries then apply the new entries in the oplog to their own copy of the data (yes, this takes a write lock too).
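To make the write amplification concrete, here is a toy model of the two steps above. It does not talk to MongoDB at all: plain arrays and objects stand in for the oplog and the data, and the function name `pullAndApply` and the entry fields `ts`, `id`, and `doc` are invented for the illustration.

```javascript
// Toy model of a secondary catching up from a primary's oplog.
// `primaryOplog`: [{ ts: <number>, id: <string>, doc: <object> }], ordered by ts
// `secondary`:    { lastTs: <number>, oplog: [], data: {} }
function pullAndApply(primaryOplog, secondary) {
  // Fetch only the entries newer than the last one this secondary has seen.
  var fresh = primaryOplog.filter(function (e) { return e.ts > secondary.lastTs; });
  var writes = 0;
  fresh.forEach(function (entry) {
    secondary.oplog.push(entry);           // write 1: copy the entry into the local oplog
    writes++;
    secondary.data[entry.id] = entry.doc;  // write 2: apply the entry to the local data
    writes++;
    secondary.lastTs = entry.ts;
  });
  return writes;
}
```

Even in this simplified model, every write on the primary turns into write work on each secondary, which is why a secondary being snapshotted is not an idle machine.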
You probably know all this, but just keep it in mind when doing this awesome research.
Good luck!