We’re using Solr 3.6 replication with 2 servers, a master and a slave, and we’re currently looking for a way to do clean backups.
As the wiki says, we can use an HTTP command to create a snapshot of the master like this: http://myMasterHost/solr/replication?command=backup
But we still have some questions:
- What is the benefit of the `backup` command over a classic shell script copying the index files?
- The command only backs up the indexes; is it possible to also copy the `spellchecker` folder? Is it needed?
- Can we create the snapshot while the application is running, i.e. while there are potential index updates?
- When we have to restore the servers from the backup, what do we have to do on the slave?
  - just copy the snapshot into its index folder, and remove the `replication.properties` file (or not)?
  - ask for a fetchindex through the HTTP command http://mySlave/solr/replication?command=fetchindex?
  - just empty the slave index folder, in order to force a full replication from the master?

You can use the `backup` command provided by the ReplicationHandler. It’s an asynchronous operation, and it takes time if your index is big. This way you don’t need to shut down Solr. Afterwards you’ll find within the index directory a new directory named `backup.yyyymmddHHMMSS`, where the suffix is the backup date. You can also configure how many old backups you want to keep.

After that, of course, it’s better to move the backup to a safe location, probably on a different server.
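As a rough illustration of the steps above (trigger the backup, locate the newest timestamped directory, copy it elsewhere), here is a minimal Python sketch. The function names, the `prefix` parameter and the paths are assumptions made for the example, not part of Solr. Since the backup is asynchronous, the sketch does not detect when the copy on the master has finished; you would have to wait for that yourself before archiving.

```python
import shutil
import urllib.request
from pathlib import Path
from typing import Optional


def replication_url(host: str, command: str) -> str:
    """Build a ReplicationHandler URL such as .../solr/replication?command=backup."""
    return f"http://{host}/solr/replication?command={command}"


def trigger_backup(host: str, timeout: float = 10.0) -> int:
    """Ask the master to start a backup; the HTTP call returns before the copy is done."""
    with urllib.request.urlopen(replication_url(host, "backup"), timeout=timeout) as resp:
        return resp.status


def latest_backup(parent_dir: str, prefix: str = "backup.") -> Optional[Path]:
    """Return the newest timestamped backup directory, or None if there is none."""
    candidates = [p for p in Path(parent_dir).iterdir()
                  if p.is_dir() and p.name.startswith(prefix)]
    # A yyyymmddHHMMSS suffix sorts chronologically when compared as plain text.
    return max(candidates, key=lambda p: p.name, default=None)


def archive_backup(backup_dir: Path, safe_root: str) -> Path:
    """Copy a finished backup directory to a safe location (hypothetical path)."""
    target = Path(safe_root) / backup_dir.name
    shutil.copytree(backup_dir, target)
    return target
```

A typical sequence would be `trigger_backup("myMasterHost")`, a wait until the backup is complete, then `archive_backup(latest_backup(...), ...)` with your own directories. Adjust `prefix` if the directories on your installation are named differently.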
I don’t think it’s possible to back up the spellchecker this way, but I’m not completely sure.
Yes, the command is meant to be run while the application is running. The only caveat is that documents committed after the backup started will probably be missing from it.
You can also have a look at the Lucene CheckIndex tool. Once you have backed up the index, you can use it to check that the index is OK.
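To make the CheckIndex step concrete, here is a small Python sketch that runs the tool against a backup directory. The jar file name and the backup path are placeholders you would replace with your own, and treating a zero exit code as "index is clean" is my assumption about how the tool reports its result.

```python
import subprocess
from typing import List


def check_index_command(lucene_core_jar: str, index_dir: str) -> List[str]:
    """Build the java command line that runs Lucene's CheckIndex on index_dir."""
    return [
        "java",
        "-cp", lucene_core_jar,                # jar containing CheckIndex (placeholder)
        "org.apache.lucene.index.CheckIndex",  # main class of the tool
        index_dir,                             # directory holding the backed-up segments
    ]


def index_is_ok(lucene_core_jar: str, index_dir: str) -> bool:
    """Run CheckIndex; assume exit code 0 means no problems were found."""
    result = subprocess.run(check_index_command(lucene_core_jar, index_dir))
    return result.returncode == 0
```

For example, `index_is_ok("lucene-core-3.6.0.jar", "/backups/backup.yyyymmddHHMMSS")` would print the CheckIndex report to the console and return whether the run ended cleanly.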
Personally, I wouldn’t use the backups to restore the index on the slaves if you already have a good index on the master. The standard replication process copies the index automatically (it’s really a copy of the index segments), so you don’t need to copy them manually unless the backup contains better data than the master.
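If you don’t want to wait for the slave’s next poll, the `fetchindex` command from the question can be used to ask the slave to pull from the master right away. A minimal Python sketch, with hypothetical function names and the host name taken from the question:

```python
import urllib.request
from urllib.parse import urlencode


def fetchindex_url(slave_host: str) -> str:
    """Build the URL that asks a slave to fetch the index from its master."""
    query = urlencode({"command": "fetchindex"})
    return f"http://{slave_host}/solr/replication?{query}"


def force_replication(slave_host: str, timeout: float = 10.0) -> int:
    """Send the fetchindex request and return the HTTP status code."""
    with urllib.request.urlopen(fetchindex_url(slave_host), timeout=timeout) as resp:
        return resp.status
```

Calling `force_replication("mySlave")` only starts the fetch; the actual segment copy still happens through the normal replication mechanism.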