Backing up Fusion/Solr and Cloud

There are a few things to consider here: ZooKeeper configurations, ZooKeeper itself, your indexes, and, for Fusion, your data directory.

1) Backing up ZooKeeper configurations: This is useful if ZooKeeper ever ends up in a corrupted state. It is done with an export script: you export the configuration to a file, back that file up, and reimport it if necessary. Because the Fusion and Solr ZooKeeper configurations include cluster information, be very careful about reusing an export in a different ZooKeeper setup for another cluster. In general, don't do that unless you know exactly which nodes you are backing up and restoring. You could, for example, take the /configs node to a separate cluster fairly safely, as long as your Solr configs don't reference other machines.
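As an illustration of what exporting a subtree such as /configs involves, here is a minimal Python sketch. It is not the export script referred to above. It assumes a client object that offers get(path) and get_children(path) in the style of the kazoo library's KazooClient; the function names, the JSON layout, and the base64 encoding are choices made for this example only.

```python
import base64
import json


def export_znodes(client, root="/configs"):
    """Walk the subtree under `root` and return {znode path: base64 data}.

    `client` is assumed to behave like kazoo's KazooClient:
    get(path) returns (data, stat); get_children(path) returns child names.
    """
    exported = {}
    stack = [root]
    while stack:
        path = stack.pop()
        data, _stat = client.get(path)
        # znode data is raw bytes (or None), so encode it for JSON storage
        exported[path] = base64.b64encode(data or b"").decode("ascii")
        for child in client.get_children(path):
            stack.append(path.rstrip("/") + "/" + child)
    return exported


def dump_znodes(client, root, out_path):
    """Write the exported subtree to a JSON file that can be backed up."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(export_znodes(client, root), fh, indent=2, sort_keys=True)
```

Exporting only /configs, rather than the whole tree, keeps cluster-specific state out of the file, which is what makes it reasonably portable.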

2) Backing up ZooKeeper itself: You can copy the files from the version-2 data directory. Restoring those files returns ZooKeeper to its state at the time of the copy. If you restore from such a copy, stop Solr before the restore and start it again afterward, because ephemeral nodes are attached to the sessions that Solr holds. In my testing, these nodes are removed once their sessions fail to reappear, but Solr still needs a restart to create new ones and avoid ending up in a confused state. The safest sequence is to stop Solr, stop ZooKeeper, restore the data, start ZooKeeper, and then start Solr.
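The copy step can be sketched in Python as follows. This is only an illustration: the function name, the timestamped naming scheme, and both paths are assumptions, and the location of the version-2 directory depends on your installation.

```python
import shutil
import time
from pathlib import Path


def snapshot_zk_data(version2_dir, backup_root):
    """Copy ZooKeeper's version-2 directory into a timestamped backup folder."""
    src = Path(version2_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"not a directory: {src}")
    # Hypothetical naming scheme: one folder per snapshot, named by local time
    dest = Path(backup_root) / time.strftime("zk-version-2-%Y%m%d-%H%M%S")
    shutil.copytree(str(src), str(dest))
    return dest
```

To restore, you would copy a snapshot back into place while both Solr and ZooKeeper are stopped.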

3) Backing up indexes: You can do this with the replication handler's backup command. It saves a backup of the index in its current state, which you can then copy elsewhere for later retrieval.
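A minimal Python sketch of such a request follows. It assumes the standard Solr replication handler endpoint (/replication with command=backup and the optional location and name parameters). The host, port, core name, and backup path used with it are placeholders, and the helper names are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def backup_url(solr_base, core, location=None, name=None):
    """Build a replication-handler backup URL for one core or collection."""
    params = {"command": "backup", "wt": "json"}
    if location:
        params["location"] = location  # directory on the Solr host
    if name:
        params["name"] = name  # label for the resulting snapshot
    return f"{solr_base.rstrip('/')}/{core}/replication?{urlencode(params)}"


def trigger_backup(solr_base, core, **kwargs):
    """Send the backup request and return Solr's raw response body."""
    with urlopen(backup_url(solr_base, core, **kwargs), timeout=30) as resp:
        return resp.read().decode("utf-8")
```

For example, trigger_backup("http://localhost:8983/solr", "mycore", location="/backups") would ask a local Solr to snapshot the hypothetical core mycore into /backups. The backup runs asynchronously, so the response only confirms that the request was accepted.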

4) Fusion's data directory: $FUSION_HOME/data holds all sorts of Fusion-related data. It is likely a superset of much of the other data listed here and can contain, for example, your Solr indexes and your ZooKeeper data. However, for each collection you want to back up, it is probably better to use the replication handler than to simply copy the data directory somewhere else.
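If you do want a copy of the whole data directory, a compressed archive is one simple option. The Python sketch below is an assumption-laden example rather than a Fusion tool: the function names and archive layout are invented here, and archiving index files while Solr is writing to them may capture an inconsistent state, which is one reason to prefer the replication handler for indexes.

```python
import os
import tarfile
from pathlib import Path


def archive_data_dir(data_dir, archive_path):
    """Pack a data directory into a gzip-compressed tar archive."""
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"not a directory: {src}")
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Store entries relative to the directory's own name (e.g. data/...)
        tar.add(str(src), arcname=src.name)
    return Path(archive_path)


def default_data_dir():
    """Resolve $FUSION_HOME/data from the environment (raises if unset)."""
    return Path(os.environ["FUSION_HOME"]) / "data"
```

Write the archive somewhere outside the data directory so that it does not end up inside itself.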

5) CDCR: For Solr 6.x, you might also want to consider CDCR (cross data center replication), which is documented in the Solr Reference Guide.
