What are some ZooKeeper best practices?
ZooKeeper Best Practices (with Solr):
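Always run an odd number of ZooKeeper (ZK) servers. Three is typical, which tolerates one failure. A quorum needs a strict majority of servers, so going from three to four adds no extra fault tolerance.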
Provide very fast local storage. Local SSDs are preferred.
Run ZooKeeper on a dedicated machine.
Provide enough memory that the machine never swaps.
Put the ZK transaction logs on a dedicated device. Do not share that disk with other processes; sharing causes I/O contention and pauses.
Keep all quorum members on the same subnet. ZooKeeper requires consistently low network latency between peers. A caution to those who wish to cross subnets: low latency most of the time is not the same as consistently low latency.
Monitor each server with the four-letter-word commands (for example ruok, stat, or mntr) sent to the ZK client port, or through JMX.
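Clean up old snapshots and transaction logs regularly; with the default configuration ZooKeeper does not remove them itself. You can use the following command, where <count> is the number of most recent snapshots (and their corresponding logs) to keep: java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>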
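As a rough illustration of the monitoring item above, here is a minimal Python sketch that sends a four-letter command over a plain TCP socket and parses the tab-separated reply of mntr. The function names four_letter_word and parse_mntr are made up for this example, and port 2181 is only the commonly used client port, so adjust it to your setup. Depending on the ZooKeeper version, some commands may need to be enabled in the server configuration before they respond.

```python
import socket


def four_letter_word(host: str, command: str, port: int = 2181,
                     timeout: float = 5.0) -> str:
    """Send one four-letter command and return the raw text reply.

    Hypothetical helper: host and port are assumptions about your setup.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(reply: str) -> dict:
    """Turn 'key<TAB>value' lines from an mntr reply into a dict of strings."""
    metrics = {}
    for line in reply.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that are not key/value pairs
            metrics[key.strip()] = value.strip()
    return metrics
```

For example, four_letter_word("localhost", "ruok") should return imok from a healthy server, and parse_mntr(four_letter_word("localhost", "mntr")) gives a dict you can feed to whatever alerting you already use.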
More information is found here: https://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html#sc_bestPractices
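To see why an odd number of servers is recommended, the majority rule can be worked through in a few lines of Python. This is only a sketch of the arithmetic, with function names chosen for this example; it does not talk to ZooKeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Number of servers that can fail while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(1, 8):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Three and four servers both tolerate a single failure, and five and six both tolerate two, so the even-sized ensemble costs an extra machine without buying extra fault tolerance.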