Goal
How can I check the health of Zookeeper?
Environment
Zookeeper
Guide
Why Zookeeper?
Zookeeper is used as a state store for the Solr cluster: it lets Solr understand the cluster state and holds configuration information. It uses what are called ephemeral znodes, so if the connection with a Solr server is dropped, the znodes associated with that connection are automatically removed as well; this is how Solr understands the state of the cluster. Requests to Solr are NOT routed through Zookeeper. It holds a minimal amount of information and does a minimal amount of processing, and thus has minimal resource requirements to run well.
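As a concrete illustration of those ephemeral znodes: SolrCloud registers each live node under /live_nodes, and Zookeeper's 'dump' four-letter word (answered by the quorum leader) lists sessions together with their ephemeral znodes. The output below is an illustrative sample fed through a here-doc so the snippet runs as-is; against a live ensemble you would pipe `echo dump | nc <leader-host> 2181` instead, and the hostname in the sample is a placeholder:

```shell
# Count Solr live-node ephemerals in sample 'dump' output
# (solr1.example.com is a placeholder hostname).
grep -c '/live_nodes/' <<'EOF'
SessionTracker dump:
Session Sets (1):
1 expire at Fri Nov 22 10:00:10 GMT 2019:
        0x100048351d80003
ephemeral nodes dump:
Sessions with Ephemerals (1):
0x100048351d80003:
        /live_nodes/solr1.example.com:8983_solr
EOF
# prints 1
```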
Best practices
- ZK should always be set up as a quorum of at least 3 zookeepers, and always an odd number. Zookeeper needs a majority of its quorum up to respond to requests (2 of 3, 3 of 5, etc.), so an even number is always a bad option: a 4-node quorum needs 3 nodes up, so it tolerates no more failures than a 3-node quorum while adding overhead
- You do not need a lot of zookeepers. It does not route requests to Solr and does not do a significant amount of processing. Its main overhead is simply the number of connections it has open
- ZK should have very low disk IO latency and fast access to its disk. Do not put production ZK on nodes that do a high amount of IO to the same disk (such as with a Solr server that does any significant indexing). This is the biggest no-no of a ZK setup
- Do not open up lots of unnecessary connections to Zookeeper through clients trying to index to Solr
- Do not use a single load-balanced address for all your ZKs. The best practice is to configure the full quorum address listing every ZK node. We have seen significant disconnects/reconnects on systems that use a single load-balanced address
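For example, a Solr node or client would be given the full quorum connect string rather than a load balancer address, roughly like this (the hostnames and the /solr chroot here are placeholders, not values from this article):

```shell
# Full quorum connect string: every ZK node, comma-separated, with an
# optional chroot suffix. Hostnames below are placeholders.
ZK_HOST="zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181/solr"
echo "$ZK_HOST"
# Solr would then be started with something like: bin/solr start -c -z "$ZK_HOST"
```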
Check on its health
- Zookeeper has a set of admin commands called the four-letter words. Documentation is here: https://zookeeper.apache.org/doc/r3.4.8/zookeeperAdmin.html#sc_zkCommands
- The most common ones we use are 'mntr' and 'cons'. For example:
cmd> echo 'mntr' | nc localhost 9983
zk_version 3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03, built on 06/29/2018 00:39 GMT
zk_avg_latency 0
zk_max_latency 0
zk_min_latency 0
zk_packets_received 4244
zk_packets_sent 4243
zk_num_alive_connections 1
zk_outstanding_requests 0
zk_server_state standalone
zk_znode_count 1199
zk_watch_count 0
zk_ephemerals_count 0
zk_approximate_data_size 3052823
zk_open_file_descriptor_count 43
zk_max_file_descriptor_count 10240
zk_fsync_threshold_exceed_count 0
cmd> echo 'cons' | nc localhost 9983
/127.0.0.1:60301[1](queued=0,recved=59,sent=60,sid=0x100048351d80004,lop=PING,est=1574447048795,to=10000,lcxid=0x38,lzxid=0x45f4,lresp=82867730,llat=0,minlat=0,avglat=0,maxlat=6)
/127.0.0.1:60300[1](queued=0,recved=8011,sent=8090,sid=0x100048351d80003,lop=GETD,est=1574447048686,to=30000,lcxid=0x1f4a,lzxid=0x45f4,lresp=82869974,llat=0,minlat=0,avglat=0,maxlat=7)
/127.0.0.1:60331[0](queued=0,recved=1,sent=0)
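Comparing a single mntr field across the quorum can be scripted with a small loop. A minimal sketch: the live form is shown in the comment (the zk1/zk2/zk3 hostnames are assumptions), and sample mntr output is fed via printf so the snippet runs standalone:

```shell
# Against a live quorum (hostnames are assumptions):
#   for h in zk1 zk2 zk3; do
#     echo mntr | nc "$h" 2181 | awk '$1=="zk_znode_count"{print $2}'
#   done
# Sample mntr output stands in here so the snippet runs as-is:
printf 'zk_server_state follower\nzk_znode_count 1199\n' \
  | awk '$1=="zk_znode_count"{print $2}'
# prints 1199
```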
- We tend to look for high max latency numbers, the total number of alive connections, and whether or not the znode counts are the same across the quorum. High latency can come either from a slow ZK or from slow Solr connectivity; usually it is Solr that is slow, unless you have disk IO issues on ZK.
- Checking resources: be very sensitive to any monitoring view that shows IO wait on your ZK system, or a slow disk.
- It is possible for the connection limit to be set too low: maxClientCnxns, the limit on concurrent connections from a single host, defaults to 60
- Beware fsync warnings in your logs. They show you have a slow disk and ZK is having trouble syncing writes to disk in a timely fashion
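Those warnings can be counted with a simple grep. The message text below matches Zookeeper's FileTxnLog slow-fsync warning; the log line is an illustrative sample fed via a here-doc so the snippet runs as-is, and the log path in the comment is an assumption (it varies by install):

```shell
# On a real system (log path is an assumption):
#   grep -c "fsync-ing the write ahead log" /var/log/zookeeper/zookeeper.log
# Illustrative sample log line stands in here:
grep -c "fsync-ing the write ahead log" <<'EOF'
2019-11-22 10:00:01,123 [myid:1] - WARN [SyncThread:1:FileTxnLog@338] - fsync-ing the write ahead log in SyncThread:1 took 1520ms which will adversely effect operation latency.
EOF
# prints 1
```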
Cause:
N/A