ZooKeeper disconnects from Solr

In some situations ZooKeeper drops the connections that Solr holds to it. One likely cause is that the overseer queue (the /overseer/queue znode) has grown so large that ZooKeeper disconnects any client that tries to read it.

To test for this, go to your ZooKeeper installation and use ZooKeeper's own zkCli.sh (not Solr's zkcli.sh) to connect to your ZooKeeper ensemble. Then run cmd> ls /overseer/queue. If the client is disconnected with a "Packet len is out of range" error, you have hit this problem.
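The check above can be sketched as a small shell script. This is only an illustration under stated assumptions: the ZK_HOME and ZK_HOSTS defaults and the is_packet_len_error helper are hypothetical names, and the pattern assumes the client prints the packet length between "Packet len" and "is out of range", so it matches loosely.

```shell
#!/bin/sh
# Hypothetical defaults; point these at your own installation and ensemble.
ZK_HOME="${ZK_HOME:-/opt/zookeeper}"
ZK_HOSTS="${ZK_HOSTS:-localhost:2181}"

# Succeeds if the given client output contains the packet length error.
# Matches loosely because the reported length may sit inside the message.
is_packet_len_error() {
  printf '%s\n' "$1" | grep -q "Packet len.*out of range"
}

if [ -x "$ZK_HOME/bin/zkCli.sh" ]; then
  # Run a single ls command and capture stdout and stderr together.
  output=$("$ZK_HOME/bin/zkCli.sh" -server "$ZK_HOSTS" ls /overseer/queue 2>&1)
  if is_packet_len_error "$output"; then
    echo "overseer queue listing exceeds the client buffer"
  else
    echo "no packet length error seen"
  fi
else
  echo "zkCli.sh not found under $ZK_HOME/bin" >&2
fi
```

Keeping the pattern match in its own function makes it easy to try against a saved client log before pointing the script at a live ensemble.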

Solution: Add the JVM parameter -Djute.maxbuffer=49107800 to your zkCli.sh script. If the connection still fails, use a bigger number. This tells the client to use a larger buffer so that it can read the znode without being disconnected. Once you can list /overseer/queue, shut down all Solr nodes and then delete the queue from zkCli.sh with cmd> rmr /overseer/queue. Do not delete the queue while any Solr node is still running.
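One way to apply the parameter without editing the script is sketched below. It assumes that your version of zkCli.sh passes the JVMFLAGS environment variable through to the JVM; check the script to confirm, and edit its java command line directly if it does not. ZK_HOME, ZK_HOSTS and the jute_flag helper are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical defaults; point these at your own installation and ensemble.
ZK_HOME="${ZK_HOME:-/opt/zookeeper}"
ZK_HOSTS="${ZK_HOSTS:-localhost:2181}"

# Build the JVM option for a given buffer size in bytes.
jute_flag() {
  echo "-Djute.maxbuffer=$1"
}

# Assumption: zkCli.sh appends $JVMFLAGS to its java command line.
JVMFLAGS="$(jute_flag 49107800)"
export JVMFLAGS

if [ -x "$ZK_HOME/bin/zkCli.sh" ]; then
  # Confirm the listing now succeeds; the output can be very long, so discard it.
  "$ZK_HOME/bin/zkCli.sh" -server "$ZK_HOSTS" ls /overseer/queue > /dev/null
  # Only after every Solr node has been shut down, uncomment to delete the queue:
  # "$ZK_HOME/bin/zkCli.sh" -server "$ZK_HOSTS" rmr /overseer/queue
else
  echo "zkCli.sh not found under $ZK_HOME/bin" >&2
fi
```

The delete is left commented out on purpose so that the script cannot remove the queue while Solr is still running. On ZooKeeper 3.5 and later, rmr is deprecated in favor of deleteall.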

Deleting the queue may take a while. Once it has been deleted, you can restart Solr and it should come back up.

