Bringing up downed Solr servers that don't want to come up

Sometimes, after you restart Solr, the servers do not come back up. They are listed as "Down" in Cloud and never recover.

Here are some things to try when getting servers back up, starting with the gentlest:

1) The gentlest option is to issue a REQUESTRECOVERY on each core: http://<host>:<port>/solr/admin/cores?action=REQUESTRECOVERY&core=<corename>

Sometimes this won't work.
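If a node hosts many cores, the request in step 1 can be scripted. The following is a minimal sketch, assuming Python 3 and plain HTTP with no authentication; the function names are made up for illustration, and only the URL pattern comes from step 1.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def recovery_url(host, port, core):
    """Build the CoreAdmin REQUESTRECOVERY URL for one core."""
    query = urlencode({"action": "REQUESTRECOVERY", "core": core})
    return f"http://{host}:{port}/solr/admin/cores?{query}"


def request_recovery(host, port, cores, timeout=30):
    """Send REQUESTRECOVERY for each core and return {core: HTTP status}."""
    results = {}
    for core in cores:
        with urlopen(recovery_url(host, port, core), timeout=timeout) as resp:
            results[core] = resp.status
    return results
```

For example, request_recovery("solr1.example.com", 8983, ["collection1_shard1_replica1"]) would send one request; the host and core name here are placeholders. An HTTP 200 only means Solr accepted the request, so keep watching Cloud to see whether the core actually recovers.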

2) Check whether the overseer queue is backed up in ZooKeeper. This is a known bug in older versions of Solr (I'll have to update this with the exact version and bug number). To clear it, bring Solr down, delete /overseer/queue in ZooKeeper (check all the queues under /overseer), and then bring your Solr servers back up. This should leave the cluster in a clean state so that it can come back online.
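To judge whether a queue is backed up before deleting anything, it helps to count the entries under each node in /overseer. The sketch below assumes a ZooKeeper client object that exposes get_children(path) and delete(path, recursive=True), as kazoo's KazooClient does; the helper names are hypothetical. Note that clear_queue empties a queue node rather than removing the node itself, which is a cautious variation on the step above. If your Solr cluster uses a ZooKeeper chroot, the paths are relative to it.

```python
def overseer_queue_sizes(zk, root="/overseer"):
    """Return {path: number of entries} for each node directly under root.

    zk is any started client with get_children(path), e.g. kazoo's KazooClient.
    """
    sizes = {}
    for name in zk.get_children(root):
        path = f"{root}/{name}"
        sizes[path] = len(zk.get_children(path))
    return sizes


def clear_queue(zk, path):
    """Delete every entry under a queue node and return how many were removed.

    Run this only while Solr is down, as described in step 2.
    """
    removed = 0
    for child in zk.get_children(path):
        zk.delete(f"{path}/{child}", recursive=True)
        removed += 1
    return removed


# Example wiring with kazoo (third-party; host name is a placeholder):
#   from kazoo.client import KazooClient
#   zk = KazooClient(hosts="zk1.example.com:2181")
#   zk.start()
#   print(overseer_queue_sizes(zk))
#   clear_queue(zk, "/overseer/queue")
#   zk.stop()
```

A queue with thousands of entries that keeps growing is the symptom to look for; a handful of entries is normal.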

3) If a shard has no leader, take down all the servers for that shard and bring up just one of them. After it restarts, wait for it to become leader. Once leadership has been assigned, start the other replicas to bring the whole shard back up.
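To find out which shards have no leader, one option is to read the cluster state programmatically instead of scanning Cloud by eye. This sketch assumes a Solr version that provides the Collections API CLUSTERSTATUS action and a JSON response in which each replica has a "state" field and the leader carries "leader": "true"; the function names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_cluster_status(host, port, timeout=30):
    """Fetch CLUSTERSTATUS as a dict (assumes plain HTTP, no authentication)."""
    url = f"http://{host}:{port}/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def shards_without_leader(status):
    """Return sorted (collection, shard) pairs with no active leader replica."""
    missing = []
    for coll_name, coll in status["cluster"]["collections"].items():
        for shard_name, shard in coll["shards"].items():
            has_leader = any(
                str(replica.get("leader", "")).lower() == "true"
                and replica.get("state") == "active"
                for replica in shard["replicas"].values()
            )
            if not has_leader:
                missing.append((coll_name, shard_name))
    return sorted(missing)
```

Calling shards_without_leader(fetch_cluster_status("solr1.example.com", 8983)) against any live node (the host is a placeholder) returns the shards that need the treatment in step 3. After restarting a single replica, rerun it until that shard drops off the list before starting the others.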

4) Sometimes a replica keeps reporting "sync failed" and won't come up. Provided the shard has a leader from which the full index can be downloaded, you can bring that server down, delete its index, and bring it back up.
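As a more cautious variant of step 4, the index can be moved aside rather than deleted, so it can be restored if the leader turns out to be unhealthy. This is a sketch only: it assumes the core keeps its index under a "data" directory inside the core directory, which depends on how dataDir is configured in your installation, and the function name is hypothetical. Run it only while Solr is stopped on that server.

```python
import shutil
import time
from pathlib import Path


def set_aside_index(core_dir, suffix=None):
    """Move <core_dir>/data out of the way and return the new path.

    Assumes the default layout where the index lives under <core_dir>/data.
    """
    data_dir = Path(core_dir) / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no data directory at {data_dir}")
    suffix = suffix or time.strftime("%Y%m%d%H%M%S")
    backup = data_dir.with_name(f"data.bak-{suffix}")
    shutil.move(str(data_dir), str(backup))
    return backup
```

Once the replica has come back up and finished replicating from the leader, the set-aside directory can be removed to reclaim disk space.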

 
