How do I load test my Solr or Fusion cluster against high query loads?
This example gives you a good base that you can customize to read your search query logs and replay them at high query rates to see how your cluster behaves. We use JMeter for the load testing.
I've attached a project that has a Java class which can be customized to feed in your queries and then let the JMeter framework fire them against your search cluster.
Here are the steps on how to run the load test. We use Solr 5.2.1 in this example.
- Download JMeter 2.13 from: http://mirror.olnevhost.net/pub/apache//jmeter/binaries/apache-jmeter-2.13.tgz
- After extracting JMeter, go to the lib/ folder and remove httpclient-4.2.6.jar and httpcore-4.2.5.jar. SolrJ requires newer versions of these two JARs, and the newer versions are not backward compatible with the ones bundled with JMeter.
- Customize the load testing script. More on this later.
- Build the project by running: mvn clean package. At this point the customized load testing script is ready to be fed into JMeter to run the tests.
- To load it into JMeter, copy the JAR into JMeter's classpath: cp ~/solr-load-test/target/solr-load-test-1.0-SNAPSHOT-jar-with-dependencies.jar ~/jmeter/apache-jmeter-2.13/lib/ext/
- To run the JMeter load test:
./jmeter -n -t ~/solr-load-test/src/main/resources/query.jmx -l query.log
query.log (the -l option) is where we log anything from the script.
query.jmx (the -t option) is the configuration file in which we specify the various parameters for the test. The -n option runs JMeter in non-GUI mode.
Now more on the load generation script, which is called SimpleQuerySampler. Essentially, you should load your query logs in the setupTest method of the class. The runTest method can then use those queries and fire them against your search cluster. Each call to runTest should fire one query, and the total number of times JMeter calls it is configurable.
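The load-once, fire-one-query-per-call pattern described above can be sketched with plain Java. This is only an illustration and not the code from the attached project: the QueryFeeder class, its methods, and the one-query-per-line log format are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of the attached project): holds the loaded queries. */
final class QueryFeeder {
    private final List<String> queries;
    private final AtomicLong cursor = new AtomicLong();

    /** Keeps trimmed, non-blank lines; assumes each log line holds exactly one query. */
    QueryFeeder(List<String> logLines) {
        List<String> cleaned = new ArrayList<>();
        for (String line : logLines) {
            String query = line.trim();
            if (!query.isEmpty()) {
                cleaned.add(query);
            }
        }
        if (cleaned.isEmpty()) {
            throw new IllegalArgumentException("no queries found in the log");
        }
        this.queries = cleaned;
    }

    /** Thread-safe: returns the next query and wraps around at the end of the list. */
    String next() {
        long index = cursor.getAndIncrement();
        return queries.get((int) (index % queries.size()));
    }

    /** Number of usable queries that were loaded. */
    int size() {
        return queries.size();
    }
}
```

With something like this, setupTest could build the feeder once, for example from Files.readAllLines(Paths.get(pathToQueryLog)), and each runTest call would take feeder.next() and send that single query to the cluster. Note that the feeder wraps around, so a short log means the same queries get repeated.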
Now that we have the load test running, let's cover some JMeter configurations that you can set in the query.jmx file.
- Since the script can fire queries against standalone Solr, SolrCloud or Fusion, you need to specify your endpoint. Use the mode configuration to specify your search cluster. The options are solr, solrcloud and fusion.
- For mode=solr you need to additionally provide the COLLECTION and SOLR_URL parameters.
- For mode=solrcloud you need to additionally provide the ZK_HOST parameter.
- For mode=fusion you need to additionally provide QUERY_PIPELINE, username and password.
- You can use LoopController.loops to specify the number of times each thread will call the runTest method.
- You can use ThreadGroup.num_threads to specify how many threads JMeter should use to fire queries against your cluster.
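To make the per-mode requirements above concrete, here is a minimal sketch of a check that reports which parameters are missing for a given mode. The parameter names come from the list above; the ModeParams class and its missing method are hypothetical and are not taken from the attached project, which may validate its configuration differently or not at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: maps each mode to the extra parameters it needs. */
final class ModeParams {
    private static final Map<String, List<String>> REQUIRED = new HashMap<>();
    static {
        REQUIRED.put("solr", Arrays.asList("COLLECTION", "SOLR_URL"));
        REQUIRED.put("solrcloud", Arrays.asList("ZK_HOST"));
        REQUIRED.put("fusion", Arrays.asList("QUERY_PIPELINE", "username", "password"));
    }

    /** Returns the required parameter names that are absent or blank for the mode. */
    static List<String> missing(String mode, Map<String, String> params) {
        List<String> required = REQUIRED.get(mode);
        if (required == null) {
            throw new IllegalArgumentException("unknown mode: " + mode);
        }
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            String value = params.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Running a check like this before the test starts means a misconfigured query.jmx fails fast with a clear message, rather than partway into a long run.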
Lastly, I'd like to cover a few things that we should keep in mind while running these tests:
- Running it for an extended period of time is key. If we run it for too short a period, we might miss problems, as things like high GC activity might only show up after the tests have been running for a long time.
- Generating the queries is key. If we have too few unique queries and keep reusing them, Solr will happily cache them. You might see wonderful test results that do not hold up in real life.
- We are using the SolrJ client library to communicate with Solr. It uses a binary protocol (javabin) for its communication. However, if you are not a Java shop and your application gets results back as XML or JSON, you should simulate getting results back in that format.
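For the point about query variety, one rough way to judge a query set before a run is to measure what fraction of its entries are distinct. The sketch below is an assumption-laden illustration: QueryLogStats is a hypothetical class, and exact string equality is only a proxy for what Solr's caches would treat as the same request.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for sanity-checking a query set before a load test. */
final class QueryLogStats {
    /** Fraction of entries that are distinct, in (0, 1]; 1.0 means no repeats. */
    static double uniqueRatio(List<String> queries) {
        if (queries.isEmpty()) {
            throw new IllegalArgumentException("empty query list");
        }
        Set<String> distinct = new HashSet<>(queries);
        return (double) distinct.size() / queries.size();
    }
}
```

A ratio far below what you see in production logs suggests the test will hit the caches more often than real traffic would.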
Hopefully this guide helps people get started with running load tests. These tests could also be extended to benchmark document indexing.