Default output behavior
By default, Solr writes its log to the console (stderr), so a quick way to capture it is to run: java -jar start.jar 2> output.log. Note that this gives you no log rolling. It's an easy way to get started while developing, but it's not recommended for production environments.
Solr logs through the SLF4J API, which means that many different logging frameworks can be plugged in easily. By default it ships with java.util.logging, but it can be switched to, for example, Log4j by swapping in the proper jars. See http://wiki.apache.org/solr/SolrLogging
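To get log rolling with the default java.util.logging backend, you can point the JVM at a logging configuration file with -Djava.util.logging.config.file=logging.properties. The following is a minimal sketch; the file path, size limit, and file count are examples to adjust for your layout:

```properties
# logging.properties: roll over logs/solr0.log .. solr4.log at ~10 MB each
# (pattern, limit and count are example values)
handlers = java.util.logging.FileHandler
.level = INFO
java.util.logging.FileHandler.pattern = logs/solr%g.log
java.util.logging.FileHandler.limit = 10000000
java.util.logging.FileHandler.count = 5
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
```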
What's included in the logs
Of course, the logs will show different things depending on the configuration. In this doc, I'll only cover entries that are shown by default (at INFO level) and that I consider helpful for operations.
Immediately after starting Solr, some lines will show information related to the configuration. This can be very helpful for troubleshooting, especially during the development stage of a project.
Solr Home is the directory where the configuration of Solr's different cores is located. The solr.xml file lives in the "solr home" directory and links to where each core's configuration is located. Typically, the cores sit inside the "solr home", each under a directory with the same name as the core. One of the first things the logger will show is information about the "Solr Home" directory. For example:
After the solr.xml file is located, Solr will display information about the directory to use for each of the cores.
For information about how to set the Solr Home see http://wiki.apache.org/solr/SolrInstall
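As a quick reminder (the path below is a placeholder), the Solr Home can be set via the solr.solr.home system property when starting the example Jetty distribution:

```
java -Dsolr.solr.home=/path/to/solr/home -jar start.jar
```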
During bootstrap, Solr will display all the external jars added to the classpath. Those jars can be added or removed through the <lib> directives in the solrconfig.xml file. If a feature that requires an external jar (for example, language identification) is failing with a ClassNotFoundException or a related error, a typical cause is that the jar is not being correctly added to the classpath. These lines look like:
After the jars, Solr will display information about the configuration used for each core, including the field types, the default field, the uniqueKey, the default operator, and all the solrconfig.xml information (request handlers, response writers, etc.).
This is all interesting information that can be extracted from the logs, especially for troubleshooting during development. Note how the last two lines show where the "request" logs are written (not the log this document describes, but the one where Jetty records every request to the server), plus the port where Solr is running.
A typical search in Solr will also be logged at INFO level, like the following example:
The log line shows the path where the request was issued (which reveals the target request handler) plus all the parameters passed explicitly in the query. Parameters configured in the solrconfig file are not displayed in this line. After the parameters, the log line shows the number of hits for the query, the status (0 meaning OK), and the time spent on the request in milliseconds.
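Because the hits/status/QTime trio always appears at the end of the line, request log lines are easy to parse mechanically, which is handy when scanning logs for slow queries. The following is a sketch assuming the typical layout; the sample line is made up, not copied from a real server:

```python
import re

# An illustrative Solr request log line (not from a real server).
LINE = ("INFO: [core0] webapp=/solr path=/select "
        "params={q=title:solr&rows=10} hits=42 status=0 QTime=3")

def parse_request_line(line):
    """Return (hits, status, qtime_ms) extracted from a request log line."""
    m = re.search(r"hits=(\d+) status=(\d+) QTime=(\d+)", line)
    if m is None:
        raise ValueError("not a request log line")
    return tuple(int(g) for g in m.groups())

hits, status, qtime = parse_request_line(LINE)
```

A one-liner like this, applied over a whole log file, makes it trivial to flag every request whose QTime exceeds some threshold.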
When using distributed search, the log output is slightly different than for single-instance requests, simply because distributed search works differently. When a Solr instance (let's say server A) receives a distributed request, it distributes the request to all of the Solr instances specified in the "shards" parameter (for example, servers B and C). A sends along all the original parameters except "shards", and adds a parameter called "isShard" so that B and C know the request is actually part of a distributed search. B and C respond to A with the ids and the score of each document matching the query (when sorting by score; otherwise the sorting fields are included in the response, and the "fl" parameter changes accordingly). With this information, A sorts by score and requests from B and/or C the documents with the ids that made the top N list. Once B and C respond with those documents, A answers the user's query with the list of top-ranked documents for the search.
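The parameter rewriting that A performs for the first phase can be sketched as follows. This only mirrors the description above with plain dictionaries; the exact parameters added vary by Solr version, and the hostnames and fields are made up:

```python
# The client's original request to server A (example values).
original = {
    "q": "title:solr",
    "rows": "10",
    "fl": "id,title,price",
    "shards": "serverB:8983/solr,serverC:8983/solr",
}

def shard_request(params):
    """Derive the per-shard query: drop 'shards', flag 'isShard',
    and in the first phase ask only for ids and scores."""
    shard_params = {k: v for k, v in params.items() if k != "shards"}
    shard_params["isShard"] = "true"
    # First phase: only document ids and scores are needed for merging.
    shard_params["fl"] = "id,score"
    return shard_params

per_shard = shard_request(original)
```

In the second phase, A issues a follow-up request per shard asking for the full documents of the ids that made the merged top N.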
Continuing with the above example, the server A will receive a request like:
A will distribute the request to B and C and they will log something like this:
This looks very similar to a regular request, but it includes some parameters that were added automatically. The line also logs the number of hits for this shard, the status, and the query time for this shard. Note that these are partial QTime and partial hits, covering just this shard.
A will get the responses from B and C and sort the results by score. Once it has done so, it will request specific documents from them, which is why you'll see them log a line like:
Finally, A will respond to the distributed search and log:
This line will not show the total number of matches for the search, but it will show the status and the total query time for the request.
This is a typical log section for "adds":
The first line shows the IDs of the documents that were added in the request. If the number of documents added is large, not all of the IDs will be displayed; instead, just the first ones plus the total count are shown. It also shows the status (the first number, 0 = OK) and the request time in milliseconds. The last line of the snippet shows the path of the request plus the parameters used, and (again) the status and the time spent in ms.
A commit operation will be displayed like:
This shows when the commit was issued and the parameters used for it.
This information belongs to the "deletion policy". The deletion policy determines when a commit point is going to be deleted (the default is to keep just one commit; this can be configured through solrconfig.xml). The two lines show the two existing commits (the old one and the one just created) with the files they contain. Currently Solr can't make much use of older commit points.
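For reference, the deletion policy is configured in solrconfig.xml. The following sketch shows the default-like settings (values here are illustrative):

```xml
<!-- solrconfig.xml: keep only the latest commit point -->
<deletionPolicy class="solr.SolrDeletionPolicy">
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>
```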
This line shows the version of the last commit. This is going to be used by the replication handler (along with the index generation).
The name of the Searcher instance that was created after the commit. Remember that a new SolrIndexSearcher is created after each commit.
In Solr, caches are owned by the Index Searcher. Each Index Searcher holds its own set of caches, and when a new one is created (after a commit operation, for example), it automatically warms its caches with items taken from the existing Index Searcher (the one currently serving queries). That's what is being logged in the lines above. The second line displays the new searcher (the one being warmed) and the old (current) searcher. The third line shows information about the actual cache of the old searcher.
The fifth and the sixth lines show information about the new index searcher and the new cache after warming it.
This is repeated for each of the different Solr caches (fieldValueCache, filterCache, documentCache and queryResultCache).
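How much warming happens is controlled per cache in solrconfig.xml via the autowarmCount attribute. A sketch for the filterCache (the sizes are example values):

```xml
<!-- solrconfig.xml: copy the 128 most recently used filterCache
     entries into each newly opened searcher during warming -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```

Larger autowarmCount values make new searchers faster on their first queries, at the cost of a longer warm-up after each commit.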
This means that the new searcher has been registered. All new search requests will be served by the newly created index searcher (which means the newly added documents are now available).
This means that the old searcher was closed, once all the requests it was processing had been answered. It also displays the final state of the searcher's caches; this information is sometimes valuable for sizing caches.
This means that the commit operation was finished in 657 ms (status 0 means OK).
With no changes in the master
The slave will periodically ask the master for the version and generation of its index. For each of those requests, the master will log:
and if there have been no changes (the master's version and generation are the same as the slave's), the slave will log:
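The polling interval that drives these periodic requests is set in the slave's solrconfig.xml. A sketch of the relevant fragment (the master URL and interval are example values):

```xml
<!-- solrconfig.xml on the slave: poll the master every 60 seconds -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```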
With changes in the master
When replication is enabled and there are changes in the master, Solr will log all the steps it performs to copy the master's index to the slave and make it available for searches. Those steps are described here: http://wiki.apache.org/solr/SolrReplication#How_does_it_work.3F
This shows the version and generation of the indexes. In this case, the master's generation is greater, which means that the index will be copied to the slave.
The slave will then ask the master for its list of files. The request will look like this in the master:
And after the response, the slave will compare that list against the files already present in its own index to determine which ones it is missing. The slave will output lines like:
In this case, the master has a total of 34 files, but some of them don't need to be downloaded, as they are already present in the slave (from a previous replication), for example _2_0.frq and _1_0.frq.
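The slave's decision is essentially a set difference: download only the files reported by the master that are not already present locally. A sketch, using made-up file names in the spirit of the log lines above:

```python
# Files reported by the master vs. files already on the slave
# (illustrative names, not from a real index).
master_files = {"_2.fdt", "_2.fdx", "_2_0.frq", "_1_0.frq", "segments_3"}
slave_files = {"_2_0.frq", "_1_0.frq", "segments_2"}

# Only the missing files need to be fetched.
to_download = sorted(master_files - slave_files)
```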
After this, the slave will request the files that need to be downloaded. There will be one request per file, which on the master will look like:
After the new files are downloaded, the slave will proceed to commit its own index (to make the new index available for searches), so the log will display lines exactly like the ones described in the "commit" section.