Issue:
Which types of queries are taking the longest on my system, and how can I find that out?
Environment:
Fusion, Solr
Resolution:
Customers often want to know which types of queries take the longest on their system. An easy way to find out from the Solr logs is to combine grep, awk, and sort on the command line. I have found the following command extremely useful (you can put it into a script and/or make solr.log an argument to that script):
grep '/select' solr.log | awk '{print $NF" "$11}' | awk -F 'QTime=' '{print $2}' | awk '{print $2"|"$1}' | sort -rn -t\| -k 2,2 | head -200
Piece by piece, this gets all '/select' lines in the Solr log and prints the last token (the QTime) first, followed by (hopefully) the parameters token. Different versions of Solr format their log lines differently, so the $11 may need to change. You are looking to print the token in the logs that corresponds to:
params={q=....}
$11 just means the eleventh token when the line is split on whitespace, so you need to look at the file and see where that token falls in the '/select' lines of the log.
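Rather than counting tokens by eye, you can have awk report the position. This is a minimal sketch, assuming the parameters token starts with params= and contains no whitespace; the function name find_params_field is made up for illustration.

```shell
# Print the position and content of the params= token on the first
# '/select' line of the given log file.
# Assumes log lines are whitespace-separated, as the command above does.
find_params_field() {
  grep -m 1 '/select' "$1" |
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^params=/) print i ": " $i }'
}

# Usage: find_params_field solr.log
```

The number it prints is the one to use in place of 11 in the command above.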
The command then splits that output on QTime, which lets you print the query parameters next to their QTime, sorted from slowest to fastest, and list the top N results (in this case 200).
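If you want the script version mentioned earlier, one possible sketch follows. It is a variant, not the exact command above: it finds the params= and QTime= tokens by name instead of by position, so the $11 does not need adjusting, and it prints the QTime first so the sort does not rely on a delimiter. The function name slowest_queries and its arguments are assumptions for illustration.

```shell
# slowest_queries LOGFILE [N]
# List the N slowest '/select' requests (default 200) as "QTime params".
# Assumes each log line carries whitespace-separated params=... and
# QTime=... tokens; lines missing either token are skipped.
slowest_queries() {
  log=$1
  top=${2:-200}
  grep '/select' "$log" |
    awk '{
      params = ""; qtime = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^params=/) params = $i
        else if ($i ~ /^QTime=/) qtime = substr($i, 7)  # drop "QTime="
      }
      if (params != "" && qtime != "") print qtime " " params
    }' |
    sort -rn | head -n "$top"
}

# Usage: slowest_queries solr.log 50
```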
Just remember: sometimes long query times are caused by long queries. Sometimes they are caused by system problems and have nothing to do with the queries themselves. This approach is really only useful for diagnosing the former.
Cause:
Comments
If it helps someone, here is another similar approach:
It first looks for "long" queries (1000 ms or more, that is, a QTime of four or more digits).
It then filters out lines containing distrib=false (-v inverts the match; remove -v, or change the filter, as you like).
Finally, it highlights either the &q=... parameter up to the next ampersand, or the distrib string.
With -o, grep should then return only the matched strings.
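The command this comment describes is not shown, so the following is only one possible reading of those steps, not the commenter's actual command. The function name long_queries, the exact patterns, and the choice to also match a q= that directly follows the opening brace are all assumptions.

```shell
# long_queries LOGFILE
# One possible reconstruction of the steps described above (assumed patterns).
long_queries() {
  # 1. Keep lines whose QTime has four or more digits (1000 ms or more).
  grep -E 'QTime=[0-9]{4,}' "$1" |
    # 2. Drop per-shard sub-requests; remove this line to keep them.
    grep -v 'distrib=false' |
    # 3. Print only the q parameter (up to the next & or closing brace)
    #    or the distrib setting, one match per line.
    grep -oE '[{&]q=[^&}]*|distrib=[a-z]+'
}

# Usage: long_queries solr.log
```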