Query results mismatch due to incorrect filter order in Solr analysis chain – Lucidworks

Issue

Searches for part numbers ending in an uppercase "S" (e.g., AB12CDES) return no results. During query parsing, Solr stems the term to ab12cde, but the term is stored in the index as ab12cdes, so the two never match. The search fails even when the term is enclosed in double quotes.

Diagnosis

This issue arises from a mismatch in the order of filters used in the Solr schema’s index-time and query-time analyzers. Specifically, in the index analyzer for the _text_ field (typically using the text_general type), the LowerCaseFilterFactory is applied after the stemmer, whereas in the query analyzer, it is applied before the stemmer.

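When the lowercase filter is applied before the stemmer at query time, the analyzer converts AB12CDES to ab12cdes, which is then incorrectly stemmed to ab12cde. The stemmer does this because it assumes that a string ending in "s" is a plural word. At index time the stemmer runs first and sees the still-uppercase AB12CDES; its suffix rules are case-sensitive and match only a lowercase "s", so it leaves the token unchanged, and the lowercase filter then produces ab12cdes. The query token ab12cde therefore does not match the indexed token ab12cdes.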
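
The query-side behavior can be traced filter by filter. The following is only a sketch of what the problematic query analyzer might look like, not the actual schema: the filter list mirrors the one shown under Resolution, and the exact position of LowerCaseFilterFactory (anywhere before the stemmer) is an assumption. The comments show the token after each stage for the input AB12CDES, assuming it matches no stopword or synonym entry.

```xml
<!-- Hypothetical "before" state of the query analyzer; filter list and positions are assumed -->
<analyzer type="query">
  <!-- The part number stays a single token: AB12CDES -->
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
  <!-- Lowercasing runs first: AB12CDES => ab12cdes -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.EnglishPossessiveFilterFactory"/>
  <!-- The stemmer now sees a trailing lowercase "s" and drops it: ab12cdes => ab12cde -->
  <filter class="solr.EnglishMinimalStemFilterFactory"/>
</analyzer>
```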

Environment

Fusion 5.x
Solr schema field type: text_general
Solr field name: _text_
Issue occurs when stemming is enabled at both index and query time

Cause

The inconsistency in filter application order leads to different tokenization between indexing and querying. Specifically:

Index-time analyzer applies the lowercase filter after stemming.
Query-time analyzer applies the lowercase filter before stemming.

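Because query terms are lowercased before they reach the stemmer, the stemmer strips the trailing "s" at query time but leaves the uppercase "S" untouched at index time, so the two analyzers emit different tokens for the same input.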
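
For comparison, the index-time side might look roughly like the sketch below. Only the stemmer-then-lowercase order comes from this article; the remaining filters are assumed for illustration and may differ from the actual text_general definition. The comments trace the input AB12CDES through the last two stages.

```xml
<!-- Hypothetical index analyzer; only the stemmer-then-lowercase order is taken from the article -->
<analyzer type="index">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.EnglishPossessiveFilterFactory"/>
  <!-- The stemmer sees an uppercase "S", which its lowercase suffix rules do not match: AB12CDES is unchanged -->
  <filter class="solr.EnglishMinimalStemFilterFactory"/>
  <!-- AB12CDES => ab12cdes, which is the form stored in the index -->
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

The Analysis screen in the Solr Admin UI shows the output of each filter for a given field and input, which is one way to check a trace like this against the real schema.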

Resolution

To fix the mismatch and preserve expected tokenization, reorder the query-time analysis chain to apply the LowerCaseFilterFactory after the stemmer, matching the index-time analyzer.

Update the query analyzer in the Solr schema configuration as follows:

<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
  <filter class="solr.EnglishPossessiveFilterFactory"/>
  <filter class="solr.EnglishMinimalStemFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>

Note: Always validate schema changes in a lower environment before applying to production.
