Issue
Identical queries to a Fusion collection may return inconsistent results, such as differences in document scoring or result order, even when the data has not changed. This can manifest as small variations in the returned results each time the same query is run.
Diagnosis
This behavior is often seen in environments where Solr collections use Near Real Time (NRT) replicas. Each NRT replica indexes documents locally and commits and merges segments on its own schedule, so at any moment the replicas of a shard can hold different segments and different numbers of deleted-but-not-yet-merged documents. Because those deleted documents still count toward the term statistics used in scoring, the same query sent to different replicas may return different scores and, as a result, a different result order.
To diagnose this:
- Check the type of Solr replicas configured for the affected collection to see if they're NRT.
- Determine whether the inconsistent results correlate with queries being routed to different replicas. If you have two replicas and repeated runs alternate between two distinct sets of results, that is strong evidence of this cause. More generally, in a single-shard collection with n replicas, seeing up to n distinct, recurring sets of results points toward it.
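Both checks can be scripted against Solr directly. The following is a minimal sketch, not a supported tool. It assumes that Solr is reachable at the hypothetical `SOLR_URL` (for example through a port-forward), that the replica `base_url` values reported by Solr are reachable from where the script runs (in a Kubernetes deployment this usually means running it inside the cluster), and that the collection's unique key field is `id`. It reads replica types from the Collections API `CLUSTERSTATUS` response and sends the same query to each core with `distrib=false`, so that each replica answers from its own index only.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Solr is reachable at this address, e.g. through a port-forward.
SOLR_URL = "http://localhost:8983/solr"


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def list_replicas(cluster_status, collection):
    """Return (shard, core, type, base_url) tuples from a CLUSTERSTATUS response."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    replicas = []
    for shard_name, shard in sorted(shards.items()):
        for replica in shard["replicas"].values():
            # Assumption: a replica without a "type" entry is treated as NRT.
            replica_type = replica.get("type", "NRT")
            replicas.append((shard_name, replica["core"], replica_type, replica["base_url"]))
    return replicas


def replica_query_url(base_url, core, query, rows=10):
    """Build a select URL answered by this one core only (distrib=false)."""
    params = urllib.parse.urlencode(
        {"q": query, "fl": "id,score", "rows": rows, "distrib": "false", "wt": "json"}
    )
    return f"{base_url}/{core}/select?{params}"


def ranked_ids(select_response):
    """Reduce a select response to (id, score) pairs in rank order."""
    return [(doc["id"], doc["score"]) for doc in select_response["response"]["docs"]]


def distinct_rankings(per_replica):
    """Count distinct id orderings among the rankings of one shard's replicas."""
    return len({tuple(doc_id for doc_id, _ in ranking) for ranking in per_replica})


def compare_replicas(collection, query):
    """Print each replica's type and the number of distinct rankings per shard."""
    params = urllib.parse.urlencode(
        {"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"}
    )
    status = fetch_json(f"{SOLR_URL}/admin/collections?{params}")
    rankings = {}
    for shard, core, replica_type, base_url in list_replicas(status, collection):
        print(shard, core, replica_type)
        response = fetch_json(replica_query_url(base_url, core, query))
        rankings.setdefault(shard, []).append(ranked_ids(response))
    for shard, per_replica in rankings.items():
        print(shard, "distinct rankings:", distinct_rankings(per_replica))
```

Calling `compare_replicas("my_collection", "title:solr")` (both arguments are placeholders) lists the replica types and reports, per shard, how many different orderings the replicas returned. More than one distinct ranking for a shard whose replicas are all NRT is consistent with the cause described in this article.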
Environment
Fusion 5.x
Cause
Each NRT replica in Solr indexes documents independently, and replicas do not synchronize their term statistics with one another. Because those statistics still include deleted documents until the containing segments are merged, and merges happen at different times on each replica, identical queries routed to different NRT replicas may produce slightly different scores and therefore slightly different result ordering. This is expected behavior in Solr when using NRT replicas and does not indicate a synchronization problem or data loss.
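To illustrate how differing term statistics change a score, the sketch below computes the inverse document frequency (IDF) component of BM25 for the same term on two replicas. It assumes the collection uses Lucene's BM25 similarity (the Solr default in recent versions) and uses the IDF formula from Lucene's `BM25Similarity`. The document counts are invented for illustration and do not come from a real index.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """IDF as computed by Lucene's BM25Similarity.

    doc_freq:  documents containing the term (deleted documents included)
    doc_count: documents with a value in the field (deleted documents included)
    """
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical replica A: deleted documents have already been merged away.
idf_a = bm25_idf(doc_freq=10, doc_count=1000)

# Hypothetical replica B: same live documents, but 50 deleted documents
# (4 of which contain the term) are still present in unmerged segments.
idf_b = bm25_idf(doc_freq=14, doc_count=1050)

print(f"replica A idf: {idf_a:.4f}")
print(f"replica B idf: {idf_b:.4f}")
```

Both replicas hold the same live documents, yet the term is weighted more heavily on replica A than on replica B. When two documents have nearly equal scores, a difference of this size can be enough to swap their order.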
Resolution
To resolve inconsistent query results caused by NRT replica behavior, consider the following actions:
Option 1: Accept minor scoring variations
If small differences in document scores do not impact application behavior or user experience, it is safe to accept this behavior when using NRT replicas.
Option 2: Change replica types to TLOG/PULL for consistency
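For applications where consistent scoring and result ordering are required, configure Solr collections to use TLOG replicas, optionally combined with PULL replicas, instead of NRT replicas. Non-leader TLOG replicas and PULL replicas do not build their own index. They copy index segments from the shard leader, so once replication has caught up, every replica of a shard holds the same segments and computes the same term statistics. Both types may lag slightly behind the leader between replication cycles, which can introduce a slight delay in data visibility. PULL replicas cannot become shard leader, so a collection that uses them also needs TLOG replicas in each shard.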
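At the Solr level, replica types are set through the Collections API. The following sketch only builds the request URLs and does not send them. `SOLR_URL`, the collection name, and the replica counts are placeholders, and how a given Fusion deployment expects collections to be created or altered may differ, so treat this as an outline of the Solr calls involved rather than a procedure. `CREATE` accepts `nrtReplicas`, `tlogReplicas`, and `pullReplicas`. For an existing collection, one approach is to add replicas of the new type with `ADDREPLICA` and then remove the old NRT replicas with `DELETEREPLICA`.

```python
import urllib.parse

# Assumption: Solr is reachable at this address, e.g. through a port-forward.
SOLR_URL = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Solr Collections API URL for the given action and parameters."""
    query = urllib.parse.urlencode({"action": action, **params, "wt": "json"})
    return f"{SOLR_URL}/admin/collections?{query}"


def create_tlog_pull_collection_url(name, num_shards, tlog_replicas, pull_replicas):
    """URL that creates a collection with TLOG and PULL replicas and no NRT replicas."""
    return collections_api_url(
        "CREATE",
        name=name,
        numShards=num_shards,
        nrtReplicas=0,
        tlogReplicas=tlog_replicas,
        pullReplicas=pull_replicas,
    )


def add_replica_url(collection, shard, replica_type):
    """URL that adds one replica of the given type to a shard."""
    if replica_type not in ("nrt", "tlog", "pull"):
        raise ValueError(f"unknown replica type: {replica_type}")
    return collections_api_url("ADDREPLICA", collection=collection, shard=shard, type=replica_type)


def delete_replica_url(collection, shard, replica):
    """URL that removes a replica by its name as listed in CLUSTERSTATUS."""
    return collections_api_url("DELETEREPLICA", collection=collection, shard=shard, replica=replica)


# Placeholder values for illustration only.
print(create_tlog_pull_collection_url("demo", 1, 2, 1))
print(add_replica_url("demo", "shard1", "tlog"))
print(delete_replica_url("demo", "shard1", "core_node1"))
```

Building the URLs separately from sending them makes it easy to review each call before it is run against a staging cluster.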
Note: Changes to Solr replica types can affect indexing throughput, resource consumption, and failover behavior. Review Solr documentation and test all configuration changes in a staging environment before applying them to production.