Issue
When querying fields in nested child documents, Solr may return unexpected or incorrect results. For example, a query for a specific field value in child documents can return documents whose value for that field is different.
Diagnosis
To determine whether this issue applies:
- You are indexing parent-child documents using Solr’s nested document feature.
- You observe inconsistent query results when querying a specific child field using standard queries or the {!parent} block join query parser.
- The inconsistent behavior is more apparent when the Solr collection has multiple shards and replicas.
- Child documents across multiple parents share the same value for the id field.
To confirm, inspect your indexed documents and check whether any child documents under different parents have the same id value.
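The check described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the source documents are available as a list of dictionaries that use the `_childDocuments_` key; the function name `find_shared_child_ids` and the sample documents are hypothetical.

```python
from collections import defaultdict


def find_shared_child_ids(parents):
    """Return {child_id: sorted parent ids} for child ids used under more than one parent."""
    owners = defaultdict(set)
    for parent in parents:
        for child in parent.get("_childDocuments_", []):
            owners[child["id"]].add(parent["id"])
    # Keep only the child ids that appear under two or more distinct parents.
    return {cid: sorted(pids) for cid, pids in owners.items() if len(pids) > 1}


# Hypothetical sample data: both parents reuse the child id "child-001".
docs = [
    {"id": "parent-001", "_childDocuments_": [{"id": "child-001"}, {"id": "child-002"}]},
    {"id": "parent-002", "_childDocuments_": [{"id": "child-001"}]},
]

print(find_shared_child_ids(docs))
# prints {'child-001': ['parent-001', 'parent-002']}
```

An empty result suggests that no child id is shared between parents in the data that was checked.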
Environment
Solr 8.11.2 and above, particularly when indexing and querying nested documents in collections with multiple shards and replicas.
Cause
Solr relies on each document having a unique identifier. When nested (child) documents under different parent documents are indexed with identical id values, the logic Solr uses to collect documents and merge shard responses can return inconsistent results. This is especially likely in distributed collections with multiple shards, where document routing and caching rely on unique document keys.
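The effect can be illustrated with a toy model. The sketch below is not Solr's implementation; it only assumes that a merge step keyed on the id keeps one document per id, and the function `merge_shard_results` and the field `state` are hypothetical.

```python
def merge_shard_results(shard_results):
    """Toy merge: keep the first document seen for each id."""
    merged = {}
    for shard in shard_results:
        for doc in shard:
            # A merge keyed on id cannot hold two documents with the same id.
            merged.setdefault(doc["id"], doc)
    return list(merged.values())


# Two different children on two shards, indexed with the same id.
shard_a = [{"id": "child-001", "state": "CA"}]
shard_b = [{"id": "child-001", "state": "AZ"}]

print(merge_shard_results([shard_a, shard_b]))  # keeps the CA document
print(merge_shard_results([shard_b, shard_a]))  # keeps the AZ document
```

In this model the surviving document depends only on which shard response is processed first, which is one way a repeated query could appear to return different results.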
Resolution
Ensure that all child documents indexed into Solr have unique id values, even if their parent documents are similar. Solr's nested document functionality expects uniqueness to properly associate and retrieve child documents during queries.
Reindexing with unique child IDs
Modify your indexing logic to generate a unique id value for each child document, for example by appending a child-specific suffix to the parent ID or by using UUIDs:
{
  "id": "parent-001",
  "type": "parent",
  "_childDocuments_": [
    {
      "id": "parent-001-child-001",
      "type": "child",
      "workAssignmentsHomeWorkLocationAddressStateCodeValue_s": "CA"
    },
    {
      "id": "parent-001-child-002",
      "type": "child",
      "workAssignmentsHomeWorkLocationAddressStateCodeValue_s": "AZ"
    }
  ]
}
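One way to produce ids in this form before indexing is sketched below. It assumes the documents are built as Python dictionaries with a `_childDocuments_` list; the helper name `with_unique_child_ids` and the three-digit counter format are illustrative choices, not a requirement of Solr.

```python
import copy


def with_unique_child_ids(parent):
    """Return a copy of parent whose children get ids of the form <parent id>-child-NNN."""
    result = copy.deepcopy(parent)  # leave the caller's document untouched
    for n, child in enumerate(result.get("_childDocuments_", []), start=1):
        child["id"] = f"{result['id']}-child-{n:03d}"
    return result


# Hypothetical input in which the child ids would collide with other parents.
doc = {
    "id": "parent-001",
    "type": "parent",
    "_childDocuments_": [
        {"id": "child", "type": "child"},
        {"id": "child", "type": "child"},
    ],
}

fixed = with_unique_child_ids(doc)
print([child["id"] for child in fixed["_childDocuments_"]])
# prints ['parent-001-child-001', 'parent-001-child-002']
```

Because the parent ID is already unique, ids derived from it stay unique across parents. A UUID per child (for example from `uuid.uuid4()`) would also work, but it would not be reproducible between indexing runs.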
Verifying correct query behavior
After reindexing with unique child IDs, run a query like the following and confirm that it returns the expected results:
http://<solr-host>/solr/<collection>/select?
q=workAssignmentsHomeWorkLocationAddressStateCodeValue_s:CA&
fq=id_type:W&
fl=workAssignmentsHomeWorkLocationAddressStateCodeValue_s&
q.op=OR&
indent=true
Add debug=true or debug=timing to the request to inspect the shard responses and confirm that no documents from other shards are matched incorrectly because of ID conflicts.
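When the request is issued from a script, the parameters need to be URL-encoded. The following is a minimal sketch using only the Python standard library; the host `http://localhost:8983`, the collection name `my_collection`, and the helper `build_select_url` are placeholders, not values from your environment.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, params):
    """Build a /select URL with properly encoded query parameters."""
    return f"{base_url.rstrip('/')}/solr/{collection}/select?{urlencode(params)}"


params = {
    "q": "workAssignmentsHomeWorkLocationAddressStateCodeValue_s:CA",
    "fq": "id_type:W",
    "fl": "workAssignmentsHomeWorkLocationAddressStateCodeValue_s",
    "q.op": "OR",
    "indent": "true",
    "debug": "true",  # include debug output in the response
}

# Placeholder host and collection; replace them with your own.
url = build_select_url("http://localhost:8983", "my_collection", params)
print(url)
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlopen(url)`.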
If you use block join queries, make sure the which parameter of {!parent} matches every parent document and no child documents. For example:
http://<solr-host>/solr/<collection>/select?
q={!parent which='id_type:WP'}workAssignmentsHomeWorkLocationAddressStateCodeValue_s:CA&
fl=*,[child]&indent=true
This approach ensures Solr retrieves child documents correctly within their hierarchical context.
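The response to such a query can also be checked programmatically. The sketch below assumes the JSON response has been parsed into a dictionary, that returned children appear under the `_childDocuments_` key of each parent, and that all children of each parent are included in the response (the [child] transformer accepts a limit parameter). The function name and the sample response are hypothetical.

```python
def parents_missing_child_value(response, field, expected):
    """Return ids of returned parents that have no child with field == expected."""
    missing = []
    for parent in response["response"]["docs"]:
        children = parent.get("_childDocuments_", [])
        if not any(child.get(field) == expected for child in children):
            missing.append(parent["id"])
    return missing


FIELD = "workAssignmentsHomeWorkLocationAddressStateCodeValue_s"

# Hand-written sample response, not output from a real Solr instance.
sample = {
    "response": {
        "docs": [
            {"id": "parent-001", "_childDocuments_": [
                {"id": "parent-001-child-001", FIELD: "CA"},
                {"id": "parent-001-child-002", FIELD: "AZ"},
            ]},
            {"id": "parent-002", "_childDocuments_": [
                {"id": "parent-002-child-001", FIELD: "AZ"},
            ]},
        ]
    }
}

print(parents_missing_child_value(sample, FIELD, "CA"))
# prints ['parent-002']
```

A non-empty list points at parents that were returned even though none of their listed children carries the queried value, which is the symptom described in this article.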