Goal
Understand how to use the Solr collapse parameter in a SolrCloud environment within Fusion to return only one document per unique field value.
Environment
Fusion 5.x and above
SolrCloud collections with multiple shards
Guide
Ensure documents are on the same shard
The Solr collapse parameter works on a per-shard basis. If documents sharing the same collapse field value are stored on different shards, the collapse operation will not combine them into a single result set entry.
To ensure proper collapsing behavior across the entire dataset:
Set the router.name parameter to compositeId when creating the collection.
Route documents so that those intended for collapse by a specific field value are stored in the same shard.
If documents that share a collapse field value are distributed across different shards, each shard collapses only its own documents, so the merged results can contain more than one document for that value. Co-locating those documents through document routing avoids this. For more details on document routing in SolrCloud, see the SolrCloud document routing guide.
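As a rough illustration of routing with compositeId, the sketch below prefixes each document ID with its collapse field value, separated by "!". With the compositeId router, Solr hashes the part before "!" to choose the shard, so documents with the same prefix land together. The helper name routed_id, the sample values, and the assumption that id is the collection's unique key are all hypothetical; adapt them to your schema and indexing pipeline.

```python
def routed_id(group_value: str, doc_id: str) -> str:
    """Build a compositeId-style document ID of the form '<group_value>!<doc_id>'.

    Assumption: the collection uses router.name=compositeId, so the text
    before '!' determines the target shard.
    """
    if "!" in group_value:
        # A '!' inside the prefix would be read as an extra routing separator.
        raise ValueError("group_value must not contain '!'")
    return f"{group_value}!{doc_id}"


# Hypothetical documents: both share group_field "sku-123", so both IDs
# carry the same routing prefix and are sent to the same shard.
docs = [
    {"id": routed_id("sku-123", "a1"), "group_field": "sku-123"},
    {"id": routed_id("sku-123", "a2"), "group_field": "sku-123"},
    {"id": routed_id("sku-456", "b1"), "group_field": "sku-456"},
]

for doc in docs:
    print(doc["id"])
```

Existing documents keep the shard they were originally routed to, so changing the ID scheme generally means reindexing the affected documents.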
Example collapse parameter usage
To collapse on group_field and keep the highest-scoring document in each group, add the following to your query parameters:
fq={!collapse field=group_field}
For more information on collapse results, refer to the Solr documentation.
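If the query is built in client code, the collapse filter has to be URL-encoded because of the braces, exclamation mark, and space. A minimal sketch, assuming a hypothetical helper collapse_params and a match-all query; how the parameters are sent (Fusion query pipeline, direct Solr request, or a client library) depends on your setup and is not shown:

```python
from urllib.parse import urlencode


def collapse_params(query: str, field: str) -> dict:
    """Return query parameters that collapse results on the given field.

    With no extra options, the collapse filter keeps the highest-scoring
    document in each group.
    """
    return {
        "q": query,
        "fq": f"{{!collapse field={field}}}",
    }


params = collapse_params("*:*", "group_field")

# urlencode escapes the braces, '!' and the space inside the local params.
query_string = urlencode(params)
print(params["fq"])
print(query_string)
```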