Should I use Join or denormalize?

The biggest thing to consider is the set of limitations of Join:
http://wiki.apache.org/solr/Join#Limitations

The most significant ones are:
1) You can't access data from the "from" document the way you can in SQL. Only fields of the "to" documents are returned.
2) You can't score or sort by the "from" document match. Every matching doc gets a score of 1.
3) In a distributed environment, all joined documents need to reside on the same server.
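
For reference, here is a minimal sketch of what a join request looks like, written in Python. The {!join from=... to=...} syntax is Solr's join query parser. The field names (manu_id_s, id) and the helper build_join_params are only illustrative assumptions, not something your schema necessarily has.

```python
from urllib.parse import urlencode


def build_join_params(from_field, to_field, from_query, main_query="*:*"):
    """Build Solr request parameters that restrict main_query with a join.

    The join runs from_query, collects the values of from_field on the
    matching documents, and keeps documents whose to_field has one of
    those values. Only the "to" documents come back in the results.
    """
    join_filter = "{!join from=%s to=%s}%s" % (from_field, to_field, from_query)
    return {"q": main_query, "fq": join_filter}


# Hypothetical example: find manufacturers of products matching "ipod".
params = build_join_params("manu_id_s", "id", "ipod")

# URL-encoded form, ready to append to a /select request.
query_string = urlencode(params)
```

Note that nothing in the request can ask for fields of the "from" documents, which is limitation 1 above.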

Join performance isn't great, and it depends on the number of unique values in the join key (more unique values means slower joins). The general recommendation is to denormalize if you can. Denormalizing obviously costs storage, since the same data is duplicated across documents, but extra storage is usually cheaper than living with significant limitations. If you are worried about performance and still need a join, you should probably look into block joins:

https://issues.apache.org/jira/browse/SOLR-3076
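
As a rough illustration, a block join query uses the {!parent which=...} query parser to return parent documents whose children match a query. The sketch below only builds the query string. The field names (doc_type_s, color_s) and the helper build_block_join_query are assumptions for the example, and it presumes parents and children were indexed together as one block.

```python
def build_block_join_query(parent_filter, child_query):
    """Build a query returning parents whose children match child_query.

    parent_filter must match ALL parent documents in the index (and no
    children), since Solr uses it to find the boundaries of each block.
    """
    return '{!parent which="%s"}%s' % (parent_filter, child_query)


# Hypothetical example: products (parents) that have a red SKU (child).
q = build_block_join_query("doc_type_s:product", "color_s:red")
```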

Here is some data on how block join performs compared to a straight join:

http://blog.griddynamics.com/2012/08/block-join-query-performs.html
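
If you go the denormalization route recommended earlier, the idea is to copy the fields you would have joined on into each document before indexing. Here is a minimal sketch in plain Python, assuming hypothetical product and manufacturer records and a made-up manu_ prefix for the copied fields:

```python
def denormalize(products, manufacturers):
    """Copy manufacturer fields onto each product document.

    Copied fields get a "manu_" prefix so they don't collide with the
    product's own fields. Products with no known manufacturer are kept
    unchanged.
    """
    by_id = {m["id"]: m for m in manufacturers}
    flat_docs = []
    for product in products:
        doc = dict(product)  # copy, so the input is not modified
        manu = by_id.get(product.get("manu_id_s"))
        if manu is not None:
            for key, value in manu.items():
                if key != "id":
                    doc["manu_" + key] = value
        flat_docs.append(doc)
    return flat_docs


# Hypothetical sample data.
manufacturers = [{"id": "apple", "name_s": "Apple", "country_s": "US"}]
products = [
    {"id": "ipod1", "name_s": "iPod", "manu_id_s": "apple"},
    {"id": "misc1", "name_s": "Cable", "manu_id_s": "unknown"},
]

flat = denormalize(products, manufacturers)
```

Each product document now carries the manufacturer data itself, so a single ordinary query can search, score, and sort on it without a join. The cost is the duplicated storage mentioned above, plus reindexing the affected products whenever a manufacturer record changes.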
