A common question that comes in is how many docs/shard should someone have. This has a very multifaceted answer. You might imagine that many clients have hardware with completely different specs, extraordinarily different queries and query and update SLAs which are all completely different, as well as loads put on the machines that are extremely different. So what this really requires is testing on your own setup and comparing against the required SLAs that are unique to you. I will say, in general the order of magnitude this makes sense for most people is generally between 10m-100m docs/shard, but still, everyone is different.
The best strategy for figuring this out is to test what is important to you. Is it update speed (although this mostly just calls for fewer replicas or replicas of non-NRT types)? Is it query speed? So you should choose essentially what I'd call a base machine which is the hardware you intend to put stuff on. Then start testing query speeds for your specific queries on different numbers of docs increasing them as you go on each test. Remember, we are not really needing to test load here, because if you need to add more load, you can add replicas. Once you get performance you feel comfortable with and otherwise starts to deteriorate past what is acceptable, you have a base level of number of docs you want per shard. Now, there will be some overhead for once you start sharding, so you likely want the number of docs to probably be less. The overhead will likely increase somewhat the more you shard. Also by sharding you need to ensure that you have good performance across your cluster, since a query is only as fast as it's slowest member.