The hashDocSet is an optimization specified in solrconfig.xml that enables an int hash representation for filters (docSets) when the number of items in the set is less than maxSize. For smaller sets, this representation is more memory efficient, faster to iterate, and faster to intersect.
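To make the memory claim concrete, here is a rough back-of-the-envelope sketch in Python. It is not Solr's implementation: it assumes the alternative representation is a bitset with one bit per document, that the hash table stores 4-byte ints, and that the table is rounded up to a power of two at a load factor of 0.75. The function names are hypothetical.

```python
def bitset_bytes(max_doc):
    """Bytes for a bitset with one bit per document in the index (assumed layout)."""
    return (max_doc + 7) // 8


def hash_set_bytes(set_size, load_factor=0.75):
    """Bytes for an int hash table holding set_size document ids.

    Assumes 4-byte ints and a table size rounded up to a power of two.
    """
    slots_needed = max(1, int(set_size / load_factor) + 1)
    table_size = 1
    while table_size < slots_needed:
        table_size *= 2
    return 4 * table_size


if __name__ == "__main__":
    max_doc = 1_000_000  # hypothetical index size
    for size in (500, 5_000, 50_000):
        print(size, hash_set_bytes(size), bitset_bytes(max_doc))
```

Under these assumptions, a set of a few thousand ids in a one-million-document index costs far less as a hash than as a bitset, while a set of 50,000 ids costs more. This is why the threshold should scale with the size of the index.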
The hashDocSet max size should be based primarily on the number of documents in the collection: the larger the number of documents, the larger the hashDocSet max size. You will have to do a bit of trial and error to arrive at the optimal number:
- Calculate 0.005 (0.5%) of the total number of documents that you are going to store.
- Try values on either ‘side’ of that value to arrive at the best query times.
- When query times seem to plateau, and performance doesn’t show much difference between the higher number and the lower, use the higher.
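The steps above can be sketched as a small Python helper that computes the starting value and a few values on either side of it to benchmark. The halving and doubling of the starting value is an assumption chosen for illustration, since the procedure does not say how far to step. The function names are hypothetical, and the rendered element follows the `<HashDocSet maxSize="..." loadFactor="..."/>` form found in the example solrconfig.xml of older Solr releases, so check it against your own version.

```python
def candidate_max_sizes(num_docs, fraction=0.005):
    """Return [lower, start, higher] maxSize values to benchmark.

    start is 0.005 of the document count; the lower and higher values
    (half and double) are an arbitrary choice for this sketch.
    """
    start = max(1, round(num_docs * fraction))
    return [max(1, start // 2), start, start * 2]


def hash_doc_set_element(max_size, load_factor=0.75):
    """Render the config line to paste into the query section of solrconfig.xml."""
    return '<HashDocSet maxSize="{}" loadFactor="{}"/>'.format(max_size, load_factor)


if __name__ == "__main__":
    # Hypothetical collection of one million documents.
    for size in candidate_max_sizes(1_000_000):
        print(hash_doc_set_element(size))
```

Run your usual query benchmark against each candidate. If the query times for two neighbouring values are about the same, keep the higher one, as described above.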