I am using Solr result grouping, but it behaves incorrectly.

I grouped on the email field. In my database I have 2 rows for the email address “ashhaf63@hotmail.com”, but Solr shows 7147 in numFound, which is of course wrong; it should be 2.
When I search for a particular email address such as “ashhaf63@hotmail.com” with grouping, the result is correct: numFound shows 2.

I believe this is because of the field type in my Solr schema. I was using the text field type at first, but I have since defined my own field type and am now using that.


After switching to my own field type, I am still facing the same issue.

If you look at your grouping response, you will see that it matched 7147 documents because the group consists of all emails that have “hotmail” in their address, and not the entire email address:
<str name="groupValue">hotmail</str>
This happens because you expect the values in the field to be indexed as complete strings, like “ashhaf63@hotmail.com”, but the definition of your email fieldType tokenizes the field values, which results in multiple indexed values for that field. Specifically, the StandardTokenizerFactory splits a value on all non-alphanumeric characters, so that same email address is indexed as three separate values: “ashhaf63”, “hotmail” and “com”.

Because of this, I would recommend creating a new field that uses a simple string fieldType like the following:
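A minimal sketch of such a type, assuming Solr's built-in `solr.StrField` class, which indexes each value as one untokenized term. The type name `string` follows the stock example schema and is an assumption here, so adjust it if your schema already defines an equivalent type:

```xml
<!-- Assumed definition: solr.StrField keeps the whole value as a single term (no tokenizing) -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
```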
Then create a new field like this:
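As a sketch, the field could be declared as below. The name `emailaddress` is the one used for grouping in this answer; the source field name `email`, the `string` type name, and the `copyField` rule are assumptions about your schema:

```xml
<!-- New untokenized field; relies on a fieldType named "string" (solr.StrField) being defined -->
<field name="emailaddress" type="string" indexed="true" stored="true"/>
<!-- Assumption: the existing tokenized field is named "email"; copy its raw value at index time -->
<copyField source="email" dest="emailaddress"/>
```

A schema change like this only affects documents indexed afterwards, so you would need to reindex. The grouping request would then use `group=true&group.field=emailaddress`.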
Then perform your grouping on this new emailaddress field, which will group on the entire email address value.