Here is a sample table that mimics my scenario:
COL_1  COL_2  COL_3  COL_4  LAST_COL
A      P      X      NY     10
A      P      X      NY     11
A      P      Y      NY     12
A      P      Y      NY     13
A      P      X      NY     14
B      Q      X      NY     15
B      Q      Y      NY     16
B      Q      Y      CA     17
B      Q      Y      CA     18
LAST_COL is the primary key, so its value is different in every row.
I want to ignore LAST_COL and gather some statistics on the other four columns.
Basically, I have millions of rows in my table, and I want to know which combinations of COL_1, COL_2, COL_3 and COL_4 have the most rows.
So I want a query that outputs each unique combination of those four columns along with its number of occurrences. For the sample above, the output would be:
COL_1  COL_2  COL_3  COL_4  TOTAL
A      P      X      NY     3
A      P      Y      NY     2
B      Q      X      NY     1
B      Q      Y      NY     1
B      Q      Y      CA     2
Thanks to anyone who helps me with this.
*I am using MS SQL, if that makes any difference.
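One way to get these counts is to group by the four columns and count the rows in each group. This is only a minimal sketch: the question does not give the table name, so `my_table` below is a placeholder, and the `ORDER BY` is an optional addition that lists the largest groups first.

```sql
-- my_table is a hypothetical name; substitute the real table.
SELECT COL_1, COL_2, COL_3, COL_4,
       COUNT(*) AS TOTAL            -- number of rows sharing this combination
FROM my_table
GROUP BY COL_1, COL_2, COL_3, COL_4 -- LAST_COL is left out, so it is ignored
ORDER BY TOTAL DESC;                -- optional: most frequent combinations first
```

Because LAST_COL appears in neither the SELECT list nor the GROUP BY, its distinct values do not split the groups.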
If you ever want to weed out rows that don’t have a duplicate:
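A sketch of that filter, again assuming the placeholder table name `my_table`: a `HAVING` clause keeps only the combinations that occur more than once.

```sql
-- my_table is a hypothetical name; substitute the real table.
SELECT COL_1, COL_2, COL_3, COL_4,
       COUNT(*) AS TOTAL
FROM my_table
GROUP BY COL_1, COL_2, COL_3, COL_4
HAVING COUNT(*) > 1;  -- drop combinations that appear only once
```

With the sample data from the question, this should drop the two combinations that occur only once (B Q X NY and B Q Y NY). The filter repeats `COUNT(*)` rather than using the `TOTAL` alias, since SQL Server does not allow column aliases in `HAVING`.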