I have a table that has duplicate email addresses. I need to insert just one row per address into a temp table, along with two other fields. There are many examples here, but I can’t get any of them to work.
I ended up looking into MERGE, but I get the same results: all the records are inserted, and I’m at a loss. I tried many different samples, but they always insert all the records. I went back to make sure the email addresses really are dupes, and they are. Below is where I’m at now:
MERGE #EmailTable2 AS Target
USING (SELECT EMAIL, NAME, JOB_TITLE FROM b2b_cmas_list$ WHERE EMAIL IS NOT NULL) AS Source
ON (Target.EMAIL = Source.EMAIL)
WHEN NOT MATCHED BY TARGET THEN
INSERT (EMAIL, NAME, JOB_TITLE)
VALUES (Source.EMAIL, Source.NAME, Source.JOB_TITLE)
OUTPUT $action, inserted.*, deleted.*;
Any help in getting this right would be appreciated.
This is not working because SQL doesn’t know which of the rows containing the same e-mail you want to keep. In other words: if EMAIL is the same, which NAME and JOB_TITLE pair is important, and which can be discarded?
Some hints:
If it doesn’t matter which values are kept, simply group by EMAIL and select MAX(NAME) and MAX(JOB_TITLE), e.g.:
select EMAIL, max(NAME), max(JOB_TITLE) from b2b_cmas_list$ group by EMAIL
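Be warned, however, that this can mix up NAME and JOB_TITLE pairs: each maximum is computed independently, so the two values may come from different rows.
Try using ROW_NUMBER() OVER (PARTITION BY EMAIL ORDER BY ...) to number the rows within each e-mail group, then keep only the first row of each group. The choice is arbitrary unless the ORDER BY makes it deterministic.
Use a CURSOR to iterate over the rows and skip duplicates.
Use a .NET CLR aggregate to, e.g., concatenate the names and job titles for the same e-mail.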
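As a rough sketch of the ROW_NUMBER() hint, something like the following should insert one row per e-mail. It assumes #EmailTable2 already exists with the columns EMAIL, NAME and JOB_TITLE, and the ORDER BY NAME tie-breaker is only an assumption; use whatever ordering decides which row you want to keep.

```sql
-- Assumption: #EmailTable2 (EMAIL, NAME, JOB_TITLE) has already been created.
-- Number the rows within each EMAIL group, then keep only row 1 of each group.
WITH Ranked AS (
    SELECT EMAIL, NAME, JOB_TITLE,
           ROW_NUMBER() OVER (
               PARTITION BY EMAIL   -- one group per e-mail address
               ORDER BY NAME        -- assumed tie-breaker; pick your own rule
           ) AS rn
    FROM b2b_cmas_list$
    WHERE EMAIL IS NOT NULL
)
INSERT INTO #EmailTable2 (EMAIL, NAME, JOB_TITLE)
SELECT EMAIL, NAME, JOB_TITLE
FROM Ranked
WHERE rn = 1;
```

Unlike the MAX() approach, NAME and JOB_TITLE here always come from the same source row. If two rows for one e-mail also share the same NAME, which of them gets rn = 1 is not guaranteed.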
And a little note on your MERGE statement. It does not work as you expect because SQL matches all source rows against the target as it was at the start of the statement, not row by row. So inserting one row for an e-mail, e.g. “a@a.com”, does not stop another source row with the same e-mail from being inserted as well. All that matters is whether “a@a.com” was already in the table when the statement began.
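If you would still like to use MERGE, one possible fix is to deduplicate the source before it reaches the MERGE, so that each e-mail appears only once in Source. This is only a sketch under the same assumptions as the original statement (#EmailTable2 already exists); the ORDER BY NAME tie-breaker is my assumption, not something your data requires.

```sql
-- Sketch: same MERGE as in the question, but Source is reduced to one row per EMAIL first.
MERGE #EmailTable2 AS Target
USING (
    SELECT EMAIL, NAME, JOB_TITLE
    FROM (
        SELECT EMAIL, NAME, JOB_TITLE,
               ROW_NUMBER() OVER (
                   PARTITION BY EMAIL
                   ORDER BY NAME    -- assumed tie-breaker; pick your own rule
               ) AS rn
        FROM b2b_cmas_list$
        WHERE EMAIL IS NOT NULL
    ) AS Ranked
    WHERE rn = 1                    -- keep a single row per e-mail
) AS Source
ON (Target.EMAIL = Source.EMAIL)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (EMAIL, NAME, JOB_TITLE)
    VALUES (Source.EMAIL, Source.NAME, Source.JOB_TITLE)
OUTPUT $action, inserted.*;         -- deleted.* is left out because only inserts occur
```

With a unique Source, the ON condition only has to guard against e-mails that were already in #EmailTable2 before the statement ran, which is exactly what MERGE checks.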