Got a slightly tricky problem I’d like some advice on. I have a source SQL table that I need to separate into two separate tables. The source contains a bunch of columns, including a postal address.
What I need to do is add the rows containing first instance of any address to the first table, Then any subsequent occurrences of the same address (if any) are to be added to the second table, with an additional column containing the ID of the row in the first table (for joining purposes).
Can anyone suggest how to do this? I’d prefer to keep it all within T-SQL (MS SQL Server 2008), but if necessary I can knock together some C# as part of the process.
You should be able to create a suitable CTE (Common Table Expression) do divide those addresses apart. You didn’t provide much details – so I tried to make up some column and table names – adapt those as needed.
Try this – it will delete all duplicates from your table:
With this CTE, you “partition” your data according to a list of columns – use those you need to use. Each partition of data (e.g. each address defined by
Street,ZipCode,City) will get sequential numbers, based on some ordering (that you define in the CTE – I choseDateLastChangeas one possibility).So the rows with
RowNum = 1make up the set of “first” addresses – instead of just aSELECT (columns) FROM DuplicateAddr, you can of course also do anINSERT INTO ....based on that criteria.All the rows with a
RowNum > 1are the “duplicate” or additional addresses – insert those into your second table.