Problem:
I have about 50,000 rows in Excel. Each row contains the text domain=[a-Z0-9],
where [a-Z0-9] is a placeholder for a string of letters and numbers, like a GUID. Let’s call this domain ID abc123. It should be unique, but in the 50,000 rows it is not a unique key for the table, so I need to make it unique by removing all the other rows where domain ID = abc123. I have to do this for every domain, so I can’t hard-code a specific ID; I need a script to figure this out. The domain ID is always in the same column, and there are many different domain IDs that repeat.
Sample
column 2
abunchofstuff3123123khafadkfh23k4h23kh*DomainID=abc123*
Pseudo Code
//Whenever there is a value for domain in row i col 2
//does it already exist in ListOfUniqueDomains?
//if so then remove this row
//else add to the ListOfUniqueDomains
How would one do this with Excel/VBA?
UPDATED ANSWER
So I really liked the idea of using Pivot Tables but I still had to extract the domain ID so I thought I’d post the solution to that portion here. I actually stole the function from some other website while googling but I lost the original post to give proper credit. So forgive me if that person is you but give yourself a pat on the back and I’ll buy you lunch if you’re in my neighborhood (easy everyone).
So in my case I had 2 delimiters (= and &) for the string domain=abc123&, which is embedded in a longer string. To extract the domain ID I did the following.
Public Function extract_value(str As String) As String
    Dim openPos As Long
    Dim closePos As Long

    ' Position of the first equal sign (InStr returns 0 when it is not found)
    openPos = InStr(str, "=")
    If openPos = 0 Then
        extract_value = ""
        Exit Function
    End If

    ' Position of the first '&' after that equal sign,
    ' because that was how my string was designed
    closePos = InStr(openPos + 1, str, "&")
    If closePos = 0 Then closePos = Len(str) + 1 ' no '&': take the rest of the string

    ' Mid's third argument is a length, not an end position,
    ' so the text between '=' and '&' is closePos - openPos - 1 characters long
    extract_value = Mid(str, openPos + 1, closePos - openPos - 1)
End Function
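If the longer string holds other key=value pairs before the domain, the first "=" may not be the one you want. Here is a minimal sketch of a variant that searches for a named key instead. The name extract_param and its key argument are made up for illustration; they are not from the original function, and it assumes values end at the next "&" or at the end of the string.

```vba
' Hypothetical helper: returns the value that follows key & "=" in str,
' up to the next "&" (or to the end of the string if there is none).
Public Function extract_param(str As String, key As String) As String
    Dim startPos As Long
    Dim endPos As Long

    ' vbTextCompare makes the search case-insensitive ("domain=" matches "Domain=")
    startPos = InStr(1, str, key & "=", vbTextCompare)
    If startPos = 0 Then Exit Function      ' key not present: returns ""

    startPos = startPos + Len(key) + 1      ' first character after the "="
    endPos = InStr(startPos, str, "&")
    If endPos = 0 Then endPos = Len(str) + 1

    extract_param = Mid(str, startPos, endPos - startPos)
End Function
```

Placed in a standard module, it can be called from a helper column with something like =extract_param(B2, "domain"). One caveat: it is a plain substring search, so the key "domain" would also match inside a longer name such as "subdomain=".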
Is VBA even a good idea for something like this?
Thanks
I’ve done this for a fairly large file (50k rows) where I needed to extract only unique elements. What I’ve done is quite simple: use a pivot table. This way you don’t even need VBA, but if you want to process it further it’s still very simple to update the table and extract data.
One of the reasons I really love this method is that it is extremely easy and powerful at the same time. You have no looping or algorithm to write; it’s all right there in the Excel features.
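If you would rather stay in VBA, the pseudocode from the question can be sketched roughly as below. This is only a sketch under some assumptions: the data is on the active sheet in column 2 starting at row 1, the extract_value function from the question is available in the same project, and Scripting.Dictionary can be created (it ships with Windows, not with Excel for Mac). The name RemoveDuplicateDomains is made up for illustration.

```vba
' Sketch: keep the first row for each domain ID and delete later repeats.
' Assumes column 2 of the active sheet and the question's extract_value function.
Public Sub RemoveDuplicateDomains()
    Dim seen As Object          ' plays the role of ListOfUniqueDomains
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long
    Dim domainId As String
    Dim doomed As Range         ' rows collected for deletion

    Set seen = CreateObject("Scripting.Dictionary")
    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, 2).End(xlUp).Row

    For i = 1 To lastRow
        domainId = extract_value(CStr(ws.Cells(i, 2).Value))
        If Len(domainId) > 0 Then
            If seen.Exists(domainId) Then
                ' Already seen: mark the row, delete all marked rows at the end
                If doomed Is Nothing Then
                    Set doomed = ws.Rows(i)
                Else
                    Set doomed = Union(doomed, ws.Rows(i))
                End If
            Else
                seen.Add domainId, True
            End If
        End If
    Next i

    If Not doomed Is Nothing Then doomed.Delete
End Sub
```

Rows are collected first and deleted once at the end so that the row numbers do not shift while the loop is running. Try it on a copy of the workbook first, since the deletion cannot be undone with Ctrl+Z after a macro has run. With very many scattered duplicates, building the Union can get slow, and the pivot table is likely the easier route.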