Is there a way to search for fields that contain similar values in a SQL database?
For example, I have a table of over a million records where one column contains URL values and is associated with a country column. Previously I matched rows whose URLs were exactly equal, where one of them had a null value for the country, and was able to update it using the following:
UPDATE t1
SET t1.country = t2.country
FROM Sources AS t1
JOIN Sources AS t2
ON t1.url = t2.url;
Then I altered the query to use the LIKE keyword as follows:
UPDATE t1
SET t1.country = t2.country
FROM Sources AS t1
JOIN Sources AS t2
ON t1.url = t2.url
WHERE t1.url like t2.url;
When I run just a SELECT statement to find the records where the URLs are alike, I get results, but the UPDATE does not work.
A better example is as follows:
These URLs all belong to the same domain, and I just want to update the country column for each one to avoid doing it manually, because there are around 200,000 to do.
How about:
See what kind of joins you get when you run that on your dataset; it may make too many bad matches.
At some point you'll probably need to do some matching based on exact portions of the URL, but I don't know how to do that in a query like this. See this link for info:
http://www.w3schools.com/SQL/sql_wildcards.asp
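As a rough sketch of how wildcards could be used here: a LIKE pattern with no wildcard generally matches the same rows as =, so the pattern needs a '%' added to it. The example below assumes SQL Server syntax, that url is a varchar or nvarchar column, and that a "similar" URL is one that starts with a URL that already has a country. Those are my assumptions, not something stated in the question. Run the SELECT first to inspect the matches before running the UPDATE.

```sql
-- Preview: rows with no country whose url starts with a url that has one.
SELECT t1.url, t2.url AS matched_url, t2.country
FROM Sources AS t1
JOIN Sources AS t2
  ON t1.url LIKE t2.url + '%'      -- assumption: prefix match counts as "similar"
WHERE t1.country IS NULL
  AND t2.country IS NOT NULL;

-- Same join, applied as an update. Only rows missing a country are touched.
UPDATE t1
SET t1.country = t2.country
FROM Sources AS t1
JOIN Sources AS t2
  ON t1.url LIKE t2.url + '%'
WHERE t1.country IS NULL
  AND t2.country IS NOT NULL;
```

Two caveats: characters such as _ and [ inside t2.url are themselves treated as wildcards by LIKE, which can produce unexpected matches, and if one row matches several URLs with different countries, which country gets written is not predictable.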
Oh, and if all the URLs contain the http://www. portion, you could always do something like
That might cut down on your execution time and enforce better joins.
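One possible way to match on an exact portion of the URL, sketched under the assumptions that this is SQL Server and that every URL of interest starts with the literal prefix http://www. (11 characters): pull out the host name between that prefix and the next slash, then join on the host. The temp table name #known and the column alias host are made up for this illustration.

```sql
-- Step 1: one country per host, taken from rows that already have a country.
-- Hosts that appear with more than one country are skipped.
SELECT h.host, MIN(s.country) AS country
INTO #known
FROM Sources AS s
CROSS APPLY (SELECT SUBSTRING(s.url, 12, LEN(s.url)) AS rest) AS r   -- text after 'http://www.'
CROSS APPLY (SELECT LEFT(r.rest, CHARINDEX('/', r.rest + '/') - 1) AS host) AS h
WHERE s.url LIKE 'http://www.%'
  AND s.country IS NOT NULL
GROUP BY h.host
HAVING MIN(s.country) = MAX(s.country);

-- Step 2: fill in the missing countries by exact host match.
UPDATE t1
SET t1.country = k.country
FROM Sources AS t1
CROSS APPLY (SELECT SUBSTRING(t1.url, 12, LEN(t1.url)) AS rest) AS r
CROSS APPLY (SELECT LEFT(r.rest, CHARINDEX('/', r.rest + '/') - 1) AS host) AS h
JOIN #known AS k
  ON k.host = h.host
WHERE t1.country IS NULL
  AND t1.url LIKE 'http://www.%';
```

Joining on an extracted host with = avoids the loose matches a '%' pattern can produce, at the cost of ignoring any URL that does not start with the assumed prefix.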