I have a poorly maintained database that includes employee information. Human Resources requested a report that lists instances where the employee name associated with an insurance coverage does not match the name on the insurance policy.
There are inconsistencies in the formatting of the names in both tables. It’s always last name then first name, but you might see any of the following in either table for a fictional employee named Steven J. Smith:
- Smith, Steven
- Smith,Steven
- Smith, Steven J.
- Smith,Steven J.
I need a query that finds rows where EMPLOYEE.EMP_NAME <> INSURANCE.SUBSCRIBER_NAME while allowing for the formatting differences shown above (i.e., recognizing that “Smith,Steven J.” and “Smith, Steven” are probably the same person and ignoring them).
SELECT
    EMPLOYEE.EMP_NO
  , EMPLOYEE.EMP_NAME
  , INSURANCE.SUBSCRIBER_NAME
  , INSURANCE.PAYOR_NAME
FROM EMPLOYEE
INNER JOIN INSURANCE ON EMPLOYEE.EMP_NO = INSURANCE.EMP_NO
WHERE EMPLOYEE.EMP_NAME <> INSURANCE.SUBSCRIBER_NAME
I know I want to use a substring to drop the middle initial, but how do I ignore whether or not there is a space after the comma?
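For illustration, here is a minimal sketch of one possible way to handle both problems at once. It is not a definitive answer: it assumes SQLite, driven from Python only so that it can be run as-is, and it assumes every name contains a comma. The helpers `name_key_sql`, `find_mismatches`, and `make_demo_db`, as well as the sample rows, are hypothetical. The idea is to split each name at the comma, trim the space that may follow it, cut the remainder at the first space to drop a middle initial, and compare the resulting [lastname][firstname] keys.

```python
import sqlite3


def name_key_sql(col):
    """Build a SQLite expression that reduces 'Last, First M.' to 'LastFirst'.

    Assumes the value in `col` always contains a comma.
    """
    # Everything before the comma is the last name.
    last = f"TRIM(SUBSTR({col}, 1, INSTR({col}, ',') - 1))"
    # Everything after the comma, minus the optional leading space.
    rest = f"LTRIM(SUBSTR({col}, INSTR({col}, ',') + 1))"
    # Keep only the text before the first space, which drops a middle initial.
    first = (
        f"CASE WHEN INSTR({rest}, ' ') > 0 "
        f"THEN SUBSTR({rest}, 1, INSTR({rest}, ' ') - 1) ELSE {rest} END"
    )
    return f"{last} || {first}"


MISMATCH_SQL = f"""
SELECT
    EMPLOYEE.EMP_NO
  , EMPLOYEE.EMP_NAME
  , INSURANCE.SUBSCRIBER_NAME
  , INSURANCE.PAYOR_NAME
FROM EMPLOYEE
INNER JOIN INSURANCE ON EMPLOYEE.EMP_NO = INSURANCE.EMP_NO
WHERE {name_key_sql('EMPLOYEE.EMP_NAME')}
   <> {name_key_sql('INSURANCE.SUBSCRIBER_NAME')}
ORDER BY EMPLOYEE.EMP_NO
"""


def find_mismatches(conn):
    """Return rows whose normalized employee and subscriber names differ."""
    return conn.execute(MISMATCH_SQL).fetchall()


def make_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMPLOYEE (EMP_NO INTEGER, EMP_NAME TEXT)")
    conn.execute(
        "CREATE TABLE INSURANCE "
        "(EMP_NO INTEGER, SUBSCRIBER_NAME TEXT, PAYOR_NAME TEXT)"
    )
    conn.executemany(
        "INSERT INTO EMPLOYEE VALUES (?, ?)",
        [(1, "Smith, Steven"), (2, "Jones,Mary A."), (3, "Brown, Alice")],
    )
    conn.executemany(
        "INSERT INTO INSURANCE VALUES (?, ?, ?)",
        [
            (1, "Smith,Steven J.", "Acme Health"),
            (2, "Jones, Mary", "Acme Health"),
            (3, "Brown, Robert", "Acme Health"),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in find_mismatches(make_demo_db()):
        print(row)
```

With the sample rows, only employee 3 should be reported, because the two Smith and Jones spellings reduce to the same key. INSTR, SUBSTR, and LTRIM are SQLite's names; other databases spell these differently (SQL Server, for example, uses CHARINDEX and SUBSTRING), so the expression would need translating for the actual DBMS.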
Thanks, your answers helped a lot. I ended up cutting each name down to [lastname][firstname] with no spaces, dropping the middle initial if it was there. Here’s what eventually eliminated the vast majority of the same-name matches: