I am not good at advanced Excel formulas and need help with data cleanup in Excel for a migration. This is my scenario:
Column A                                               Column B
DocName                                                EmployeeName
ACCT6789_John_Smith_ACCOUNT_DOC_25_JAN_2007            John_Smith
ACCT1122_Jane_Doe_ACCOUNT_DOC_EID00022_21_DEC_2009     Jane_Doe
ACCT1462_Phil_Morris_ACCOUNT_DOC_EID0252               Phil_Moris
For each row, I need to check whether the string in Column A contains an exact match for the value in Column B. If it does, the matched string should be deleted from Column A. If there is no exact match, nothing should happen and the check should move on to the next row. I know that deleting the name will leave a double underscore, and I want only one underscore between any two words.
The desired results for the example above are:
ACCT6789_ACCOUNT_DOC_25_JAN_2007                       John_Smith
ACCT1122_ACCOUNT_DOC_EID00022_21_DEC_2009              Jane_Doe
ACCT1462_Phil_Morris_ACCOUNT_DOC_EID0252               Phil_Moris
Using just formulas in the spreadsheet, you can do the following:
Assume that column C is available (if not, find an empty column). In row 1 of that column (if you have headers in row 1, use the first data row and adjust the 1 in the cell references to that row number), enter a formula that finds an occurrence of B1 ("John_Smith") in A1 ("ACCT6789_John_Smith_ACCOUNT_DOC_25_JAN_2007"). In the next column over, enter a second formula that builds the cleaned-up string.
This takes “the bit of A1 before the string we found” plus “the bit after the thing we found”.
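The formulas themselves are not shown above, so here is a minimal sketch of what the two cells could contain, assuming the data starts in row 1 and that Excel's FIND, LEFT, MID, LEN and IFERROR functions are used. The IFERROR wrapper is an addition for the no-match case: FIND returns #VALUE! when B1 does not occur in A1, and the wrapper returns A1 unchanged instead of an error.

```excel
C1:  =FIND(B1, A1)
D1:  =IFERROR(LEFT(A1, C1-1) & MID(A1, C1+LEN(B1)+1, LEN(A1)), A1)
```

For the first example row, C1 evaluates to 10 (the position where "John_Smith" starts), LEFT(A1, 9) gives "ACCT6789_", and MID picks up from position 21, just past the underscore that follows the name, so D1 becomes "ACCT6789_ACCOUNT_DOC_25_JAN_2007". For the "Phil_Moris" row there is no match, so D1 keeps the original string. Note that FIND is case-sensitive and matches substrings, so a name like "John_Smith" would also be found inside "John_Smithson".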
Now copy this formula all the way down (select both C1 and D1, then double clicking the little box in the bottom right hand corner is a shortcut to dragging all the way down…). Column D now has the results you want.
Finally, to copy them back to column A, select all of column D, hit copy, then do a Paste Special > Values into column A. Then delete columns C and D.
By the way, this method assumes there are underscores on either side of the name you found, and will actually remove the trailing one. If you can't be sure of that, you should change the formula in column D so that it removes only the name itself, and then do another round to remove double underscores and replace them with single ones. It sounds to me like you don't need that, though.
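As a hedged sketch of that variant (again assuming row 1, with column E as a hypothetical extra helper column), the column D formula drops the "+1" so only the name is removed, and a SUBSTITUTE pass collapses the resulting double underscore:

```excel
D1:  =IFERROR(LEFT(A1, C1-1) & MID(A1, C1+LEN(B1), LEN(A1)), A1)
E1:  =SUBSTITUTE(D1, "__", "_")
```

A different, shorter approach that skips the FIND column altogether is a nested SUBSTITUTE. This is not the method described above: it removes every occurrence of the name rather than just the first, and it leaves A1 untouched when B1 is not found.

```excel
C1:  =SUBSTITUTE(SUBSTITUTE(A1, B1, ""), "__", "_")
```

Either way, the result still has to be pasted back over column A as values before the helper columns are deleted.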