I’m trying to use a custom regex clean transformation (information found here ) to extract a post code from a mixed address column (Address3) and move it to a new column (Post Code)
Example of incoming data:
Address3: "London W12 9LZ"
Incoming data could be any combination of place names with a post code at the start, middle or end (or not at all).
Desired outcome:
Address3: "London"
Post Code: "W12 9LZ"
Essentially, in plain english, "move (not copy) any post code found from address3 into Post Code".
My regex skills aren’t brilliant but I’ve managed to get as far as extracting the post code and getting it into its own column using the following regex, matching from Address3 and replacing into Post Code:
Match Expression:
(?<stringOUT>([A-PR-UWYZa-pr-uwyz]([0-9]{1,2}|([A-HK-Ya-hk-y][0-9]|[A-HK-Ya-hk-y][0-9] ([0-9]|[ABEHMNPRV-Yabehmnprv-y]))|[0-9][A-HJKS-UWa-hjks-uw])\ {0,1}[0-9][ABD-HJLNP-UW-Zabd-hjlnp-uw-z]{2}|([Gg][Ii][Rr]\ 0[Aa][Aa])|([Ss][Aa][Nn]\ {0,1}[Tt][Aa]1)|([Bb][Ff][Pp][Oo]\ {0,1}([Cc]\/[Oo]\ )?[0-9]{1,4})|(([Aa][Ss][Cc][Nn]|[Bb][Bb][Nn][Dd]|[BFSbfs][Ii][Qq][Qq]|[Pp][Cc][Rr][Nn]|[Ss][Tt][Hh][Ll]|[Tt][Dd][Cc][Uu]|[Tt][Kk][Cc][Aa])\ {0,1}1[Zz][Zz])))
Replace Expression:
${stringOUT}
So this leaves me with:
Address3: "London W12 9LZ"
Post Code: "W12 9LZ"
My next thought is to keep the above match/replace, then add another to match anything that doesn’t match the above regex. I think it might be a negative lookahead but I can’t seem to make it work.
I’m using SSIS 2008 R2 and I think the regex clean transformation uses .net regex implementation.
Thanks.
Just solved this. As usual, it was simpler logic than I thought it should be. Instead of trying to match the non-post code strings and replace them with themselves, I have added another line matching the postcode again and replacing it with “”.
So in total, I have: