What regex would match the link below, which contains line breaks and “=” characters? I am currently using this regex, but it does not match the whole link:
((https?://)?([-\w]+\.[-\w\.]+)+\w(:\d+)?(/([-\w/_\.\,]*(\?\S+)?)?)*)
This is the example link:
http://www.linkedin.com/e/-eiijvz-h8zq2onn-2/VHWTzmPYQo40LPs2VhS6b_Nyx0MiE=
3in240VQyyWqfjjL007hj1UF1JEF-nYdDR/blk/I319184359351_65/0UcDpKqiRzolZKqiRybmR=
SrCBvrmRLoORIrmkZt5YCpnlOt3RApnhMpmdzgmasdhxrSNBszYRdBYNdjcVe34Vcjd9bSRjjS5dh=
CAQbPoUdzATdjsScPALrCBxbOYWrSlI/eml-comm_invm-b-in_ac-inv28/?hs=3Dfalse&to=
k=3D2PRdy1KvKbNls1
I had the same problem – spammers trying to obfuscate their URLs by breaking them up multiple times with ‘=\n’.
Try this regex – it seems to work pretty well. It only matches URLs that are broken TWO OR MORE times, since it’s unlikely a valid URL will be broken up in this manner more than once.
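For illustration, here is a minimal Python sketch of a pattern of that kind. It is not the regex from the original answer, only one way to express "a URL interrupted two or more times by `=` plus a line break". The names `CHUNK`, `BREAK`, `BROKEN_URL` and `find_broken_urls` are made up for this example, and the set of characters allowed in a URL chunk is an assumption.

```python
import re

# Hypothetical pattern, not the original answer's regex.
# A chunk is a run of URL characters; an "=" only counts as part of the
# chunk when it is NOT immediately followed by a line break.
CHUNK = r"(?:[^\s<>=]|=(?!\r?\n))+"
# A break is "=" immediately followed by a line break.
BREAK = r"=\r?\n"
# Require at least two breaks, so a URL broken only once is ignored.
BROKEN_URL = re.compile(rf"https?://{CHUNK}(?:{BREAK}{CHUNK}){{2,}}")


def find_broken_urls(text):
    """Return each multiply-broken URL with its "=" + newline breaks removed."""
    return [re.sub(BREAK, "", m.group(0)) for m in BROKEN_URL.finditer(text)]


# The example link from the question.
sample = (
    "http://www.linkedin.com/e/-eiijvz-h8zq2onn-2/VHWTzmPYQo40LPs2VhS6b_Nyx0MiE=\n"
    "3in240VQyyWqfjjL007hj1UF1JEF-nYdDR/blk/I319184359351_65/0UcDpKqiRzolZKqiRybmR=\n"
    "SrCBvrmRLoORIrmkZt5YCpnlOt3RApnhMpmdzgmasdhxrSNBszYRdBYNdjcVe34Vcjd9bSRjjS5dh=\n"
    "CAQbPoUdzATdjsScPALrCBxbOYWrSlI/eml-comm_invm-b-in_ac-inv28/?hs=3Dfalse&to=\n"
    "k=3D2PRdy1KvKbNls1"
)

urls = find_broken_urls(sample)
print(urls[0])  # the whole link on one line, breaks removed
```

The `=3D` sequences are left as they are. They look like quoted-printable encoding of `=`, in which case decoding the message body first (for example with Python's `quopri` module) may be simpler than matching the breaks with a regex.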
And if they’re putting breaks within the “http” part itself, there’s this (a little hacky):
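Again, the original snippet is not reproduced here. As a rough sketch of the idea in Python, one hacky option is to allow an optional `=` plus line break between every character of the scheme. The names `SOFT`, `SCHEME`, `URL_WITH_BROKEN_SCHEME` and `find_urls_with_broken_scheme` are hypothetical, and the URL character set is an assumption.

```python
import re

# Hypothetical pattern, not the original answer's regex.
SOFT = r"(?:=\r?\n)?"                  # optional "=" + line break
BREAK = r"=\r?\n"                      # mandatory "=" + line break
CHUNK = r"(?:[^\s<>=]|=(?!\r?\n))+"    # URL characters, stopping before a break

# "http://" or "https://" with an optional break allowed between any two
# characters of the scheme.
SCHEME = SOFT.join(["h", "t", "t", "p", "s?", ":", "/", "/"])

# Scheme, then the URL body, which may itself contain any number of breaks.
URL_WITH_BROKEN_SCHEME = re.compile(
    SCHEME + SOFT + CHUNK + rf"(?:{BREAK}{CHUNK})*"
)


def find_urls_with_broken_scheme(text):
    """Return matched URLs with every "=" + newline break removed."""
    return [
        re.sub(BREAK, "", m.group(0))
        for m in URL_WITH_BROKEN_SCHEME.finditer(text)
    ]


print(find_urls_with_broken_scheme("ht=\ntp://exam=\nple.com/x"))
# ['http://example.com/x']
```

Note that this version also matches ordinary, unbroken URLs, because it no longer requires a minimum number of breaks.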