I need to find short URLs in the text of a post in Java. I have the following regular expression:
"(http://(bit\.ly|t\.co|lnkd\.in|tcrn\.ch).*?)\s"
I have two questions:
- The problem with the above expression is that it doesn't match the short URL if it is at the end of the line. For example, the text "blah http://linkd.in/R9Msf3 blah" gives "http://linkd.in/R9Msf3 ", but "blah blah http://linkd.in/R9Msf3" does not give "http://linkd.in/R9Msf3". Any suggestions on how to match both patterns? Basically, I just need to replace the short URLs in the text.
- Also, is there a better way to cover all the short URL formats? If I hard-code them, then every time I would have to add a new format to the config.
Instead of .* use \S* to avoid matching whitespace. You don't need the ? and you can use \b instead of \s to match the boundary between the end of the URL and whitespace or the end of the string.
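As a minimal sketch of that suggestion in Java, the pattern below keeps the host list from the question and swaps .*?\s for \S*\b. The class name ShortUrlDemo and the helper methods findShortUrls and replaceShortUrls are made up for illustration. The sample text spells the host lnkd.in, as in the regex; the linkd.in spelling used in the question's examples would not match this host list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ShortUrlDemo {
    // Host list copied from the question; \S*\b replaces .*?\s as suggested above.
    // Backslashes are doubled because this is a Java string literal.
    private static final Pattern SHORT_URL =
            Pattern.compile("http://(bit\\.ly|t\\.co|lnkd\\.in|tcrn\\.ch)\\S*\\b");

    // Hypothetical helper: collects every short URL found in the text.
    public static List<String> findShortUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = SHORT_URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    // Hypothetical helper: replaces every short URL with the given literal text.
    public static String replaceShortUrls(String text, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement from being
        // interpreted as group references.
        return SHORT_URL.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // URL in the middle of the line
        System.out.println(findShortUrls("blah http://lnkd.in/R9Msf3 blah"));
        // URL at the end of the line
        System.out.println(findShortUrls("blah blah http://lnkd.in/R9Msf3"));
        // Removing the URL leaves the surrounding whitespace untouched
        System.out.println(replaceShortUrls("blah http://lnkd.in/R9Msf3 blah", ""));
    }
}
```

Because the match no longer consumes the trailing whitespace, replacing a URL with an empty string leaves the spaces on both sides in place. Note also that the host alternatives are not anchored, so t\.co followed by \S* would also match a URL such as http://t.com/...; appending a / after the host group is one way to tighten that if it matters.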