I want to do THIS, just a little bit more complicated:
Let's say I have this HTML input:
<a href="http://www.example.com" title="Bla @test blubb">Don't break!</a>
Some Twitter Users: @codinghorror, @spolsky, @jarrod_dixon and @blam4c.
You can't reach me at blam4c@example.com.
Is there a good regex to replace the Twitter username mentions with links to Twitter, but leave @example (the email address at the bottom) and @test (in the link title, i.e. inside an HTML tag) alone?
It should probably also avoid adding links inside existing links, i.e. not break this:
<a href="http://www.example.com">Hello @someone there!</a>
My current attempt is to add “>” at the beginning of the string, then use this RegEx:
Search: '/>([^<]*\s)\@([a-z0-9_]+)([\s,.!?])/i'
Replace: '>\1<a href="http://twitter.com/\2">@\2</a>\3'
Then remove the “>” I added in step 1.
But that won’t match anything but the “@blam4c”. I know WHY it does so, that’s not the problem.
I would like a solution that finds and replaces all Twitter username mentions without destroying the HTML. Would it perhaps be better to code this without a regex?
First, keep the angle brackets out of your regexps.
Use an HTML parser and XPath to select the text nodes you are interested in processing, then consider a regexp for matching only the @refs in those nodes.
I’ll leave it to other people to give a specific answer to the regex part.
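
As a rough illustration of the parser-first idea, here is a minimal sketch in Python. It makes several assumptions that are not from the question: it uses the standard library's html.parser (a streaming parser that tracks whether we are inside an <a> element) instead of XPath, the names MentionLinker and link_mentions are made up for this example, and the MENTION pattern is only one possible guess at what counts as a username. Processing instructions and other rare constructs are not handled.

```python
import re
from html.parser import HTMLParser

# Assumed pattern: "@name" not preceded by a word character or another "@",
# so the "@example" in "blam4c@example.com" is skipped.
MENTION = re.compile(r"(?<![\w@])@([A-Za-z0-9_]+)")


class MentionLinker(HTMLParser):
    """Re-emits the input markup, linking @mentions only in text outside <a>."""

    def __init__(self):
        # Keep entities such as &amp; as-is instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.link_depth = 0  # > 0 while inside an existing <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1
        # Emit the tag exactly as written, so attribute values are untouched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Only plain text outside of links is rewritten.
        if self.link_depth == 0:
            data = MENTION.sub(r'<a href="http://twitter.com/\1">@\1</a>', data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def link_mentions(html):
    parser = MentionLinker()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Because the regex only ever sees text nodes, it needs no angle brackets at all: the title attribute and the contents of existing links never reach it. The email address is excluded by the lookbehind rather than by the parser.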