My java program needs to rewrite urls in html (just in time). I am looking for the right tool and wonder if antlr is doing the job for me?
For example:
<html><body> <img src="foo.jpg" /> </body></html>
should be rewritten as:
<html><body> <img src="http://foo.com/foo.jpg" /> </body></html>
I want to read/write from/to a stream (byte by byte).
As khmarbaise said, first make sure, if regular expressions can do it. But there are cases, in which they can’t [*], and then I think, ANTLR might really be a legitimate choice.
[*] For the mathematical background on this, see http://en.wikipedia.org/wiki/Formal_grammar#The_Chomsky_hierarchy
Update
Now that you updated your question, I see what you really want to do: For modifying a complete HTML file, I’d use a parser like NekoHTML, or something similar: http://www.benmccann.com/dev-blog/java-html-parsing-library-comparison/
Then you can use these to extract the URL. Then
Do not use regular expressions to parse the entire HTML file! You could use ANTLR for that in theory, but it would be very hard to make that work reliably.