I’m trying to write some Perl to convert some HTML-based text over to MediaWiki format and hit the following problem: I want to search and replace within a delimited subsection of some text and wondered if anyone knew of a neat way to do it. My input stream is something like:
Please mail <a href="mailto:help@myco.com&Subject=Please help&Body=Please can some one help me out here">support.</a> if you want some help.
and I want to change "Please help" and "Please can some one help me out here" to "Please%20help" and "Please%20can%20some%20one%20help%20me%20out%20here" respectively, without changing any of the other spaces on the line.
Naturally, I also need to be able to cope with more than one such link on a line so splicing isn’t such a good option.
I’ve taken a good look round Perl tutorial sites (it’s not my first language) but didn’t come across anything like this as an example. Can anyone advise an elegant way of doing this?
Your task has two parts. The first is to find and replace the mailto URIs: use an HTML parsing module for that. This topic is covered thoroughly on Stack Overflow.
The other part is to canonicalise the URI. The module URI is suitable for this purpose.
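For simple, well-formed input like the sample line, the same two steps (locate each mailto href value, then encode the spaces inside it) can be sketched in core Perl. This is only an illustration: it swaps a regex in for a real HTML parser and a plain space substitution in for the URI module, and encode_mailto_spaces is a made-up helper name. It assumes double-quoted href attributes.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode spaces inside mailto href values only.
# Assumes href attributes are double-quoted; not a substitute for an HTML parser.
sub encode_mailto_spaces {
    my ($html) = @_;

    # /g handles several links on one line; /e runs the replacement as code,
    # so a second substitution can be applied to just the captured URI.
    $html =~ s{(href="mailto:)([^"]*)(")}{
        my ($open, $uri, $close) = ($1, $2, $3);   # copy before the inner s/// resets them
        $uri =~ s/ /%20/g;                         # only spaces inside the URI change
        $open . $uri . $close;
    }ge;

    return $html;
}

my $line = 'Please mail <a href="mailto:help@myco.com&Subject=Please help&Body=Please can some one help me out here">support.</a> if you want some help.';
print encode_mailto_spaces($line), "\n";
```

Spaces outside the href value are left alone because the inner substitution only ever sees the captured URI text. For anything beyond this simple case (single quotes, other characters that need escaping, malformed markup), the parser-plus-URI approach described above is the safer route.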