Is there a way I can do something like the following using the standard Linux toolchain?
Let’s say the source at example.com/index.php is:
Hello, &amp; world! &quot;
How can I do something like this…
curl -s http://example.com/index.php | htmlentities
…that would print the following:
Hello, & world! "
Using only the standard Linux toolchain?
Use recode.
EDIT: By the way, recode offers several different conversions corresponding to different versions of HTML and XML, so you can use e.g. HTML_3.2 instead of HTML_4.0 if you have a really old HTML document. Running recode -l will print the complete list of charsets supported by the program.
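For illustration, here is a rough sketch of how this could be wired into a pipeline, assuming the goal is to turn entities such as &amp;amp; back into plain characters. The function name decode_entities is made up for this example, and the python3 fallback is my own assumption for machines where recode is not installed; it is not part of the recode suggestion itself.

```shell
#!/bin/sh
# decode_entities: read HTML text on stdin, write it to stdout with
# character entities decoded. The name is hypothetical, chosen for this sketch.
decode_entities() {
    if command -v recode >/dev/null 2>&1; then
        # HTML_4.0 and UTF-8 are charset names reported by `recode -l`.
        recode HTML_4.0..UTF-8
    else
        # Assumed fallback (not from the answer): Python's html.unescape.
        python3 -c 'import html, sys; sys.stdout.write(html.unescape(sys.stdin.read()))'
    fi
}

# Feed a sample line through the filter.
printf '%s\n' 'Hello, &amp; world! &quot;' | decode_entities
```

With the function defined, the pipeline from the question would look like curl -s http://example.com/index.php | decode_entities. For a very old document, swapping HTML_4.0 for HTML_3.2 in the recode call should be the only change needed.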