I am reading a text file and writing its contents inside a tag in an XML output file. The problem is that the input file contains some control characters, such as <96> or <92>, which cause my script to produce invalid XML.
How can I convert these control characters to the corresponding numeric HTML entities, so that no data is lost and the resulting file is still valid?
I have tried:
perl -p -i -e 's/\x96/\&\#150\;/g; s/\x92/\&\#146\;/g;' out_xml
But I would like to convert all control characters to HTML entities, not just these two.
HTML::Entities does what you want:
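A minimal sketch of how that might look, assuming the goal is to escape only control and non-ASCII characters and leave existing markup untouched. The helper name `escape_non_ascii` is made up for illustration. `encode_entities_numeric` is the HTML::Entities function that always emits numeric references, which matters for XML because named entities such as `&eacute;` are not predefined there. Note that it writes them in hexadecimal (`&#x96;`) rather than the decimal form (`&#150;`) used in the question.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities_numeric);

# Hypothetical helper: encode every character EXCEPT tab, newline,
# carriage return and printable ASCII (\x20-\x7e). Markup characters
# such as < > & are deliberately left alone so existing tags survive.
sub escape_non_ascii {
    my ($text) = @_;
    # The second argument is a character-class body; a leading ^ negates it.
    return encode_entities_numeric($text, '^\n\r\t\x20-\x7e');
}

print escape_non_ascii("caf\x92s \x96 <b>ok</b>\n");
# prints: caf&#x92;s &#x96; <b>ok</b>
```

The same call should fit the in-place one-liner style from the question, for example `perl -MHTML::Entities -pi -e '$_ = encode_entities_numeric($_, q(^\n\r\t\x20-\x7e))' out_xml`. If the text is escaped before it is wrapped in tags, dropping the second argument also encodes `<`, `>`, `&` and quotes. One caveat: if the input is really Windows-1252, bytes 0x96 and 0x92 are an en dash and a right single quotation mark, so decoding it first (for instance with `Encode::decode('cp1252', ...)`) would give `&#x2013;` and `&#x2019;` instead of references to control characters.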