I’m trying to escape a semicolon with the saxutils.escape method.
saxutils.escape('<;', {';': '&#59;'})
I expect it to produce
'&lt;&#59;'
But it gives
'&lt&#59;&#59;'
Is this by design? And how can I get my expected result?
Your problem is that saxutils.escape works in two steps. First, it replaces &, > and < with their entities; then it applies your entities dictionary to the result of that first replacement.

So once < has been replaced by &lt;, you’ve got &lt;; which contains two semicolons. Both are then replaced by your entity, including the one that terminates &lt;, so you end up with &lt followed by two escaped semicolons.

Basically, what it’s doing makes sense. If you need to escape semicolons, it’s not because of HTML reasons, so it must be to double-escape them. In this situation, it makes sense to escape the semicolons created by HTML-required escaping.
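The two steps can be reproduced by hand to see where the extra semicolon comes from. This is only a sketch of the behaviour described above, not the library source: two_step_escape is a hypothetical helper, and '&#59;' (the numeric character reference for a semicolon) is assumed as the replacement value.

```python
from xml.sax.saxutils import escape


def two_step_escape(data, entities):
    """Hypothetical re-creation of the two-step behaviour described above."""
    # Step 1: the built-in replacements, ampersand first.
    data = data.replace("&", "&amp;")
    data = data.replace(">", "&gt;")
    data = data.replace("<", "&lt;")
    # Step 2: the caller's mapping is applied to the OUTPUT of step 1,
    # so the semicolon that terminates "&lt;" is replaced as well.
    for key, value in entities.items():
        data = data.replace(key, value)
    return data


custom = {";": "&#59;"}  # assumed replacement for the semicolon
result = two_step_escape("<;", custom)
print(result)                         # &lt&#59;&#59;
print(result == escape("<;", custom)) # True
```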
You can’t get your desired result with saxutils.escape. You need to use another method of escaping. See the Python Wiki page on escaping HTML for some ideas.

You can also use something like what is in my answer to "What is the best way to do a find and replace of multiple queries on multiple files?" to replace semicolons simultaneously with the other patterns, so you don’t double-substitute anything.
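One way to do the simultaneous replacement is a single regular-expression pass in which every character is looked up once in a combined table. This is a minimal sketch and not necessarily the code from the linked answer: escape_single_pass is a hypothetical name, and '&#59;' is again an assumed replacement for the semicolon.

```python
import re


def escape_single_pass(text, extra=None):
    """Escape &, < and > plus any extra keys in one pass over the input.

    Each match is replaced exactly once, and the replacement text is never
    rescanned, so a semicolon produced by "&lt;" is left alone.
    """
    table = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
    if extra:
        table.update(extra)
    # Longest keys first, so a multi-character key wins over its own prefix.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)


print(escape_single_pass("<;", {";": "&#59;"}))  # &lt;&#59;
```

Because the pattern only ever matches against the original input, the order of the keys in the table no longer affects the result.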