I’m trying to write a python script that modifies the contents of <script> tag

Question

0

Asked: May 23, 20262026-05-23T03:40:11+00:00 2026-05-23T03:40:11+00:00

I’m trying to write a python script that modifies the contents of <script> tag

0

I’m trying to write a python script that modifies the contents of <script> tag in files I’m parsing. I’m using lxml.html (as opposed to BeautifulSoup, etc.) for this due to its speed. The contents of script tag are surrounded in comment tags (<!– and –>):

<script>
<!--
...
-->
</script>

The problem is when I try something like scriptNode.text = '<!-- ... lxml modifies the angle brackets to their html representations (& lt; and & gt;) when I write the html back to file. I tried escaping them in the string (‘\< …’), but that doesn’t seem to help.

Looking at most modern websites, it looks like those comment tags are not needed. I can remove them, but many of the scripts also use some html within them and if those get modified to their HTML representation as well, that’s a problem.

I’m surprised that lxml is modifying this data at all, last I heard HTML parsers are designed to avoid modifying/interpreting data within <script> tags.

Is there a setting/command I can use to prevent this from happening?

Thanks

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-23T03:40:12+00:00

Editorial Team

2026-05-23T03:40:12+00:00Added an answer on May 23, 2026 at 3:40 am

Put them in a CDATA section.

0

Reply
Share
Share

- Report

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’m trying to write a python script that modifies the contents of <script> tag

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply