I have a string that is HTML encoded:
'''&lt;img class=&quot;size-medium wp-image-113&quot;\
 style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot;\
 src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot;\
 alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'''
I want to change that to:
<img class='size-medium wp-image-113' style='margin-left: 15px;' title='su1' src='http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg' alt='' width='300' height='194' />
I want this to register as HTML so that it is rendered as an image by the browser instead of being displayed as text.
The string is stored like that because I am using a web-scraping tool called BeautifulSoup, which ‘scans’ a web page, gets certain content from it, and returns the string in that format.
I’ve found how to do this in C# but not in Python. Can someone help me out?
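Given the Django use case, there are two answers to this. For reference, Django’s django.utils.html.escape function encodes ampersands, angle brackets, double quotes, and single quotes as HTML entities.
To reverse this, the Cheetah function described in Jake’s answer should work, but it is missing the single-quote. The fix is an updated tuple that includes it, with the order of replacement reversed so that &amp; is decoded last, which avoids symmetric problems (otherwise &amp;lt; would be decoded twice, all the way to <).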
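As a rough sketch of the escape/unescape pair discussed above: the names html_escape, html_decode, and HTML_CODES are illustrative and not Django's own API, and the single quote is assumed to be encoded as &#39; (other escapers, including Python's html.escape, emit &#x27; instead, which this sketch would not decode).

```python
# Illustrative helpers only; these names are not part of Django or Cheetah.
# Pairs are listed in decoding order: '&amp;' comes last so that text such as
# '&amp;lt;' is decoded exactly once (to '&lt;') instead of twice (to '<').
HTML_CODES = (
    ("'", "&#39;"),
    ('"', "&quot;"),
    (">", "&gt;"),
    ("<", "&lt;"),
    ("&", "&amp;"),
)


def html_escape(s):
    """Encode the five special characters, handling '&' first."""
    for char, entity in reversed(HTML_CODES):
        s = s.replace(char, entity)
    return s


def html_decode(s):
    """Reverse html_escape; does not remove ordinary tags such as <p>."""
    for char, entity in HTML_CODES:
        s = s.replace(entity, char)
    return s


if __name__ == "__main__":
    print(html_decode("&lt;img alt=&quot;&quot; width=&quot;300&quot; /&gt;"))
```

Escaping walks the tuple in the opposite order from decoding, which is what keeps the two functions inverses of each other.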
This, however, is not a general solution; it is only appropriate for strings encoded with django.utils.html.escape. More generally, it is a good idea to stick with the standard library, which handles the full set of named and numeric entities (html.unescape in Python 3.4 and later).
As a suggestion: it may make more sense to store the HTML unescaped in your database. It’d be worth looking into getting unescaped results back from BeautifulSoup if possible, and avoiding this process altogether.
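For the standard-library route mentioned above, here is a minimal sketch. It assumes Python 3.4 or later, where html.unescape is available, and the sample string is a shortened version of the one in the question.

```python
from html import unescape  # standard library, Python 3.4+

# Shortened, entity-encoded sample based on the string in the question.
encoded = (
    "&lt;img class=&quot;size-medium wp-image-113&quot; "
    "alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;"
)

decoded = unescape(encoded)
print(decoded)
```

Unlike a hand-written replacement tuple, this also covers numeric references such as &#39; and &#x27; and named entities such as &eacute;.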
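With Django, escaping only occurs during template rendering, so to prevent escaping you just tell the templating engine not to escape your string. To do that, use one of these options in your template: the safe filter, as in {{ context_var|safe }}, or an autoescape block, as in {% autoescape off %}{{ context_var }}{% endautoescape %} (here context_var stands for whatever variable holds your string).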