I have a text file with ; used as the delimiter. The problem is


Asked: May 12, 2026


I have a text file with ; used as the delimiter. The problem is that it contains some HTML entities, such as &gt;, and the ; in these obviously causes problems.
The text file is large and I don't have a list of these HTML strings; there are many different examples, such as &amp;. How can I remove all of them using Python?
The file is a list of names, addresses, phone numbers and a few more fields. I am looking for the crap.html.remove(textfile) module.


1 Answer

Editorial Team · Answer 1 · 2026-05-12T18:40:19+00:00


The quickest way is probably to use the undocumented but so far stable unescape method of HTMLParser (Python 2):

import HTMLParser
s = HTMLParser.HTMLParser().unescape(s)

Note that this necessarily outputs a Unicode string, so if there are any non-ASCII bytes in the input you will need to s.decode(encoding) first. In Python 3.4 and later the documented equivalent is html.unescape(s); HTMLParser.unescape was removed in Python 3.9.
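For Python 3, here is a minimal sketch of the same idea applied to the ;-delimited file, using html.unescape from the standard library. The helper names split_fields and iter_records and the sample rows are hypothetical illustrations, not something from the question. The entities are decoded before splitting, so the ; that terminates each entity no longer looks like a delimiter.

```python
import html


def split_fields(line, delimiter=";"):
    """Decode HTML entities in one line, then split it on the delimiter."""
    # Unescape first so the ';' that terminates each entity is gone
    # before the line is split.
    cleaned = html.unescape(line.rstrip("\r\n"))
    return cleaned.split(delimiter)


def iter_records(lines, delimiter=";"):
    """Yield a list of fields for each non-empty line."""
    for line in lines:
        if line.strip():
            yield split_fields(line, delimiter)


# Hypothetical sample rows standing in for the real file.
sample = [
    "Smith &amp; Sons;12 High St;555-0100\n",
    "A &gt; B Ltd;3 Low Rd;555-0101\n",
]
records = list(iter_records(sample))
for fields in records:
    print(fields)
```

With a real file, iter_records can be given a file object opened with an explicit encoding, for example open(path, encoding="utf-8"); both the path and the encoding are assumptions to adjust for the actual data. One caveat: an entity that itself decodes to a semicolon, such as &#59;, would add an extra field to that row.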
