I would just like to strip the HTML tags from some text files and then save each result back to the same file.
The text files contain HTML tags.
import shutil
import os
import nltk
low_firm=['C:/a1.txt','C:/a2.txt']
for aa in low_firm:
    f= open (aa,'r+')
    for bb in f:
        raw = nltk.clean_html(bb)
        raw2=str(raw)
        f.write(low_firm)
But it doesn’t work! I get this message:
IOError: [Errno 0] Error
I would open the file for reading, read all of its content into a list of lines, close the file, and then reopen it for writing.
This is because I find it easier to rewrite the entire file when it contains text (instead of records or other binary data). For text files this is almost never too slow, since they are usually not as big as, say, database files. It may not be the best solution for you, but I would recommend trying it anyway.
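The read, close, then reopen-and-rewrite approach described above could be sketched roughly as follows. This is only a sketch under some assumptions: it is written for Python 3, and it swaps in a small tag stripper built on the standard library's `html.parser` in place of `nltk.clean_html`, which as far as I know is no longer implemented in NLTK 3. The names `TagStripper`, `strip_tags` and `clean_file_in_place` are made up for illustration, and the UTF-8 encoding is a guess about your files. Because the file is fully read and closed before it is opened again for writing, reads and writes are never mixed on a single `'r+'` handle.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical helper: keeps only the text found between tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text outside of tags; the tags themselves are dropped.
        self.parts.append(data)


def strip_tags(text):
    """Return text with HTML tags removed (stand-in for nltk.clean_html)."""
    stripper = TagStripper()
    stripper.feed(text)
    stripper.close()
    return "".join(stripper.parts)


def clean_file_in_place(path, encoding="utf-8"):
    """Read the whole file, strip the tags, then overwrite the same file."""
    # Step 1: open for reading, read every line into a list, close the file.
    with open(path, "r", encoding=encoding) as f:
        lines = f.readlines()

    # Step 2: clean the content in memory. The lines are joined first so
    # that a tag split across two lines is still recognised as a tag.
    cleaned = strip_tags("".join(lines))

    # Step 3: reopen the same path for writing (this truncates the file)
    # and write the cleaned text back.
    with open(path, "w", encoding=encoding) as f:
        f.write(cleaned)
```

With the file list from the question, this would be called as `for aa in low_firm: clean_file_in_place(aa)`. Note that this simple stripper also keeps the text inside `<script>` and `<style>` elements, so real web pages may need extra handling.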