I'm using lxml.html.clean to clean HTML from an input text. How can I change \n to <br /> in lxml.html?
Fairly easy, slightly hacky way: you could do this as a two-step process, assuming you have used lxml.html.parse or whichever method to build the DOM.
1. Walk the tree and insert <br /> elements wherever a node's text or tail contains \n. See the .iterdescendants method, which walks through everything for you.
2. Use lxml.html.clean as per normal.
A more complex way would be to monkey patch the lxml.html.clean module. Unlike lots of lxml, this module is written in Python and is fairly accessible. For example, there is currently a _substitute_whitespace function.