I’m building a page in ASP.NET that will use TinyMCE to provide a rich text editor. TinyMCE outputs the rich text as HTML, which I would like to save to a database. At a later date, I want to pull the HTML from the database and display it in a page.
I’m concerned about allowing malicious HTML or script tags into my database that would later be output.
Can someone walk me through at what point in this process I should HTML-encode, decode, and so on, to prevent a persistent XSS attack and/or a SQL injection attack?
We use the Microsoft Web Protection Library to strip out any potentially dangerous HTML on the way in. By “on the way in” I mean that when the page is posted to the server, we scrub the HTML with the WPL and store the result in the database. Don’t let bad data reach your database in the first place, and you’ll be safer for it. As for encoding, you don’t want to HTML-encode or decode here, since that would turn the editor’s markup into literal text: just take whatever is in your TinyMCE control, scrub it, and save it. Then on your display page, write it out exactly as it is stored in the database, into a Literal control or something similar, and you should be good.
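The scrubbing above deals with the XSS side. For the SQL injection part of the question, the usual approach is to pass the HTML as a query parameter rather than concatenating it into the SQL text. A minimal sketch, assuming SQL Server via System.Data.SqlClient; the class name, the Articles table, and its Id and BodyHtml columns are hypothetical names used only for illustration:

```csharp
using System.Data;
using System.Data.SqlClient;

public static class ArticleStore
{
    // Hypothetical helper: saves already-scrubbed HTML for one article.
    // "Articles", "Id" and "BodyHtml" are assumed names, not from the original post.
    public static void SaveBody(string connectionString, int articleId, string scrubbedHtml)
    {
        const string sql = "UPDATE Articles SET BodyHtml = @body WHERE Id = @id";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            // Values travel as parameters, so quotes or SQL fragments inside
            // the HTML are treated as data and never parsed as SQL.
            // A size of -1 maps to nvarchar(max).
            cmd.Parameters.Add("@body", SqlDbType.NVarChar, -1).Value = scrubbedHtml;
            cmd.Parameters.Add("@id", SqlDbType.Int).Value = articleId;

            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

With parameters in place, no extra escaping or encoding of the HTML is needed for the database's sake; the scrubbing step stays responsible only for what is safe to render later.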
I believe Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(input) will do exactly what you want here.
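As a rough sketch of where that call could sit, assuming the AntiXSS / Web Protection Library assemblies are referenced in the project (the EditorInput class and Scrub method are hypothetical names, not part of the library):

```csharp
using Microsoft.Security.Application; // AntiXSS / Web Protection Library

public static class EditorInput
{
    // Hypothetical wrapper: call this on the posted TinyMCE value
    // before anything is written to the database.
    public static string Scrub(string postedHtml)
    {
        if (string.IsNullOrEmpty(postedHtml))
        {
            return string.Empty;
        }

        // Removes script and other unsafe markup and returns the
        // remaining HTML as a fragment suitable for storing.
        return Sanitizer.GetSafeHtmlFragment(postedHtml);
    }
}
```

In the postback handler you would store the result of EditorInput.Scrub(...) rather than the raw posted value, and on the display page assign the stored string to a Literal control's Text property without encoding it again.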