The Archive Base

Editorial Team
Asked: May 16, 2026 (2026-05-16T22:17:48+00:00)

One of the first things I learned as a web developer was to never

One of the first things I learned as a web developer was to never ever accept any HTML from the client. (Perhaps only if I HTML encode it.)
I use a WYSIWYG editor (TinyMCE) that outputs HTML. So far I have only used it on an admin page, but now I’d like to also use it on a forum. It has a BBCode module, but that seems to be incomplete. (It is possible that BBCode itself doesn’t support everything I want it to.)

So, here’s my idea:

I allow the client to POST HTML directly. Then I check the code for sanity (well-formedness) and remove all tags, attributes, and CSS rules that are not in a predefined set of allowed tags and styles.
Obviously, I would allow whatever can be output by the subset of TinyMCE functionality I use.

I would allow the following tags:
span, sub, sup, a, p, ul, ol, li, img, strong, em, br

With the following attributes:
style (for everything), href and title (for a), alt and src (for img)
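
A minimal sketch of how the tag and attribute lists above could be held as a lookup table. The class and method names (AllowList, isAllowedTag, isAllowedAttribute) are hypothetical and only illustrate the idea; a real sanitizer would apply these checks while walking a parsed document tree.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the class and method names are illustrative only.
final class AllowList {
    // Attribute permitted on every allowed tag.
    private static final Set<String> GLOBAL_ATTRIBUTES = Set.of("style");

    // Allowed tag -> attributes permitted on that tag only.
    private static final Map<String, Set<String>> TAGS = new HashMap<>();
    static {
        for (String tag : List.of("span", "sub", "sup", "p", "ul", "ol", "li",
                                  "strong", "em", "br")) {
            TAGS.put(tag, Set.of());
        }
        TAGS.put("a", Set.of("href", "title"));
        TAGS.put("img", Set.of("alt", "src"));
    }

    // True if the tag is on the list (case-insensitive).
    static boolean isAllowedTag(String tag) {
        return TAGS.containsKey(tag.toLowerCase(Locale.ROOT));
    }

    // True if the attribute is allowed on this tag; unknown tags allow nothing.
    static boolean isAllowedAttribute(String tag, String attribute) {
        Set<String> specific = TAGS.get(tag.toLowerCase(Locale.ROOT));
        if (specific == null) {
            return false;
        }
        String name = attribute.toLowerCase(Locale.ROOT);
        return GLOBAL_ATTRIBUTES.contains(name) || specific.contains(name);
    }

    public static void main(String[] args) {
        System.out.println(isAllowedTag("script"));             // false
        System.out.println(isAllowedAttribute("a", "href"));    // true
        System.out.println(isAllowedAttribute("a", "onclick")); // false
    }
}
```

Anything not found in the table is dropped, so new tags or event-handler attributes are rejected by default.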

And the following CSS rules:
color, font, font-size, font-weight, font-style, text-decoration
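
Filtering the style attribute down to the properties listed above might look roughly like the sketch below. StyleFilter and its methods are hypothetical names, and the value check is a deliberately conservative assumption: it rejects any value containing characters such as parentheses, quotes, colons, slashes, or backslashes, so constructs like url(...) or expression(...) never pass, at the cost of also rejecting some legitimate font values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; names and the value policy are assumptions.
final class StyleFilter {
    private static final Set<String> ALLOWED_PROPERTIES = Set.of(
        "color", "font", "font-size", "font-weight", "font-style", "text-decoration");

    // Keeps only declarations with an allowed property and a plain value.
    static String filter(String style) {
        List<String> kept = new ArrayList<>();
        for (String declaration : style.split(";")) {
            int colon = declaration.indexOf(':');
            if (colon < 0) {
                continue; // not a "property: value" pair
            }
            String property = declaration.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = declaration.substring(colon + 1).trim();
            if (ALLOWED_PROPERTIES.contains(property) && isPlainValue(value)) {
                kept.add(property + ": " + value);
            }
        }
        return String.join("; ", kept);
    }

    // Accepts only ASCII letters, digits, space, and the characters # % . , -
    static boolean isPlainValue(String value) {
        if (value.isEmpty()) {
            return false;
        }
        for (char c : value.toCharArray()) {
            boolean ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || "#%.,- ".indexOf(c) >= 0;
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // prints "color: red; font-size: 12px"
        System.out.println(filter("color: red; position: absolute; font-size: 12px"));
    }
}
```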

These cover everything I need for formatting and (as far as I know) don't present any security risk. Basically, enforcing well-formedness and leaving out any layout styles prevents anyone from breaking the layout of the site. Disallowing the script tag and the like prevents XSS.
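
One detail a tag list does not cover by itself is attribute values: href and src can carry javascript: or data: URLs, so the URL scheme probably needs its own check. The sketch below is one possible approach using only java.net.URI; the class name UrlCheck, the choice of allowed schemes (http, https, mailto), and the decision to reject relative and unparseable URLs are all assumptions made for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the scheme list is an assumption, not a recommendation.
final class UrlCheck {
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https", "mailto");

    // True only for URLs that parse cleanly and use an allowed scheme.
    static boolean isSafeUrl(String raw) {
        try {
            URI uri = new URI(raw.trim());
            String scheme = uri.getScheme();
            // Assumption: scheme-less (relative) URLs are rejected as well.
            return scheme != null
                && ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Anything the parser refuses (embedded tabs, spaces, ...) is dropped.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSafeUrl("http://example.com/"));  // true
        System.out.println(isSafeUrl("JaVaScRiPt:alert(1)"));  // false
    }
}
```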
(One exception: maybe I should allow width/height in a predefined range for images.)
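
Restricting width and height to a predefined range could be sketched as below. The bounds (16 to 800 pixels) and the name ImageSize are hypothetical; the sketch assumes the values arrive as plain integer pixel counts and discards anything else.

```java
import java.util.OptionalInt;

// Hypothetical helper; the bounds are placeholder values.
final class ImageSize {
    static final int MIN_PIXELS = 16;
    static final int MAX_PIXELS = 800;

    // Returns the value clamped into [MIN_PIXELS, MAX_PIXELS],
    // or empty if the input is not a plain integer.
    static OptionalInt clamp(String raw) {
        try {
            int value = Integer.parseInt(raw.trim());
            return OptionalInt.of(Math.max(MIN_PIXELS, Math.min(MAX_PIXELS, value)));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(clamp("5000").getAsInt()); // 800
    }
}
```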

Another advantage: this would save me from having to write or find a BBCode-to-HTML converter.

What do you think?
Is this a secure thing to do?

(As far as I can see, Stack Overflow also allows some basic HTML in the “About Me” field, so I don't think I'm the first one to implement this.)

EDIT:

I found this answer which explains how to do this fairly easily.
And of course, no one should think about using regex for this.

The question itself is not tied to any language or technology, but in case you are wondering, I am writing this application in ASP.NET.

1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 10:17 pm

    It’s unclear which programming language you’re using or prefer, but in Java there’s jsoup, a pretty slick HTML parser API that includes, among other things, an HTML cleaner based on a customizable whitelist of HTML tags and attributes (unfortunately not CSS rules, since those are completely outside the scope of an HTML parser). Here’s the relevant extract from its site.

    Sanitize untrusted HTML

    Problem

    You want to allow untrusted users to supply HTML for output on your website (e.g. as comment submission). You need to clean this HTML to avoid cross-site scripting (XSS) attacks.

    Solution

    Use the jsoup HTML Cleaner with a configuration specified by a Whitelist.

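    import org.jsoup.Jsoup;
    import org.jsoup.safety.Whitelist;

    // Untrusted input carrying an inline event handler.
    String unsafe =
        "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
    String safe = Jsoup.clean(unsafe, Whitelist.basic());
    // now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>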

    The Whitelist class itself contains several predefined whitelists which may be of use, such as Whitelist#basic() and Whitelist#relaxed(). (Newer jsoup releases rename this class to Safelist.)

    For .NET, there is also a jsoup port called NSoup.

© 2021 The Archive Base. All Rights Reserved