The Archive Base

Editorial Team
Asked: May 27, 2026 (2026-05-27T19:30:53+00:00)


When receiving user input on forms, I want to detect whether fields like “username” or “address” do not contain markup that has a special meaning in XML (RSS feeds) or (X)HTML (when displayed).

So which of these is the correct way to detect whether the input entered doesn’t contain any special characters in HTML and XML context?

if (mb_strpos($data, '<') === FALSE AND mb_strpos($data, '>') === FALSE)

or

if (htmlspecialchars($data, ENT_NOQUOTES, 'UTF-8') === $data)

or

if (preg_match("/[^\p{L}\-.']/u", $data)) // problem: also catches symbols
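
To see how the three checks differ, they can be run side by side. This is only a sketch: the helper names (`no_angle_brackets`, `unchanged_by_escaping`, `only_letters`) and the sample inputs are assumptions made for illustration, not part of the original question.

```php
<?php
// Hypothetical helpers wrapping the three checks from the question.

// Check 1: the input contains neither "<" nor ">".
function no_angle_brackets(string $data): bool
{
    return mb_strpos($data, '<') === false && mb_strpos($data, '>') === false;
}

// Check 2: escaping changes nothing, so the input has no "<", ">" or "&".
function unchanged_by_escaping(string $data): bool
{
    return htmlspecialchars($data, ENT_NOQUOTES, 'UTF-8') === $data;
}

// Check 3: the input consists only of letters, hyphens, dots and apostrophes.
function only_letters(string $data): bool
{
    return preg_match("/[^\p{L}\-.']/u", $data) === 0;
}

// Illustrative inputs; the three columns are check 1, check 2, check 3.
$samples = ["O'Brien", 'Tom & Jerry', '<b>bold</b>', '12 Main St.'];
foreach ($samples as $sample) {
    printf(
        "%-14s %d %d %d\n",
        $sample,
        no_angle_brackets($sample),
        unchanged_by_escaping($sample),
        only_letters($sample)
    );
}
```

The second check is stricter than the first because it also rejects a bare `&`, and the third rejects digits and spaces, which is probably too strict for an address field.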

Have I missed anything else, like byte sequences or other tricky ways to get markup tags around things like “javascript:”? As far as I’m aware, all XSS and CSRF attacks require < or > around the values to get the browser to execute the code (at least from Internet Explorer 6 onward) – is this correct?

I am not looking for something to reduce or filter input. I just want to locate dangerous character sequences when used in XML or HTML context. (strip_tags() is horribly unsafe. As the manual says, it doesn’t check for malformed HTML.)

Update

I think I need to clarify that a lot of people are mistaking this question for a question about basic security via “escaping” or “filtering” dangerous characters. This is not that question, and most of the simple answers given wouldn’t solve that problem anyway.

Update 2: Example

  • User submits input
  • if (mb_strpos($data, '<') === FALSE AND mb_strpos($data, '>') === FALSE)
  • I save it

Now that the data is in my application I do two things with it: 1) display it in a format like HTML, or 2) display it inside a form element for editing.

The first one is safe in XML and HTML context:

<h2><?php print $input; ?></h2>
<xml><item><?php print $input; ?></item></xml>

The second form is more dangerous, but it should still be safe:

<input value="<?php print htmlspecialchars($input, ENT_QUOTES, 'UTF-8');?>">
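
The `ENT_QUOTES` flag matters in this second form because an attribute value can be closed with a quote, with no angle brackets involved. A minimal sketch, assuming a made-up payload that is not taken from the question:

```php
<?php
// Hypothetical input: no "<" or ">", so the angle-bracket check accepts it.
$input = '" onfocus="alert(1)" autofocus="';

$passesCheck = mb_strpos($input, '<') === false && mb_strpos($input, '>') === false;

// Printed raw, the first quote closes the value attribute and the rest
// of the input becomes new attributes on the element.
$rawField = '<input value="' . $input . '">';

// With ENT_QUOTES the quotes become &quot; and stay inside the value.
$escapedField = '<input value="' . htmlspecialchars($input, ENT_QUOTES, 'UTF-8') . '">';

var_dump($passesCheck);
echo $rawField, "\n";
echo $escapedField, "\n";
```

So the angle-bracket check alone only covers the element-content case; the attribute case still depends on escaping at output time.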

Update 3: Working Code

You can download the gist I created and run the code as a text or HTML response to see what I’m talking about. This simple check passes the http://ha.ckers.org XSS Cheat Sheet, and I can’t find anything that makes it through. (I’m ignoring Internet Explorer 6 and below.)

I started another bounty to award someone that can show a problem with this approach or a weakness in its implementation.

Update 4: Ask a DOM

It’s the DOM that we want to protect – so why not just ask it? Timur’s answer led to this:

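function not_markup($string)
{
    // Keep libxml parse errors internal instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string("<root>$string</root>");
    // Compare with false explicitly: an empty <root> element is a falsy object.
    if ($xml !== false)
    {
        // No child elements under <root> means the input added no tags.
        return $xml->children()->count() === 0;
    }
    // Not well-formed once wrapped (unclosed tag, bare "&", ...): treat as markup.
    return false;
}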

if (not_markup($_POST['title'])) ...
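
A few sample inputs show how a check of this kind behaves. The sketch below assumes the SimpleXML extension is available; `contains_no_markup` is a hypothetical stand-alone variant of `not_markup()` with an explicit failure branch, and the inputs are made up.

```php
<?php
// Hypothetical variant of not_markup() that always returns a boolean.
function contains_no_markup(string $string): bool
{
    // Keep libxml parse errors internal instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string("<root>$string</root>");
    libxml_clear_errors();

    if ($xml === false) {
        // Not well-formed once wrapped, e.g. an unclosed tag or a bare "&".
        return false;
    }

    // Well-formed: accept only if the input added no child elements.
    return $xml->children()->count() === 0;
}

$samples = ['This is a safe text', '<b>bold</b>', '<IMG SRC=javascript:alert(1)>', 'AT&T', ''];
foreach ($samples as $sample) {
    echo var_export(contains_no_markup($sample), true), "  ", $sample, "\n";
}
```

Note that `AT&T` is rejected: a bare ampersand is not well-formed XML, so the check is stricter than "contains no tags".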
1 Answer

  1. Editorial Team
    Added an answer on May 27, 2026 at 7:30 pm (2026-05-27T19:30:54+00:00)

    I don’t think you need to implement a huge algorithm to check whether a string contains unsafe data; filters and regular expressions do the job. But if you need a more thorough check, maybe this will fit your needs:

    <?php
    $strings = array();
    $strings[] = <<<EOD
        ';alert(String.fromCharCode(88,83,83))//\';alert(String.fromCharCode(88,83,83))//";alert(String.fromCharCode(88,83,83))//\";alert(String.fromCharCode(88,83,83))//--></SCRIPT>">'><SCRIPT>alert(String.fromCharCode(88,83,83))</SCRIPT>
    EOD;
    $strings[] = <<<EOD
        '';!--"<XSS>=&{()}
    EOD;
    $strings[] = <<<EOD
        <SCRIPT SRC=http://ha.ckers.org/xss.js></SCRIPT>
    EOD;
    $strings[] = <<<EOD
        This is a safe text
    EOD;
    $strings[] = <<<EOD
        <IMG SRC="javascript:alert('XSS');">
    EOD;
    $strings[] = <<<EOD
        <IMG SRC=javascript:alert('XSS')>
    EOD;
    $strings[] = <<<EOD
        <IMG SRC=&#106;&#97;&#118;&#97;&#115;&#99;&#114;&#105;&#112;&#116;&#58;&#97;&#108;&#101;&#114;&#116;&#40;&#39;&#88;&#83;&#83;&#39;&#41;>
    EOD;
    $strings[] = <<<EOD
        perl -e 'print "<IMG SRC=java\0script:alert(\"XSS\")>";' > out
    EOD;
    $strings[] = <<<EOD
        <SCRIPT/XSS SRC="http://ha.ckers.org/xss.js"></SCRIPT>
    EOD;
    $strings[] = <<<EOD
        </TITLE><SCRIPT>alert("XSS");</SCRIPT>
    EOD;
    
    
    
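    // Keep libxml parse errors internal instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $sourceXML = '<root><element>value</element></root>';
    $sourceXMLDocument = simplexml_load_string($sourceXML);
    // Baseline: the untouched wrapper has exactly one child under <root>.
    $sourceCount = $sourceXMLDocument->children()->count();

    foreach ($strings as $string) {
        $unsafe = false;
        $XML = '<root><element>' . $string . '</element></root>';
        $XMLDocument = simplexml_load_string($XML);
        if ($XMLDocument === false) {
            // Malformed XML (unclosed tag, bare "&", ...) is treated as unsafe.
            $unsafe = true;
        } else {
            // Unsafe if the input broke out of <element> (root child count changed)
            // or added well-formed child elements inside <element>.
            $count = $XMLDocument->children()->count();
            if ($count !== $sourceCount || $XMLDocument->element->children()->count() !== 0) {
                $unsafe = true;
            }
        }

        echo ($unsafe ? 'Unsafe' : 'Safe') . ': <pre>' . htmlspecialchars($string, ENT_QUOTES, 'UTF-8') . '</pre><br />' . "\n";
    }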
    ?>
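
One case that may be worth testing with this approach: XML comments and CDATA sections are not element nodes, so a check based on counting child elements does not see them. The sketch below assumes the SimpleXML extension; `has_no_child_elements` is a hypothetical helper that applies the child-count idea to a single string, and the inputs are made up.

```php
<?php
// Hypothetical helper: the child-count idea from the code above as a function.
function has_no_child_elements(string $string): bool
{
    // Keep libxml parse errors internal instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string('<root><element>' . $string . '</element></root>');
    libxml_clear_errors();

    if ($xml === false) {
        return false; // not well-formed once wrapped
    }

    // Exactly one child under <root>, and no elements inside <element>.
    return $xml->children()->count() === 1
        && $xml->element->children()->count() === 0;
}

$samples = [
    'plain text',
    '<b>bold</b>',
    '<!-- a comment -->',
    '<![CDATA[<b>bold</b>]]>',
];
foreach ($samples as $sample) {
    echo var_export(has_no_child_elements($sample), true), "  ", $sample, "\n";
}
```

The comment and the CDATA section both pass even though they contain `<` and `>`. Whether that matters depends on where the value is printed later, so such a check should probably be combined with escaping at output time.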
    
© 2021 The Archive Base. All Rights Reserved