The Archive Base

Editorial Team
Asked: June 11, 2026

This regular expression is supposed to match all non-ASCII characters, i.e. everything outside code points 0-127: /[^\x00-\x7F]/i


Imagine I want to test (just out of curiosity) this regular expression with all Unicode characters, 0-1114111 code points.

Generating this range may be simple with range(0, 1114111). Then I should convert each decimal number to hexadecimal with the dechex() function.

After that, how can I convert the hexadecimal number to the actual character? And how can I exclude the characters that are already in ASCII?

1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 7:23 pm

    It depends on how you are going to do the matching and whether you are going to put the PCRE regex engine into UTF-8 mode with the /u modifier.

    If you do use the /u modifier, then you must use UTF-8 encoding for both the regular expression and the subject, and the regex engine will automatically interpret each legal UTF-8 byte sequence as a single character. In this mode the regular expression [^\x00-\x7F] will match every character outside the ASCII range (code points above 127), including those in the Latin-1 Supplement block and those with code points greater than 255. You will also need to generate the UTF-8 representation of each character from its code point, either by hand or, on PHP 7.2 and later, with mb_chr().
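To make the /u case concrete, here is a minimal sketch of the test loop the question describes. The helpers codepoint_to_utf8() and count_mismatches() are hypothetical names made up for this example, not built-ins. Note that no dechex() step is needed: the integer code point can be encoded directly, and a plain for loop avoids building a million-element array with range(). Surrogates (U+D800 to U+DFFF) are skipped because they have no valid UTF-8 form.

```php
<?php
// Hypothetical helper (not from the original answer): encode a single
// Unicode code point as a UTF-8 byte string, or return null when the
// value is a surrogate or lies outside 0..0x10FFFF.
function codepoint_to_utf8(int $cp): ?string
{
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        return null;
    }
    if ($cp < 0x80) {                // 1 byte:  0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))   // 4 bytes: 11110xxx followed by three 10xxxxxx
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: count the code points in [$from, $to] for which the
// /u-mode regex disagrees with the plain check "code point > 0x7F".
function count_mismatches(int $from, int $to): int
{
    $mismatches = 0;
    for ($cp = $from; $cp <= $to; $cp++) {
        $char = codepoint_to_utf8($cp);
        if ($char === null) {
            continue; // surrogates cannot be encoded, so skip them
        }
        // Single quotes keep the backslashes intact for PCRE.
        $matched = preg_match('/[^\x00-\x7F]/u', $char) === 1;
        if ($matched !== ($cp > 0x7F)) {
            $mismatches++;
        }
    }
    return $mismatches;
}

echo count_mismatches(0, 0x10FFFF), " mismatches\n";
```

The /i flag from the question is left out here, since case-insensitivity should make no difference to a pure code point range check.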

    If you do not use the /u modifier, then the regex engine will be dumb: it will consider each byte as a separate character, which means that you have to work at the byte level rather than the character level. On the other hand, you will now be able to work with any encoding you prefer. However, you will have to ditch the [^\x00-\x7F] regex (because it would match individual bytes of value 0x80 or above, not whole characters) and work with a regular expression that embodies the rules of your chosen encoding, such as the byte patterns of well-formed UTF-8 sequences. To generate the encoded forms of arbitrary characters you will again need custom code that depends on the specific encoding.
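As a rough illustration of the byte-level approach, the sketch below compares the original character class with a pattern that spells out the well-formed multi-byte UTF-8 sequences. The constant name NON_ASCII_UTF8 is made up for this example, and the pattern is adapted from the widely circulated UTF-8 validation regex, so treat it as an assumption to verify rather than a definitive rule set.

```php
<?php
// Byte-level pattern (no /u): one well-formed multi-byte UTF-8 sequence,
// which corresponds to one non-ASCII character. The /x flag allows the
// whitespace and the # comments inside the pattern.
const NON_ASCII_UTF8 = '/
      [\xC2-\xDF][\x80-\xBF]             # 2-byte, no overlongs
    | \xE0[\xA0-\xBF][\x80-\xBF]         # 3-byte, no overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # 3-byte, straight
    | \xED[\x80-\x9F][\x80-\xBF]         # 3-byte, no surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}      # 4-byte, planes 1 to 3
    | [\xF1-\xF3][\x80-\xBF]{3}          # 4-byte, planes 4 to 15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}      # 4-byte, plane 16
/x';

// "a", U+00E9 and U+20AC, written out as raw UTF-8 bytes.
$subject = "a\xC3\xA9\xE2\x82\xAC";

// Without /u every byte is a "character", so the original class counts
// each byte >= 0x80 separately.
$byteMatches = preg_match_all('/[^\x00-\x7F]/', $subject);

// The encoding-aware pattern counts whole characters instead.
$charMatches = preg_match_all(NON_ASCII_UTF8, $subject);

echo "bytes: $byteMatches, characters: $charMatches\n"; // bytes: 5, characters: 2
```

For an encoding other than UTF-8 the alternatives would have to be rewritten to follow that encoding's own byte rules.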

