The Archive Base

Editorial Team
Asked: June 10, 2026

In .NET is there a way to enumerate all the values for \w?

As for why: I am parsing words from unknown files, and I will come across some files whose embedded content is anything but standard. See the sample below.

“PK!RýëÙ*[Content_Types].xml ¢( Ì?ÍNã0?÷Hó?·£Æ530̨)?Y!@?ycß6VÛò5о=7)T*­””””áM«üø?ïºÕ?Ïä|ÙØâ” “ï*&Ê1+À)¯?Wìÿý¿Ñ+0I§¥õ*¶dçÓoG?ûU,hµÃ?Õ)???£ª¡?Xú??Ì|ld¢Ë8çAª???O¹ò.K£Ôj°éä/Ìä£MÅå?n¯I?cÅÅú½Öªb2k?LÊ??~g2ò³?Q ½zlHºÄAj¬RcË 9Æ;H?CÆwzF°ØÏôuª?Vv`X??ßiôÚ’Oõºî?~?h4·2¦kÙÐì|iù³?ïå~?¾[ÓmQÙHãÞ¸÷øw/#ï¾ÄÀ í|pO?ãL8~dÂñ3??L8N3áø? ÇY&¿3áã\@rIT?K¤?\2Uäª?T¹ÄªÈ%WÅW+Щ9:i¯?[

I think this was an output-to-printer file.

I need to somehow eliminate what I am calling trash words. It does not need to be perfect. The plan is to flag documents containing trash words that were left out of the index, so the user has an easy way to review them manually.

What I may end up doing is counting against a list of safe chars (a,b,c,…). For example, a word must have at least one safe char, or more than half of its chars must be safe, to be kept. I want to keep Café. Trash words tend to be all trash. This is a trash word, ª’_LLýú, that happens to have some safe chars.

At this point I am evaluating the battlefield.

The nature of the business is that we may intentionally get sent trash files.

In case anyone cares, I went with:

rSafeChar = new Regex(@"[-_'@A-Za-z0-9]");

Toying with safeCharCount > unsafeCharCount or safeCharCount >= unsafeCharCount
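
The counting approach above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `TrashWordFilter` and the method `IsTrashWord` are hypothetical, treating an empty word as trash is an assumption, and the strict `>` comparison is just one of the two options being weighed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TrashWordFilter
{
    // Safe-character class taken from the question.
    private static readonly Regex SafeChar = new Regex(@"[-_'@A-Za-z0-9]");

    // Hypothetical helper: a word is kept only when its safe chars
    // outnumber its unsafe chars (the strict ">" variant).
    public static bool IsTrashWord(string word)
    {
        // Assumption: an empty or null word carries no content, so treat it as trash.
        if (string.IsNullOrEmpty(word)) return true;

        // Each match is exactly one char, so the match count is the safe-char count.
        int safeCharCount = SafeChar.Matches(word).Count;
        int unsafeCharCount = word.Length - safeCharCount;

        return !(safeCharCount > unsafeCharCount);
    }
}
```

With this comparison, Café is kept (3 safe chars against 1 unsafe) while ª’_LLýú is flagged (3 safe chars against 4 unsafe). Switching to `>=` would additionally keep words that are exactly half safe.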

1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 4:18 pm

    To check what \w can match, one could run the following regex against a string containing the whole ASCII table:

    (?:(?<wmatch>\w)*(?<wnotmatch>[^\w]*))*
    

    The resulting groups should contain the list of characters matched and not matched by \w.

    Here is an example:

    // Needs: using System; using System.Text.RegularExpressions; using System.Windows.Forms;
    private void TestMatch()
    {
      // Sample input; substitute a string holding the full ASCII table to test every character.
      string ascii = "abcdef0934+_!1@_$14-195djsjfke1058446541";
      Regex r = new Regex(@"(?:(?<wmatch>\w)*(?<wnotmatch>[^\w]*))*");
      Match m = r.Match(ascii);
      if (m.Success)
      {
        string msg = "\\w matches :";
        foreach (Capture cap in m.Groups["wmatch"].Captures)
        {
          msg += cap.Value + ", ";
        }
        msg += Environment.NewLine + "\\w does not match: ";
        foreach (Capture cap in m.Groups["wnotmatch"].Captures)
        {
          // [^\w]* can match the empty string, so skip empty captures.
          if (cap.Length == 0) continue;
          msg += cap.Value + ", ";
        }
        MessageBox.Show(msg);
      }
    }

    Shows:

    \w matches :a, b, c, d, e, f, 0, 9, 3, 4, _, 1, _, 1, 4, 1, 9, 5, d, j, s, j, f, k, e, 1, 0, 5, 8, 4, 4, 6, 5, 4, 1, 
    \w does not match: +, !, @, $, -, 
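
To go beyond a hand-written sample string, one could test every UTF-16 code unit against \w directly. The sketch below is an illustration, not a definitive implementation: the class `WordCharEnumerator` and the method `EnumerateWordChars` are hypothetical names. It covers only the Basic Multilingual Plane, since a .NET `char` is a single UTF-16 code unit. Note that by default \w in .NET is Unicode-aware (letters, decimal digits, non-spacing marks, and connector punctuation), so the list is much longer than the ASCII subset.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordCharEnumerator
{
    // Hypothetical helper: returns every char (UTF-16 code unit) that \w matches
    // under the given regex options.
    public static List<char> EnumerateWordChars(RegexOptions options = RegexOptions.None)
    {
        var wordChar = new Regex(@"\w", options);
        var result = new List<char>();

        // Loop over an int so the counter can pass char.MaxValue without wrapping.
        for (int code = char.MinValue; code <= char.MaxValue; code++)
        {
            char c = (char)code;
            if (wordChar.IsMatch(c.ToString()))
            {
                result.Add(c);
            }
        }
        return result;
    }
}
```

Calling `EnumerateWordChars()` returns the Unicode-aware set, which includes characters such as é. Passing `RegexOptions.ECMAScript` restricts \w to `[a-zA-Z_0-9]`, which may be closer to a hand-picked list of safe chars.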
    