The Archive Base

Editorial Team
Asked: June 2, 2026

I can search for a single CJK character (such as 小) by using its Unicode code point:

/\%u5c0f
/[\u5c0f]

I cannot search for all CJK characters by using [\u4E00-\u9FFF], because the Vim manual says:

:help /[]
NOTE: The other backslash codes mentioned above do not work inside []!

Is there a way to do the job?

1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 10:30 pm

    Vim cannot do this on its own, because its regexes give you no access to Unicode properties such as \p{Han}.

    As of Unicode v6.0, the ranges of code points for characters in the Han script are:

    2E80-2E99 2E9B-2EF3 2F00-2FD5 3005-3005 3007-3007 3021-3029 3038-303B 3400-4DB5 4E00-9FCB F900-FA2D FA30-FA6D FA70-FAD9 20000-2A6D6 2A700-2B734 2B740-2B81D 2F800-2FA1D 
    

    With Unicode v6.1, the ranges of Han code points changed to:

    2E80-2E99 2E9B-2EF3 2F00-2FD5 3005-3005 3007-3007 3021-3029 3038-303B 3400-4DB5 4E00-9FCC F900-FA6D FA70-FAD9 20000-2A6D6 2A700-2B734 2B740-2B81D 2F800-2FA1D 
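
The difference between the two lists is easy to miss by eye. As a rough sketch (the helper name `parse_ranges` and the constants are made up for this illustration, and the data is just the two lists quoted above), the ranges can be expanded into sets and compared:

```python
def parse_ranges(text):
    """Expand space-separated 'START-STOP' hex pairs into a set of code points."""
    points = set()
    for pair in text.split():
        start, stop = (int(h, 16) for h in pair.split("-"))
        points.update(range(start, stop + 1))
    return points


# The two lists quoted in the answer (Unicode 6.0 and 6.1, Han script).
HAN_6_0 = ("2E80-2E99 2E9B-2EF3 2F00-2FD5 3005-3005 3007-3007 3021-3029 "
           "3038-303B 3400-4DB5 4E00-9FCB F900-FA2D FA30-FA6D FA70-FAD9 "
           "20000-2A6D6 2A700-2B734 2B740-2B81D 2F800-2FA1D")
HAN_6_1 = ("2E80-2E99 2E9B-2EF3 2F00-2FD5 3005-3005 3007-3007 3021-3029 "
           "3038-303B 3400-4DB5 4E00-9FCC F900-FA6D FA70-FAD9 "
           "20000-2A6D6 2A700-2B734 2B740-2B81D 2F800-2FA1D")

added = sorted(parse_ranges(HAN_6_1) - parse_ranges(HAN_6_0))
removed = sorted(parse_ranges(HAN_6_0) - parse_ranges(HAN_6_1))

print("added:  ", " ".join(f"U+{cp:04X}" for cp in added))
print("removed:", " ".join(f"U+{cp:04X}" for cp in removed))
```

For these two lists it reports U+9FCC, U+FA2E and U+FA2F as added and nothing as removed, which is why a hard-coded range list goes stale whenever the Unicode version moves.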
    

    I also seem to recall that Vim has difficulties expressing astral code points, which are needed for this to work correctly. For example, using the flexible \x{HHHHHH} notation from Java 7 or Perl, you would have:

    [\x{2E80}-\x{2E99}\x{2E9B}-\x{2EF3}\x{2F00}-\x{2FD5}\x{3005}-\x{3005}\x{3007}-\x{3007}\x{3021}-\x{3029}\x{3038}-\x{303B}\x{3400}-\x{4DB5}\x{4E00}-\x{9FCC}\x{F900}-\x{FA6D}\x{FA70}-\x{FAD9}\x{20000}-\x{2A6D6}\x{2A700}-\x{2B734}\x{2B740}-\x{2B81D}\x{2F800}-\x{2FA1D}]
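
A class like this is tedious to type by hand. The following is a minimal sketch of generating it from the range list; the function names are invented for this example. The `vim_class` variant assumes the `\uXXXX` and `\UXXXXXXXX` escapes that `:help /[]` lists as accepted inside `[]`. Whether a particular Vim build or regex engine accepts ranges this wide, or the astral ones, has not been tested here.

```python
HAN_6_1 = ("2E80-2E99 2E9B-2EF3 2F00-2FD5 3005-3005 3007-3007 3021-3029 "
           "3038-303B 3400-4DB5 4E00-9FCC F900-FA6D FA70-FAD9 "
           "20000-2A6D6 2A700-2B734 2B740-2B81D 2F800-2FA1D")


def parse_pairs(text):
    """Turn space-separated 'START-STOP' hex pairs into (start, stop) ints."""
    return [tuple(int(h, 16) for h in pair.split("-")) for pair in text.split()]


def braced_class(pairs):
    """Bracket expression in the \\x{HHHH} notation used by Perl and Java 7."""
    body = "".join(f"\\x{{{lo:04X}}}-\\x{{{hi:04X}}}" for lo, hi in pairs)
    return f"[{body}]"


def vim_escape(cp):
    """Assumed Vim form: \\uXXXX up to U+FFFF, \\UXXXXXXXX above that."""
    return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"


def vim_class(pairs):
    """Bracket expression using the escapes Vim documents for use inside []."""
    body = "".join(f"{vim_escape(lo)}-{vim_escape(hi)}" for lo, hi in pairs)
    return f"[{body}]"


pairs = parse_pairs(HAN_6_1)
print(braced_class(pairs))
print(vim_class(pairs))
```

The escapes are zero-padded to their full width so that a hex digit following an escape cannot be read as part of it.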
    

    Notice that the last range, \x{2F800}-\x{2FA1D}, lies beyond the BMP, as do the three ranges before it. What you really need, though, is \p{Han} (that is, \p{Script=Han}). This again shows that regex dialects that do not support at least Level 1 of UTS #18, Basic Unicode Support, are inadequate for working with Unicode, and Vim's regexes fall into that category.
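
To illustrate the fallback in an engine without script properties, here is a small sketch using Python's standard `re` module, which also has no `\p{Han}`. The name `han_pattern` is invented for this example, and the range list is the Unicode 6.1 list from above, so it is only as current as that list.

```python
import re

HAN_6_1 = ("2E80-2E99 2E9B-2EF3 2F00-2FD5 3005-3005 3007-3007 3021-3029 "
           "3038-303B 3400-4DB5 4E00-9FCC F900-FA6D FA70-FAD9 "
           "20000-2A6D6 2A700-2B734 2B740-2B81D 2F800-2FA1D")


def han_pattern(ranges_text):
    """Compile a character class covering the given 'START-STOP' hex ranges."""
    parts = []
    for pair in ranges_text.split():
        lo, hi = (int(h, 16) for h in pair.split("-"))
        # Literal characters as range endpoints; none in these ranges
        # is special inside a character class.
        parts.append(f"{chr(lo)}-{chr(hi)}")
    return re.compile("[" + "".join(parts) + "]")


HAN = han_pattern(HAN_6_1)

print(HAN.findall("abc小大def"))          # the two Han characters
print(bool(HAN.search(chr(0x20000))))    # an astral code point in the list
print(bool(HAN.search("abc")))           # no Han characters here
```

This works, but the pattern carries a frozen copy of the Unicode data, which is the maintenance burden a real `\p{Han}` avoids.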


    EDITED TO ADD

    Here’s the program that dumps out the ranges of code points that apply to any given Unicode script.

    #!/usr/bin/env perl
    #
    # uniscrange - given a Unicode script name, print out the ranges of code 
    #              points that apply.
    # Tom Christiansen <tchrist@perl.com>
    
    use strict;
    use warnings;
    
    use Unicode::UCD qw(charscript);
    
    for my $arg (@ARGV) {
        print "$arg: " if @ARGV > 1;
        dump_range($arg);
    }
    
    sub dump_range {
        my($scriptname) = @_;
    
        my $alist = charscript($scriptname);
        unless ($alist) {
            warn "Unknown script '$scriptname'\n";
            return;
        }
    
        for my $aref (@$alist) {
            my($start, $stop, $name) = @$aref;
            die "got $name, not $scriptname\n" unless $name eq $scriptname;
            printf "%04X-%04X ", $start, $stop;
        }
        print "\n";
    
    }
    

    Its answers depend on which version of Perl, and thus which version of Unicode, you're running it against.

    $ perl5.8.8 ~/uniscrange Latin Greek
    Latin: 0041-005A 0061-007A 00AA-00AA 00BA-00BA 00C0-00D6 00D8-00F6 00F8-01BA 01BB-01BB 01BC-01BF 01C0-01C3 01C4-0241 0250-02AF 02B0-02B8 02E0-02E4 1D00-1D25 1D2C-1D5C 1D62-1D65 1D6B-1D77 1D79-1D9A 1D9B-1DBF 1E00-1E9B 1EA0-1EF9 2071-2071 207F-207F 2090-2094 212A-212B FB00-FB06 FF21-FF3A FF41-FF5A 
    Greek: 0374-0375 037A-037A 0384-0385 0386-0386 0388-038A 038C-038C 038E-03A1 03A3-03CE 03D0-03E1 03F0-03F5 03F6-03F6 03F7-03FF 1D26-1D2A 1D5D-1D61 1D66-1D6A 1F00-1F15 1F18-1F1D 1F20-1F45 1F48-1F4D 1F50-1F57 1F59-1F59 1F5B-1F5B 1F5D-1F5D 1F5F-1F7D 1F80-1FB4 1FB6-1FBC 1FBD-1FBD 1FBE-1FBE 1FBF-1FC1 1FC2-1FC4 1FC6-1FCC 1FCD-1FCF 1FD0-1FD3 1FD6-1FDB 1FDD-1FDF 1FE0-1FEC 1FED-1FEF 1FF2-1FF4 1FF6-1FFC 1FFD-1FFE 2126-2126 10140-10174 10175-10178 10179-10189 1018A-1018A 1D200-1D241 1D242-1D244 1D245-1D245
    
    $ perl5.10.0 ~/uniscrange Latin Greek
    Latin: 0041-005A 0061-007A 00AA-00AA 00BA-00BA 00C0-00D6 00D8-00F6 00F8-01BA 01BB-01BB 01BC-01BF 01C0-01C3 01C4-0293 0294-0294 0295-02AF 02B0-02B8 02E0-02E4 1D00-1D25 1D2C-1D5C 1D62-1D65 1D6B-1D77 1D79-1D9A 1D9B-1DBE 1E00-1E9B 1EA0-1EF9 2071-2071 207F-207F 2090-2094 212A-212B 2132-2132 214E-214E 2184-2184 2C60-2C6C 2C74-2C77 FB00-FB06 FF21-FF3A FF41-FF5A 
    Greek: 0374-0375 037A-037A 037B-037D 0384-0385 0386-0386 0388-038A 038C-038C 038E-03A1 03A3-03CE 03D0-03E1 03F0-03F5 03F6-03F6 03F7-03FF 1D26-1D2A 1D5D-1D61 1D66-1D6A 1DBF-1DBF 1F00-1F15 1F18-1F1D 1F20-1F45 1F48-1F4D 1F50-1F57 1F59-1F59 1F5B-1F5B 1F5D-1F5D 1F5F-1F7D 1F80-1FB4 1FB6-1FBC 1FBD-1FBD 1FBE-1FBE 1FBF-1FC1 1FC2-1FC4 1FC6-1FCC 1FCD-1FCF 1FD0-1FD3 1FD6-1FDB 1FDD-1FDF 1FE0-1FEC 1FED-1FEF 1FF2-1FF4 1FF6-1FFC 1FFD-1FFE 2126-2126 10140-10174 10175-10178 10179-10189 1018A-1018A 1D200-1D241 1D242-1D244 1D245-1D245
    

    You can use the corelist -a Unicode command to see which version of Unicode goes with which version of Perl. Here is selected output:

    $ corelist -a Unicode
      v5.8.8     4.1.0     
      v5.10.0    5.0.0     
      v5.12.2    5.2.0     
      v5.14.0    6.0.0     
      v5.16.0    6.1.0     
    