I’m trying to build a regexp in ruby to match alpha characters in UTF-8

Asked: 2026-06-14T18:02:09+00:00

I’m trying to build a regexp in Ruby to match alpha characters in UTF-8 such as ñíóúü. I know /\p{Alpha}/i works and /\p{L}/i works too, but what’s the difference?

1 Answer

Editorial Team · Answer 1 · 2026-06-14T18:02:11+00:00

They seem to be equivalent. (Edit: sometimes, see the end of this answer)

Ruby seems to have supported \p{Alpha} since version 1.9. In POSIX, \p{Alpha} is equal to \p{L&} (for regular expressions with Unicode support). This matches all characters that have an upper- and lower-case variant. Unicase letters would not be matched (while they would be matched by \p{L}).

This does not seem to be true for Ruby (I picked a random Arabic character, since Arabic has a unicase alphabet):

\p{L} (any letter) matches.
Case-sensitive classes \p{Lu}, \p{Ll}, \p{Lt} don’t match. As expected.
\p{L&} doesn’t match. As expected.
\p{Alpha} matches.

These results seem to be a very good indication that \p{Alpha} is just an alias for \p{L} in Ruby. On Rubular you can also see that \p{Alpha} was not available in Ruby 1.8.7.
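The checks listed above can be reproduced with a short script. This is only a sketch: the answer does not say which Arabic character was tested, so U+0639 (ARABIC LETTER AIN) is an assumption, as are the names `ain`, `checks` and `results`. Because support for the \p{L&} spelling may vary between engines, the sketch spells out the cased-letter class as Lu, Ll or Lt instead.

```ruby
# U+0639 ARABIC LETTER AIN: a letter without upper/lower case variants.
# Assumption: any unicase letter would do; the original test character is unknown.
ain = "\u0639"

# Label => pattern, in the same order as the list above.
checks = {
  "\\p{L}"               => /\p{L}/,
  "\\p{Lu}"              => /\p{Lu}/,
  "\\p{Ll}"              => /\p{Ll}/,
  "\\p{Lt}"              => /\p{Lt}/,
  "cased (Lu, Ll or Lt)" => /[\p{Lu}\p{Ll}\p{Lt}]/, # stand-in for \p{L&}
  "\\p{Alpha}"           => /\p{Alpha}/
}

# true/false per pattern (String#match? needs Ruby 2.4 or later).
results = checks.transform_values { |re| ain.match?(re) }

results.each do |label, matched|
  puts "#{label}: #{matched ? 'matches' : 'does not match'}"
end
```

Run against a UTF-8 string, only the \p{L} and \p{Alpha} rows should report a match, which is the pattern described in the list.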

Note that the i modifier is irrelevant in any case, because both \p{Alpha} and \p{L} match both upper- and lower-case characters anyway.
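A quick way to convince yourself that the i modifier changes nothing here is to try both cases of one of the letters from the question against each pattern, with and without the flag. The variable names below are illustrative only.

```ruby
upper = "\u00D1" # Ñ
lower = "\u00F1" # ñ

# The same two classes, with and without the case-insensitive flag.
patterns = [/\p{L}/, /\p{L}/i, /\p{Alpha}/, /\p{Alpha}/i]

# True only if every pattern matches both the upper- and lower-case letter.
all_match = patterns.all? { |re| upper.match?(re) && lower.match?(re) }

puts all_match
```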

EDIT:

Aha, there is a difference! I just found this PDF about Ruby’s new regex engine (in use as of Ruby 1.9, as stated above). \p{Alpha} is available regardless of encoding (and will probably just match [A-Za-z] if there is no Unicode support), while \p{L} is specifically a Unicode property. That means \p{Alpha} behaves exactly as in POSIX regexes, with the difference that here it corresponds to \p{L}, but in POSIX it corresponds to \p{L&}.
