I need to match a string (any character or symbol other than space) with a length of 10 and at least one digit (at an unknown position) in it. What’s the easiest way to do it? Thx! (Preferably in Perl regex, but really any regex would shed light on it.)
Some sample strings that meet the requirement:
ABCD1EFGH2
AGD-D.D8HD
1414151502
[TT]88daJh
Some samples that do NOT meet the requirement:
ABCDEFGHIJ # no digit
EGEGE_(**/ # no digit
asdgja8G # too short
@#21-GDKJGDE # too long
Thx!
UPDATE: To be clear, this is a search. The input string is thousands of characters long. I need to pick out all the 10-character “words” that have a digit in them. You can think of a string that contains all 8 words above, separated by space(s) and tab(s). I would like a search that picks out the first 4 only. Thx!
UPDATE of UPDATE: Apologies for not being clear again (I wanted to simplify the case, but failed to exclude alternative interpretations). This regex search would be used as part of a longer match, e.g. a 10-char word with at least one digit followed by a 4-char word, etc… So splitting the long string as the first step wouldn’t quite work.
That was a very important clarification; finding the kind of strings you describe within a larger string is a very different task from matching standalone strings, and much more complicated. I think the easiest way to do it is with lookarounds:
(?<!\S)(?=\S{10}(?!\S))\S*\d\S*
(?<!\S) matches a position that is not preceded by a non-whitespace character.
(?=\S{10}(?!\S)) further asserts that the position is followed by exactly 10 non-whitespace characters.
Once the lookarounds are satisfied, \S*\d\S* goes ahead and consumes the string, assuming at least one of the characters is a digit.
This will work in Perl and most of the Perl-derived flavors, like Python, Java, and .NET, but not in JavaScript engines that predate ES2018, which don’t support lookbehinds.
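To make the lookaround approach concrete, here is a minimal sketch in Python (one of the flavors mentioned above). The regex is simply the three pieces described above joined together; the names WORD_WITH_DIGIT and FOLLOWED_BY_4 and the test strings are my own illustrative choices, and the "followed by a 4-char word" extension is only one assumed way to embed the pattern in a longer match.

```python
import re

# The lookbehind and lookahead fence off a standalone 10-character "word";
# \S*\d\S* then consumes it only if it contains at least one digit.
WORD_WITH_DIGIT = re.compile(r'(?<!\S)(?=\S{10}(?!\S))\S*\d\S*')

# The eight sample words from the question, separated by spaces and tabs.
text = ("ABCD1EFGH2 AGD-D.D8HD\t1414151502 [TT]88daJh  "
        "ABCDEFGHIJ EGEGE_(**/\tasdgja8G @#21-GDKJGDE")

matches = [m.group(0) for m in WORD_WITH_DIGIT.finditer(text)]
print(matches)
# ['ABCD1EFGH2', 'AGD-D.D8HD', '1414151502', '[TT]88daJh']

# Hypothetical extension: the same pattern used as part of a longer match,
# here a 10-char word with a digit followed by a 4-char word.
FOLLOWED_BY_4 = re.compile(WORD_WITH_DIGIT.pattern + r'\s+\S{4}(?!\S)')
longer = FOLLOWED_BY_4.search("xx ABCD1EFGH2 wxyz tail")
print(longer.group(0))
# ABCD1EFGH2 wxyz
```

Because the word boundaries are enforced by zero-width assertions, nothing around the word is consumed, which is what lets the pattern be concatenated with more regex on either side.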
EDIT: Here’s an example showing how to iterate through all matches in Perl:
…and here’s a live demo (which also includes the optimization discussed in the comments).
In JavaScript I’d use a slightly different regex:
Replacing the lookbehind with (?:\s|^) means I’m consuming the leading whitespace character now. To extract the word alone, I capture it with () and retrieve it with match[1]. demo
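As a rough illustration of the capture-group idea, here is a sketch written in Python only to keep the examples in one language. The pattern is my reconstruction from the description above (lookbehind swapped for (?:\s|^), word wrapped in a capturing group), so treat it as an assumption rather than the exact regex from the demo; NO_LOOKBEHIND and the sample text are illustrative names.

```python
import re

# (?:\s|^) replaces the lookbehind, so a leading whitespace character
# becomes part of the overall match; group 1 holds the word alone.
NO_LOOKBEHIND = re.compile(r'(?:\s|^)(?=\S{10}(?!\S))(\S*\d\S*)')

text = "ABCD1EFGH2 AGD-D.D8HD\tABCDEFGHIJ asdgja8G [TT]88daJh"

found = list(NO_LOOKBEHIND.finditer(text))
words = [m.group(1) for m in found]   # the words without the whitespace
whole = [m.group(0) for m in found]   # includes any consumed whitespace

print(words)
# ['ABCD1EFGH2', 'AGD-D.D8HD', '[TT]88daJh']
print(repr(whole[1]))
# ' AGD-D.D8HD'
```

The difference between group(0) and group(1) in the second match shows why the capture group is needed once the leading whitespace is consumed instead of merely asserted.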