I just gone through the concept Zero-Width Assertions from the documentation. And some quick

Asked: June 17, 2026

I just went through the concept of zero-width assertions in the documentation, and a few quick questions come to mind:

Why are they called zero-width assertions?
How do lookahead and lookbehind fit into the zero-width assertion concept?
What do the four symbols (?<=s), (?<!s), (?=s), and (?!s) instruct the engine to do inside a pattern? Can you help me understand what is actually going on?

I also tried some small snippets to understand the logic, but I am not very confident about their output:

irb(main):001:0> "foresight".sub(/(?!s)ight/, 'ee')
=> "foresee"
irb(main):002:0> "foresight".sub(/(?=s)ight/, 'ee')
=> "foresight"
irb(main):003:0> "foresight".sub(/(?<=s)ight/, 'ee')
=> "foresee"
irb(main):004:0> "foresight".sub(/(?<!s)ight/, 'ee')
=> "foresight"

Can anyone help me here to understand?

EDIT

Here I have tried two snippets, one using a zero-width assertion:

irb(main):002:0> "foresight".sub(/(?!s)ight/, 'ee')
=> "foresee"

and the other without one:

irb(main):003:0> "foresight".sub(/ight/, 'ee')
=> "foresee"

Both produce the same output. How does each regexp proceed internally to produce it? Could you help me visualize that?

Thanks

1 Answer

Editorial Team · Answer 1 · 2026-06-17T11:35:42+00:00

Regular expressions match from left to right, and move a sort of “cursor” along the string as they go. If your regex contains a regular character like a, this means: “if there’s a letter a in front of the cursor, move the cursor ahead one character, and keep going. Otherwise, something’s wrong; back up and try something else.” So you might say that a has a “width” of one character.

A “zero-width assertion” is just that: it asserts something about the string (i.e., doesn’t match if some condition doesn’t hold), but it doesn’t move the cursor forwards, because its “width” is zero.

You’re probably already familiar with some simpler zero-width assertions, like ^ and $. These match the start and end of a line (in Ruby, \A and \z are the ones that match the start and end of the whole string). If the cursor isn’t at the start or end when it sees those symbols, the regex engine will fail, back up, and try something else. But they don’t actually move the cursor forwards, because they don’t match characters; they only check where the cursor is.
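
One way to see that anchors have no width is to substitute on them: because the match covers no characters, the replacement gets inserted rather than swapped in. A small sketch, with sample strings made up for illustration:

```ruby
# Anchors match a position, not characters, so the matched text is empty.
m = /^/.match("abc")
p m[0]        # => ""
p m.begin(0)  # => 0
p m.end(0)    # => 0  (start and end coincide: zero width)

# Substituting a zero-width match inserts text without removing anything.
p "abc".sub(/^/, ">")  # => ">abc"
p "abc".sub(/$/, "!")  # => "abc!"

# In Ruby, ^ works per line, while \A anchors to the whole string.
p "ab\ncd".gsub(/^/, "> ")   # => "> ab\n> cd"
p "ab\ncd".gsub(/\A/, "> ")  # => "> ab\ncd"
```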

Lookahead and lookbehind work the same way. When the regex engine tries to match them, it checks around the cursor to see if the right pattern is ahead of or behind it, but in case of a match, it doesn’t move the cursor.

Consider:

/(?=foo)foo/.match 'foo'

This will match! The regex engine goes like this:

1. Start at the beginning of the string: |foo.
2. The first part of the regex is (?=foo). This means: only match if foo appears after the cursor. Does it? Well, yes, so we can proceed. But the cursor doesn’t move, because this is zero-width. We still have |foo.
3. Next is f. Is there an f in front of the cursor? Yes, so proceed, and move the cursor past the f: f|oo.
4. Next is o. Is there an o in front of the cursor? Yes, so proceed, and move the cursor past the o: fo|o.
5. Same thing again, bringing us to foo|.
6. We reached the end of the regex, and nothing failed, so the pattern matches.
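
The walk-through above can be checked from the MatchData offsets. This is only a sketch of how one might inspect it: the lookahead on its own leaves the cursor where it started, and the literal foo after it is what moves the cursor.

```ruby
# The lookahead alone succeeds but consumes nothing.
look_only = /(?=foo)/.match("foo")
p look_only[0]      # => ""
p look_only.end(0)  # => 0  (the cursor never moved)

# With the literal "foo" after it, the cursor ends up past all three letters.
full = /(?=foo)foo/.match("foo")
p full[0]        # => "foo"
p full.begin(0)  # => 0
p full.end(0)    # => 3
```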

On your four assertions in particular:

(?=...) is “lookahead”; it asserts that ... does appear after the cursor.
```
1.9.3p125 :002 > 'jump june'.gsub(/ju(?=m)/, 'slu')
 => "slump june" 
```
The “ju” in “jump” matches because an “m” comes next. But the “ju” in “june” doesn’t have an “m” next, so it’s left alone.

Since it doesn’t move the cursor, you have to be careful when putting anything after it. (?=a)b will never match anything, because it checks that the next character is a, then also checks that the same character is b, which is impossible.

(?<=...) is “lookbehind”; it asserts that ... does appear before the cursor.
```
1.9.3p125 :002 > 'four flour'.gsub(/(?<=f)our/, 'ive')
 => "five flour" 
```
The “our” in “four” matches because there’s an “f” immediately before it, but the “our” in “flour” has an “l” immediately before it so it doesn’t match.

Like above, you have to be careful with what you put before it. a(?<=b) will never match, because it checks that the next character is a, moves the cursor, then checks that the previous character was b.

(?!...) is “negative lookahead”; it asserts that ... does not appear after the cursor.
```
1.9.3p125 :003 > 'child children'.gsub(/child(?!ren)/, 'kid')
 => "kid children"
```
“child” matches, because what comes next is a space, not “ren”. “children” doesn’t.

This is probably the one I get the most use out of; finely controlling what can’t come next comes in handy.

(?<!...) is “negative lookbehind”; it asserts that ... does not appear before the cursor.
```
1.9.3p125 :004 > 'foot root'.gsub(/(?<!r)oot/, 'eet')
 => "feet root" 
```
The “oot” in “foot” is fine, since there’s no “r” before it. The “oot” in “root” clearly has an “r”.

As an additional restriction, most regex engines require that the ... inside a lookbehind (positive or negative) has a fixed length. So you usually can’t use ?, +, *, or {n,m} inside it.
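
The two "never matches" cases mentioned above are easy to confirm. The sketch below also includes a hypothetical helper, compiles?, that reports whether the engine you are running accepts a given lookbehind. It is not part of Ruby, just an illustration.

```ruby
# Contradictory neighbours: the same character cannot be both "a" and "b".
p("ab" =~ /(?=a)b/)   # => nil
p("ab" =~ /a(?<=b)/)  # => nil

# Consistent neighbours behave like the plain pattern.
p("ab" =~ /(?=a)a/)   # => 0
p("ab" =~ /a(?<=a)/)  # => 0

# Hypothetical helper: does this pattern source compile on the current engine?
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

p compiles?("(?<=f)our")   # fixed-length lookbehind
p compiles?("(?<=f+)our")  # unbounded lookbehind; the result depends on the engine
```

The last two lines are left without expected values on purpose, since engines differ on how much they allow inside a lookbehind.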

You can also nest these and otherwise do all kinds of crazy things. I use them mainly for one-offs I know I’ll never have to maintain, so I don’t have any great examples of real-world applications handy; honestly, they’re weird enough that you should try to do what you want some other way first. 🙂

Afterthought: The syntax comes from Perl regular expressions, which used (? followed by various symbols for a lot of extended syntax, because a ? directly after an opening parenthesis was otherwise invalid. So <= doesn’t mean anything by itself; (?<= is one entire token, meaning “this is the start of a lookbehind”. It’s like how += and ++ are separate operators, even though they both start with +.

They’re easy to remember, though: = indicates looking forwards (or, really, “here”), < indicates looking backwards, and ! has its traditional meaning of “not”.

Regarding your later examples:

irb(main):002:0> "foresight".sub(/(?!s)ight/, 'ee')
=> "foresee"

irb(main):003:0> "foresight".sub(/ight/, 'ee')
=> "foresee"

Yes, these produce the same output. This is that tricky bit with using lookahead:

1. The regex engine has tried some things, but they haven’t worked, and now it’s at fores|ight.
2. It checks (?!s). Is the character after the cursor s? No, it’s i! So that part matches and the matching continues, but the cursor doesn’t move, and we still have fores|ight.
3. It checks ight. Does ight come after the cursor? Well, yes, it does, so move the cursor: foresight|.
4. We’re done!

The cursor moved over the substring ight, so that’s the full match, and that’s what gets replaced.
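
To visualize the "tried some things, but they haven’t worked" part, here is a rough simulation that tries the pattern at each start position in turn. The helper first_match_offset is hypothetical, and it is only a sketch: it slices the string before each attempt, so it is only valid for patterns that don’t look behind the cursor.

```ruby
# Hypothetical helper: retry the pattern at every start offset, printing the
# cursor position each time, and return the first offset where it matches.
# Slicing discards the text before the cursor, so do not use it with lookbehind.
def first_match_offset(pattern_source, text)
  anchored = Regexp.new("\\A(?:#{pattern_source})")
  (0..text.length).each do |offset|
    rest = text[offset..-1]
    matched = !(rest =~ anchored).nil?
    puts "#{text[0, offset]}|#{rest}  #{matched ? 'match' : 'fail'}"
    return offset if matched
  end
  nil
end

p first_match_offset("(?!s)ight", "foresight")  # => 5
p first_match_offset("ight", "foresight")       # => 5
p first_match_offset("(?=s)ight", "foresight")  # => nil
```

Both (?!s)ight and plain ight fail at the first five positions and succeed at fores|ight, which is why the two substitutions give the same result.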

Doing (?!a)b is useless, since you’re saying: the next character must not be a, and it must be b. But that’s the same as just matching b!

A negative lookahead directly in front of what you match can be useful sometimes, but only when the pattern after it is more general than a single literal: for example, (?!3)\d will match any digit that isn’t a 3.
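
A quick sketch of both cases, using sample strings chosen for illustration:

```ruby
# (?!3)\d : "a digit, as long as that digit is not 3"
p "1234353".scan(/(?!3)\d/)   # => ["1", "2", "4", "5"]

# The same thing written as a character class, for comparison.
p "1234353".scan(/[0-24-9]/)  # => ["1", "2", "4", "5"]

# In front of a single literal, the negative lookahead adds nothing.
p "abba".gsub(/(?!a)b/, "x")  # => "axxa"
p "abba".gsub(/b/, "x")       # => "axxa"
```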

If you meant to check what comes before ight, this is what you want:

1.9.3p125 :001 > "foresight".sub(/(?<!s)ight/, 'ee')
 => "foresight"

This asserts that s doesn’t come before ight. In foresight it does, so nothing matches and the string is returned unchanged.
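
With a few more words in the input, the difference between the two lookbehinds is easier to see. The word list here is made up for illustration:

```ruby
# Negative lookbehind: replace "ight" only where no "s" sits right before it.
p "foresight insight light night".gsub(/(?<!s)ight/, "ee")
# => "foresight insight lee nee"

# Positive lookbehind: replace "ight" only where an "s" does sit right before it.
p "foresight insight light night".gsub(/(?<=s)ight/, "ee")
# => "foresee insee light night"
```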
