Here is sample text(simplified from original): <start1> <name=4654> bla bla bla bla <tags=bla model=c>

Question


Asked: May 22, 2026


Here is sample text (simplified from the original):

<start1>
<name="4654">
bla bla bla bla
<tags="bla" model="c">
bla bla bla bla
<start2>
<name="12346">
bla bla bla bla
<tags="bla" model="d">
bla bla bla bla
<start3>
<name="73535">
bla bla bla bla
<tags="bla" model="c">
<start4>
<name="546875">
bla bla bla bla
<tags="bla" model="c">
bla bla bla bla

Here is my regex (the "dot matches newline" option is on):

name="([\d]+)".+?(?<!start)tags="([^"]+?)" model="c"

As you can see, there are 4 blocks, but I only need to match those with model="c". However, .+? is capturing more than it needs to: it runs past the end of a block and into the next one. Putting in a negative lookbehind to suppress that did not work. Any idea how I can exclude a block?

Update (to clarify what I want to achieve):

Out of the sample data, I want to match the following 3 blocks:

First match

<name="4654">
bla bla bla bla
<tags="bla" model="c">

Second match

<name="73535">
bla bla bla bla
<tags="bla" model="c">

Third match

<name="546875">
bla bla bla bla
<tags="bla" model="c">


1 Answer

Editorial Team · Answer 1 · 2026-05-22T18:34:36+00:00

~~Is it always in this format of (start, name, tags), (start, name, tags), and so on? If so, you can even do without the lookaround.~~

/<name="(\d+)"[^<]+?<tags="([^"]+?)" model="c">/s

That works because you know the next < you encounter will be for the immediately following tags label. Can we guarantee that’s the case, or do we need to be more general to allow for other labels in the mix?
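As a rough illustration of that caveat, here is a minimal Python sketch. It assumes Python's re module rather than whatever engine the question uses, and the `<extra="x">` label and the `<start9>` block are made up purely to show what happens when another label sits between name and tags.

```python
import re

# [^<] already matches newlines, so no DOTALL flag is needed for this pattern.
simple = re.compile(r'<name="(\d+)"[^<]+?<tags="([^"]+?)" model="c">')

# A block in the expected (start, name, tags) layout.
adjacent = '<start1>\n<name="4654">\nbla bla bla bla\n<tags="bla" model="c">\n'
# Same layout, but with the model we want to skip.
wrong_model = '<start2>\n<name="12346">\nbla bla bla bla\n<tags="bla" model="d">\n'
# Hypothetical block with an extra label between name and tags.
with_extra = '<start9>\n<name="999">\nbla\n<extra="x">\n<tags="bla" model="c">\n'

simple_hits = [m.groups() for m in simple.finditer(adjacent + wrong_model + with_extra)]
print(simple_hits)  # [('4654', 'bla')]
```

The model="d" block is skipped as intended, but the block with the extra label is also missed even though its model is "c", because [^<]+? cannot step over the < of the intervening label.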

Also, do you need to capture the text after <tags> and before the next <start>? If so, you could add a little extra to the end for that.

/<name="(\d+)"[^<]+?<tags="([^"]+?)" model="c">[^<]*(?!<start)/s

Okay, according to your comments, that’s not the case. Scratch that, then.

Update

Okay, how ’bout this then?

/<name="(\d+)"(?:(?!<start).)+<tags="([^"]+?)" model="c">/s

This actually uses a lookahead, not a lookbehind. A simple lookahead/lookbehind will only assert that a string occurs before or after a block of text, not within. By checking at every character with ((?!str).)+, you effectively ensure that “str” is not contained throughout the text.

It might look confusing that I’m using a lookahead to check for <start. The < in (?!<start) is the literal < that opens the tag, not part of the lookbehind syntax; a negative lookbehind for start would be written (?<!start) instead.
Think (?!(<start)) versus (?<!(start)).

I added (?: ) just so it wouldn’t capture.
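To compare the two patterns on the sample data from the question, here is a minimal Python sketch. It assumes Python's re module, with re.DOTALL standing in for the /s flag; the variable names are only for illustration.

```python
import re

SAMPLE = """<start1>
<name="4654">
bla bla bla bla
<tags="bla" model="c">
bla bla bla bla
<start2>
<name="12346">
bla bla bla bla
<tags="bla" model="d">
bla bla bla bla
<start3>
<name="73535">
bla bla bla bla
<tags="bla" model="c">
<start4>
<name="546875">
bla bla bla bla
<tags="bla" model="c">
bla bla bla bla
"""

# Pattern from the question: the lazy .+? is free to cross a <start> boundary.
original = re.compile(
    r'name="([\d]+)".+?(?<!start)tags="([^"]+?)" model="c"', re.DOTALL
)

# Pattern from the answer: every character consumed between name and tags
# is first checked to make sure it does not begin "<start".
tempered = re.compile(
    r'<name="(\d+)"(?:(?!<start).)+<tags="([^"]+?)" model="c">', re.DOTALL
)

original_names = [m.group(1) for m in original.finditer(SAMPLE)]
tempered_names = [m.group(1) for m in tempered.finditer(SAMPLE)]

print(original_names)  # ['4654', '12346', '546875']
print(tempered_names)  # ['4654', '73535', '546875']
```

With the original pattern, the match that starts at name="12346" keeps going until it finds the tags line of block 3, so block 2 is wrongly reported and block 3 is swallowed. The per-character lookahead stops the scan at <start3>, so block 2 simply fails to match and block 3 is found on its own.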
