I’m alright with basic regular expressions, but I get a bit lost around positive/negative lookaheads and lookbehinds.
I’m trying to pull the id # from this:
[keyword stuff=otherstuff id=123 morestuff=stuff]
There could be any number of other “stuff” attributes before or after the id.
I’ve been using The Regex Coach to help debug what I’ve tried, but I’m not moving forward anymore…
So far I have this:
\[keyword (?:id=([0-9]+))?[^\]]*\]
Which takes care of any extra attributes after the id, but I can’t figure out how to ignore everything between keyword and id.
I know I can’t use [^id]*, since that excludes the individual characters i and d rather than the string id.
I believe I need to use a negative lookahead like this (?!id)* but I guess since it’s zero-width, it doesn’t move forward from there.
This doesn’t work either:
\[keyword[A-z0-9 =]*(?!id)(?:id=([0-9]+))?[^\]]*\]
I’ve been looking all over for examples, but haven’t found any. Or perhaps I have, but they went so far over my head I didn’t even realize what they were.
Help!
Thanks.
EDIT:
It has to match [keyword stuff=otherstuff] as well, where id= doesn’t exist at all, so the id # group has to be optional (zero or one occurrence). There are also other tags like [otherkeywords id=32] which I do not want to match. The pattern needs to match multiple [keyword id=3] tags throughout the document using preg_match_all.
No lookahead/behind required:
I added the ending ‘[^]]*]’ to check for a real tag end; it could be unnecessary.
Edit: added the \b before id, as otherwise it could match the id= inside guid= in a tag like this:
[keyword you-dont-want-this-guid=123123-132123-123 id=123]
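For illustration, here is a minimal sketch of a pattern that fits the description above: an optional, lazily reached id group, a \b in front of id, and a trailing [^\]]*\] for the tag end. The exact pattern and the helper name extract_ids are assumptions made for this example, and it uses Python's re module rather than preg_match_all, although the pattern avoids anything Python-specific so it should carry over to PCRE.

```python
import re

# Assumed pattern, reconstructed from the description in the answer:
#   \[keyword\b               literal "[keyword" as a whole word (rejects "[otherkeywords")
#   (?:[^\]]*?\bid=([0-9]+))? optionally skip attributes lazily up to a standalone "id="
#   [^\]]*\]                  consume the rest of the tag up to the closing "]"
TAG_PATTERN = re.compile(r"\[keyword\b(?:[^\]]*?\bid=([0-9]+))?[^\]]*\]")


def extract_ids(text):
    """Return one entry per [keyword ...] tag: the id as a string, or None if the tag has no id."""
    return [match.group(1) for match in TAG_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = (
        "a [keyword stuff=otherstuff id=123 morestuff=stuff] "
        "b [keyword stuff=otherstuff] "
        "c [otherkeywords id=32] "
        "d [keyword you-dont-want-this-guid=123123-132123-123 id=456]"
    )
    print(extract_ids(sample))
```

Because the id group is optional, tags without an id still match and simply report no captured value. The \b is what keeps the id= inside guid= from being picked up.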