I’m trying to build a parser that would be able to extract the data

Question

Editorial Team

Asked: May 23, 2026 (2026-05-23T07:21:25+00:00)

I’m trying to build a parser that would be able to extract the data using regex.

I want to be able to match

Here is what I have right now:

(\w+)\s+('|")([^\2\\]*(\\.[^\2\\]*)*)\2\s*;

The ([^\2\\]*(\\.[^\2\\]*)*) part was taken from http://ad.hominem.org/log/2005/05/quoted_strings.php

Unfortunately I have two problems with this pattern.

First of all, I would like to be able to capture strings that aren’t enclosed in single or double quotes.

print "hello world"; works, but print foobar; doesn’t. I haven’t been able to make the backreference \2 optional at the end.

Furthermore, I don’t know if it’s just the way I wrote the regex, but I can’t seem to parse multiple instances of this pattern.

If I try the regex with print 'hello'; print 'foobar';, it returns only the first part, print 'hello';.

Thanks in advance for your help.

Edit

Here is a snippet of what I’m trying to parse:

listen          80;
server_name     domain.com *.domain.com;
rewrite ^       http://www.domain.com$request_uri? permanent;

I am trying to capture every action along with its parameters. Basically, I want to be able to parse the NGINX configuration file: http://wiki.nginx.org/FullExample

1 Answer

Editorial Team · Answer 1 · 2026-05-23T07:21:25+00:00

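A backreference doesn’t work inside a character class such as [^\2]. A character class matches exactly one character, while a backreference can stand for a multi-character string, so engines typically read \2 in that position as an escape for a single literal character rather than as "whatever group 2 matched". You could work around that with a ((?!\2).)* construct, but it would be simpler to restructure the match pattern.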
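For illustration, here is a minimal sketch of the lookahead workaround in Python. The choice of Python, the `QUOTED` pattern name and the `parse_quoted` helper are assumptions made for this example, not part of the original question. The negative lookahead `(?!\2)` checks "the next character is not the opening quote" before each ordinary character is consumed, and `\\.` lets escaped characters through.

```python
import re

# Inside [...] Python treats \2 as the character with code 2, not as a
# backreference, so the "not the opening quote" test uses a lookahead instead.
QUOTED = re.compile(r"""(\w+)\s+(['"])((?:\\.|(?!\2)[^\\])*)\2\s*;""")


def parse_quoted(text):
    """Return (action, raw string body) pairs for quoted statements only."""
    return [(m.group(1), m.group(3)) for m in QUOTED.finditer(text)]


sample = r"""print "hello world"; print 'it\'s';"""
for action, body in parse_quoted(sample):
    print(action, body)
```

This still only handles quoted values; a bare word such as `print foobar;` is not matched by this pattern.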

The easiest approach here would be to list the three possible alternatives separately:

 /(\w+)\s+ (?: '([^']*)' |  "([^"]*)" | (\S+) ) \s*;/x

You would then have to pick the value out of capture group [2], [3] or [4] manually, depending on which alternative matched.
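As a rough sketch of how that could look in Python (the language, the `STATEMENT` name and the `parse_statements` helper are assumptions for this example), the same three alternatives can be written with `re.VERBOSE`, and `re.finditer` walks over every statement in the input rather than stopping at the first one. The value is whichever of groups 2, 3 or 4 actually took part in the match.

```python
import re

STATEMENT = re.compile(
    r"""
    (\w+) \s+            # group 1: action name
    (?: '([^']*)'        # group 2: single-quoted value
      | "([^"]*)"        # group 3: double-quoted value
      | (\S+)            # group 4: bare word
    )
    \s* ;
    """,
    re.VERBOSE,
)


def parse_statements(text):
    """Return (action, value) pairs for every statement found in text."""
    results = []
    for m in STATEMENT.finditer(text):
        # Exactly one of groups 2-4 is set; the others are None.
        value = next(g for g in m.group(2, 3, 4) if g is not None)
        results.append((m.group(1), value))
    return results


for action, value in parse_statements("print 'hello'; print \"hello world\"; print foobar;"):
    print(action, value)
```

This sketch covers single-value statements like the print examples and `listen 80;`. Lines with several parameters, such as the server_name line, would need a pattern that accepts more than one value.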
