I want to write a regular expression which matches anything between
()
(())
(()())
((()))
()()()
etc.
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
All these answers claiming you can’t use patterns to match a string with balanced nested parens are quite wrong. It’s not practical to pretend that the patterns matched by modern programming languages are restricted to “regular languages” in the pathological textbook sense. As soon as you permit backreferences, they’re not. This allows real-world patterns to match much more than the textbook versions, making them far more practical.
The simplest pattern for matching balanced parens is
\((?:[^()]*+|(?0))*\). But you should never write that, because it is too compact to be easily read. You should always write it with/xmode to allow for whitespace and comments. So write it like this:There’s also a lot to be said for naming your abstractions, and decoupling their definition and its ordering from their execution. That leads to this sort of thing:
The second form is now amenable to inclusion in larger patterns.
Don’t ever let anybody tell you can’t use a pattern to match something that’s recursively defined. As I’ve just demonstrated, you most certainly can.
While you’re at it, make sure never to write line-noise patterns. You don’t have to, and you shouldn’t. No programming language can be maintainable that forbids white space, comments, subroutines, or alphanumeric identifiers. So use all those things in your patterns.
Of course, it does help to pick the right language for this kind of work. ☺