I’m matching URLs against a regular expression, testing if they reflect a “shutdown” command.
Here’s a URL that performs a shutdown:
/exec?debug=true&command=shutdown&f=0
Here’s another legitimate, if confusing, URL that also performs a shutdown:
/exec?commando=yes&zcommand=34&command=shutdown&p
Now, I must ensure there’s only one command=… parameter and it is command=shutdown. Alternatively, I can live with ensuring the first command=… parameter is command=shutdown.
Here’s my test for the requested regular expression:
/exec?version=0.4&command=shutdown&out=JSON&zcommand=1
Should match
/exec?version=0.4&command=startup&out=JSON&zcommand=1&commando=shutdown
Should fail to match
/exec?command=shutdown&out=JSON
Should match
/exec?version=0.4&command=admin&out=JSON&zcommand=1&command=shutdown
Should fail to match
Here’s my baseline – a regular expression that passes the above tests – all but the last one:
^/exec?(.*\&)*command=shutdown(\&.*)*$
The problem is with the occurrence of more than one command=…, where the first one is not shutdown.
I tried using lookbehind:
^/exec?(.*\&)*(?<!(\&|\?)command=.*)command=shutdown(\&.*)*$
But I’m getting:
Look-behind group does not have an obvious maximum length near index 31
I even tried atomic grouping, to no avail. I can’t get the regex to reject the following URL:
/exec?version=0.4&command=admin&out=JSON&zcommand=1&command=shutdown
Can anyone help with a regular expression that passes all the tests?
Clarifications
I see I owe you some context.
My task is to configure a Filter that guards the entrance of all our system’s servlets, and verifies there’s an open HTTP session (in other words: that a successful Login has occurred). The filter also allows configuring which URLs do not require login.
Some exceptions are easy: /login does not need login. Calls to localhost do not need login.
But sometimes it gets complicated. Like the shutdown command that cannot require login while other commands can and should (the strange reason for that is out of the scope of my question).
Since it’s a security matter, I can’t allow users to merely append &command=shutdown to a URL and bypass the filter.
So I really need a regular expression, or otherwise I’ll need to redefine the configuration specs.
This tested (and fully commented) regex solution meets all your requirements:
The above regex matches any RFC 3986 valid URI having any scheme, authority, path, query or fragment components, but it must have one (and only one) query "command" variable whose value must be exactly, but case insensitively: "shutdown".
A carefully crafted complex regex is perfectly fine (and maintainable) to use when written with proper indentation and commented steps (like shown above). (For more information on using regex to validate a URI, see my article: Regular Expression URI Validation)
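For the narrower case in the question (paths of the form /exec?...), the "one and only one command parameter" rule can also be sketched with negative lookaheads, which sidesteps the unbounded look-behind that Java rejects. This is only a minimal sketch, not the RFC 3986 pattern described above: the class name ShutdownUrlCheck and the method isShutdownOnly are hypothetical, the match is case sensitive, and it assumes the filter sees the raw, undecoded request URI, so percent-encoded parameter names are not handled.

```java
import java.util.regex.Pattern;

public class ShutdownUrlCheck {

    // Every parameter before and after "command=shutdown" is guarded by
    // (?!command=), so no other parameter named exactly "command" can appear.
    // Names such as "zcommand" or "commando" pass, because they do not start
    // with the literal text "command=".
    private static final Pattern SHUTDOWN_ONLY = Pattern.compile(
        "^/exec\\?"                      // literal path and query separator
        + "(?:(?!command=)[^&#]*&)*"     // leading parameters, none named "command"
        + "command=shutdown"             // the single allowed command parameter
        + "(?:&(?!command=)[^&#]*)*$");  // trailing parameters, none named "command"

    // Hypothetical helper: true only if the URL carries exactly one
    // command parameter and its value is "shutdown".
    public static boolean isShutdownOnly(String url) {
        return SHUTDOWN_ONLY.matcher(url).matches();
    }

    public static void main(String[] args) {
        // The four test URLs from the question.
        String[] urls = {
            "/exec?version=0.4&command=shutdown&out=JSON&zcommand=1",
            "/exec?version=0.4&command=startup&out=JSON&zcommand=1&commando=shutdown",
            "/exec?command=shutdown&out=JSON",
            "/exec?version=0.4&command=admin&out=JSON&zcommand=1&command=shutdown"
        };
        for (String url : urls) {
            System.out.println(isShutdownOnly(url) + "  " + url);
        }
    }
}
```

Because each parameter is consumed by `[^&#]*` up to the next `&`, the pattern has no overlapping alternatives and should not backtrack badly on long query strings. If the servlet container decodes percent-escapes before dispatching, the same decoding would need to happen before this check for it to be a safe guard.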