I am using the XML regex pattern to match my proxy URL. eg: Proxy

Editorial Team

Asked: June 6, 2026 (2026-06-06T08:47:16+00:00)

I am using the XML regex pattern to match my proxy URL.

e.g. Proxy: ab-proxy-sample.company.com:8080

My requirement :

Should not begin with http:// OR https:// (Match the whole word)
Should accept any string + a port
Should accept even strings starting with ht

My current XML regex is : [^http://|https://].+:[0-9]+|

But it's matching each letter instead of the whole word.

Any help would be highly appreciated.
Thanks in advance !


1 Answer

Editorial Team · Answer 1 · 2026-06-06T08:47:18+00:00

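As @arnep points out, you’re attempting to use a negated character class with alternation, which isn’t the way it works: everything between [ and ] is a set of single characters, so [^http://|https://] matches any one character other than h, t, p, s, :, / or |. The construct meant for “not followed by this word” is a negative lookahead, so it’s worth reading up on lookaheads.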

I’m sure someone else will post an answer you can copy and paste, but this is a useful opportunity to learn the basics of regex!
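To make the character-class point concrete, here is a small Python sketch. Python is used only for illustration and is not the asker's XML engine. The sketch assumes the trailing | in the original pattern is unintended and drops it, and the names first_char_allowed and is_proxy are made up for this example. It also shows how the requirement might be written on an engine that does support negative lookahead.

```python
import re

# The asker's pattern, with the trailing empty alternative ("|") dropped.
ORIGINAL = re.compile(r"[^http://|https://].+:[0-9]+")

# Inside [...] the text is a set of single characters, not two words, so the
# class above is equivalent to "any one character except h t p s : / |".
EQUIVALENT_CLASS = re.compile(r"[^htps:/|]")


def first_char_allowed(s: str) -> bool:
    """Return True if the negated class accepts the first character of s."""
    return EQUIVALENT_CLASS.match(s) is not None


# On an engine with negative lookahead, the intent can be written directly:
# "not followed by http:// or https://, then non-spaces, a colon, digits".
WITH_LOOKAHEAD = re.compile(r"(?!https?://)\S+:\d+")


def is_proxy(s: str) -> bool:
    """Whole-string match, mimicking an implicitly anchored pattern."""
    return WITH_LOOKAHEAD.fullmatch(s) is not None


if __name__ == "__main__":
    for s in ("ab-proxy-sample.company.com:8080",
              "htab-proxy-sample.company.com:8080",
              "http://ab-proxy-sample.company.com:8080"):
        print(s, first_char_allowed(s), is_proxy(s))
```

Note that the original pattern rejects htab-proxy-sample.company.com:8080 simply because its first character is an h, which contradicts the third requirement in the question.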

UPDATE:

I didn’t realize that you were using an engine that doesn’t support negative lookarounds. Without negative lookarounds, it’s nearly impossible to achieve what you’re trying to do.

Nearly 😉

Here is a “brute-force” combinatoric method of doing it:

```
(?:[^h]|h(?:[^t]|t(?:[^t]|t(?:[^p]|p(?:[^s:]|s(?:[^:]|:(?:[^\/]|\/(?:[^\/])))|:(?:[^\/]|\/(?:[^\/])))))))\S+:\d+
```

If the XML engine doesn’t support non-capturing groups, i.e. (?: ... ), then use regular groups instead:
```
([^h]|h([^t]|t([^t]|t([^p]|p([^s:]|s([^:]|:([^\/]|\/([^\/])))|:([^\/]|\/([^\/])))))))\S+:\d+
```
If the XML engine doesn’t support character classes like \S and \d, then use [^ \t\r\n] and [0-9] instead.

Here is a running example: http://rubular.com/r/JnpCVgeLmL. Try changing the test string. You’ll see that…

    ab-proxy-sample.company.com:8080          # matches
    htab-proxy-sample.company.com:8080        # matches
    http://ab-proxy-sample.company.com:8080   # doesn't
    https://ab-proxy-sample.company.com:8080  # doesn't
    httpd://ab-proxy-sample.company.com:8080  # matches

Note that you do not need ^ and $ in the XML pattern. I added them only for the Rubular demo; the XML engine apparently treats a pattern as implicitly anchored at both ends, so it must match the whole value.
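For readers without an XML engine at hand, the five test strings can be checked with a short Python sketch. It assumes that Python's re module is an acceptable stand-in for this pattern and uses fullmatch to imitate the implicit anchoring; the helper names matches_whole and matches_anywhere are hypothetical.

```python
import re

# The non-capturing variant from the answer, copied verbatim.
BRUTE_FORCE = re.compile(
    r"(?:[^h]|h(?:[^t]|t(?:[^t]|t(?:[^p]|p(?:[^s:]|s(?:[^:]|:(?:[^\/]|\/(?:[^\/])))"
    r"|:(?:[^\/]|\/(?:[^\/])))))))\S+:\d+"
)

# Expected results, taken from the list of examples in the answer.
SAMPLES = {
    "ab-proxy-sample.company.com:8080": True,
    "htab-proxy-sample.company.com:8080": True,
    "http://ab-proxy-sample.company.com:8080": False,
    "https://ab-proxy-sample.company.com:8080": False,
    "httpd://ab-proxy-sample.company.com:8080": True,
}


def matches_whole(s: str) -> bool:
    """Anchored match: the pattern must cover the entire string."""
    return BRUTE_FORCE.fullmatch(s) is not None


def matches_anywhere(s: str) -> bool:
    """Unanchored search: the pattern may start at any position."""
    return BRUTE_FORCE.search(s) is not None


if __name__ == "__main__":
    for text, expected in SAMPLES.items():
        print(f"{text:45} whole={matches_whole(text)} expected={expected}")
```

The anchoring matters: an unanchored search still finds a match inside http://ab-proxy-sample.company.com:8080, because the engine can skip the leading h and start at the first t.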

How does this work? It’s easier to understand if we break it up like this:

    ([^h] | h
    ([^t] | t
    ([^t] | t
    ([^p] | p
    ([^s:]| s ([^:]|:([^\/]|\/([^\/])))
          | :        ([^\/]|\/([^\/])))
    ))))
    \S+:\d+

The explanation:

If the first char isn’t an “h”, then great! (The string can’t possibly be “http://” or “https://”.)
If the first char is an “h” though, then:
1. If the second char isn’t a “t”, then great! (The string can’t possibly be “http://” or “https://”.)
2. If the second char is a “t” though, then:
  1. … isn’t “t”, great!
  2. … is “t”, then:
    1. … isn’t “p”, great!
    2. … is “p”, then:

Here, it gets tricky: now we encounter three branches.

If the fifth char is neither an “s” nor a “:”, then great!
If the fifth char is an “s” though, then:
1. If the sixth char isn’t a “:”, then great!
2. If the sixth char is a “:” though, then:
  1. If the seventh char isn’t a “/”, then great!
  2. If the seventh char is a “/” though, then:
    1. If the eighth char isn’t a “/”, then great!
    2. Otherwise, fail! We found an “https://”.
If the fifth char is a “:” though, then:
1. If the sixth char isn’t a “/”, then great!
2. If the sixth char is a “/” though, then:
  1. If the seventh char isn’t a “/”, then great!
  2. Otherwise, fail! We found an “http://”.

And finally, if we’ve gotten this far, then we look for a string of non-whitespace characters, followed by a colon, followed by a string of digits.
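The walk-through above can also be sketched as ordinary code, which may make the branching easier to follow. This is only an illustration in Python, not something the XML engine runs; consume_prefix and is_proxy are hypothetical names, and the function returns how many characters the nested group would consume before the \S+:\d+ tail takes over.

```python
import re

# Tail of the pattern: a run of non-whitespace, a colon, then digits.
TAIL = re.compile(r"\S+:\d+")


def consume_prefix(s: str):
    """Return how many characters the nested group consumes, or None if it fails."""
    # Characters 1 to 4: succeed as soon as the text stops spelling "http".
    for i, expected in enumerate("http"):
        if i >= len(s):
            return None          # a negated class still needs one character
        if s[i] != expected:
            return i + 1         # "great": cannot be http:// or https://
    # Fifth character: three branches.
    if len(s) <= 4:
        return None
    if s[4] not in "s:":
        return 5
    colon = 4
    if s[4] == "s":
        # Sixth character must be ":" for "https://" to remain possible.
        if len(s) <= 5:
            return None
        if s[5] != ":":
            return 6
        colon = 5
    # Two characters after the colon: any non-slash means success.
    for i in (colon + 1, colon + 2):
        if i >= len(s):
            return None
        if s[i] != "/":
            return i + 1
    return None                  # found "http://" or "https://"


def is_proxy(s: str) -> bool:
    """Prefix check followed by a whole-string match of the tail."""
    n = consume_prefix(s)
    return n is not None and TAIL.fullmatch(s[n:]) is not None
```

Because every alternative in the nested group starts with a different character test, at most one branch can apply at each step, which is why a plain sequence of if statements reproduces it.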

I leave it to a smarter mathematician than myself to ponder whether all strings matchable using lookarounds can be “brute-forced” in such a way.
