I need to match everything between '[~' and '~]' tags.

Editorial Team

Asked: May 30, 2026 (2026-05-30T12:34:11+00:00)

I need to match everything between '[~' and '~]' tags.

I tried writing a lot of regex patterns but couldn’t find a correct one:

#\[~(.*)~]# – this returns everything between the first occurrence of [~ and the last occurrence of ~].
#\[~([^~]*)~]# – this works fine as long as there is no ~ symbol inside the tags.

I understand that (.*) captures everything and that ([^~]*) captures everything until it finds a ~ character, but I can’t make it capture everything until it finds the ~] pair (any byte except the ~] pair is possible inside the tags, including a single ~ character). In other words, I don’t know how to negate a pair of characters.

This is a possible example:

Simple [example~]: [~here I can face both, '~' and ']' characters~] or another
example [~~~~~~[ABC]~~~~~~].

After running preg_match_all() with the regex, I expect a resulting array like this:

array(2) {
  [0]=>
  string(44) "here I can face both, '~' and ']' characters"
  [1]=>
  string(15) "~~~~~[ABC]~~~~~"
}

Note: Input string may contain binary data (00-FF).

Just to mention (for certain people here), I’ve already checked out all related Q/A + hundreds of Google search results.

1 Answer

Editorial Team · Answer 1 · 2026-05-30T12:34:13+00:00

The * quantifier is greedy, so it takes as much as it can. You can make it non-greedy by adding a ? after it, which should solve your issue.

#\[~(.*?)~]#
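
As a minimal sketch, this is the pattern above applied to the sample input from the question with preg_match_all(). The ASCII quotes in the input are an assumption, chosen because they fit the string(44) length the question expects.

```php
<?php
// Sample input from the question (ASCII quotes assumed).
$input = "Simple [example~]: [~here I can face both, '~' and ']' characters~] or another\n"
       . "example [~~~~~~[ABC]~~~~~~].";

// Lazy quantifier: each match stops at the first "~]" that follows a "[~".
preg_match_all('#\[~(.*?)~]#', $input, $matches);

// $matches[1] holds the text captured between the tags.
var_dump($matches[1]);
// array(2) {
//   [0]=> string(44) "here I can face both, '~' and ']' characters"
//   [1]=> string(15) "~~~~~[ABC]~~~~~"
// }
```

The leading "[example~]" is skipped because it never opens with "[~".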

The following website has a good description and explains it in more detail: Repetition with Star and Plus.

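preg_match deals with binary strings pretty well: in the standard (non-UTF-8) mode, which is what you are using, the . matches a single byte. One caveat is that . does not match a newline byte by default, so for binary data add the s modifier: #\[~(.*?)~]#s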
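
A short sketch of the binary case follows. The payload is made up for illustration, and it deliberately contains a newline byte (0x0A): by default . will not match that byte, so the s (PCRE_DOTALL) modifier is used here.

```php
<?php
// Hypothetical binary payload: NUL, newline, 0xFF, then a stray "~" and "]".
$payload = "\x00\x0A\xFF~x]";
$input   = "junk[~" . $payload . "~]more";

// Without "s", "." stops at the newline byte, so the tag pair is never matched.
$withoutS = preg_match_all('#\[~(.*?)~]#', $input, $m1);

// With "s" (PCRE_DOTALL), "." matches every byte, including 0x0A.
$withS = preg_match_all('#\[~(.*?)~]#s', $input, $m2);

var_dump($withoutS);           // int(0)
var_dump($withS);              // int(1)
var_dump(bin2hex($m2[1][0]));  // string(12) "000aff7e785d"
```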

Simplified example for explanation:

 aab ::  a*  -> aa

The engine can match an empty string, then a, then aa; the next character, b, does not match, so it stops there and returns aa. As you can see, the engine internally had three valid candidates: the empty string, a and aa. In greedy mode the last (longest) one wins.

 aab ::  a*? -> (empty string)

The engine starts at the first position and needs zero or more a, non-greedy. Zero a is already a valid match, so it matches an empty string and returns immediately. In non-greedy mode the first (shortest) candidate wins.
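
The two simplified cases above can be checked directly in PHP. This is only a small sketch; the variable names are arbitrary.

```php
<?php
// Greedy: a* takes as many "a" characters as it can.
preg_match('/a*/', 'aab', $greedy);

// Lazy: a*? takes as few as it can, which here is none at all.
preg_match('/a*?/', 'aab', $lazy);

var_dump($greedy[0]); // string(2) "aa"
var_dump($lazy[0]);   // string(0) ""
```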

For UTF-8 strings, use the u modifier (PCRE_UTF8): #.*#u – . then matches any UTF-8 character (which can be one or more bytes).
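
To illustrate the difference, here is a small sketch that matches a single . against a two-byte UTF-8 character, with and without the u modifier. The test string is an arbitrary choice.

```php
<?php
// "é" encoded as UTF-8 is two bytes: 0xC3 0xA9.
$text = "\xC3\xA9";

preg_match('#.#', $text, $bytewise);  // default mode: . matches one byte
preg_match('#.#u', $text, $utf8);     // u modifier: . matches one UTF-8 character

var_dump(strlen($bytewise[0])); // int(1)
var_dump(strlen($utf8[0]));     // int(2)
```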
