I have the following code: /* record 863.content.en */ UPDATE language_def SET en='<html>blah blah

Question

Asked: June 14, 2026 (2026-06-14T17:38:33+00:00)

I have the following code:

/* record 863.content.en */
UPDATE language_def
SET en='<html>blah blah markup</html>'
WHERE page_id=863
AND string_id='content';
/* record_end 863.content.en */

I would like to create a regular expression that matches this statement, where:

the parts of 863.content.en separated by the periods vary, but are known for each match (there will be many of these statements in a row)
the data between the two comments is variable and not known in advance

This is what I have so far:

'[/*]\s*record\s*specific_number[.]specific_string1[.]specific_string2\s*[*/].*[/*]\s*record_end\s*specific_number[.]specific_string1[.]specific_string2\s*[*/]'

1 Answer

Editorial Team · Answer 1 · 2026-06-14T17:38:34+00:00

There are a few problems with your regex.

First of all, as FrankeTheKneeMan pointed out, you need delimiters. # is a good choice for HTML matches (the standard choice is / but that interferes with tags too often):

'#[/*]\s*record\s*specific_number[.]specific_string1[.]specific_string2\s*[*/].*[/*]\s*record_end\s*specific_number[.]specific_string1[.]specific_string2\s*[*/]#'

Now, while [.] is a nice way of escaping a single character, it doesn’t work the same way for [/*]. That is a character class that matches a single character, either / or *, so it does not require the two-character sequence /*. The same goes for [*/]. Use this instead:

'#/[*]\s*record\s*specific_number[.]specific_string1[.]specific_string2\s*[*]/.*/[*]\s*record_end\s*specific_number[.]specific_string1[.]specific_string2\s*[*]/#'
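To make the difference concrete, here is a minimal sketch in Python. It assumes Python's `re` module, which takes the pattern without the `#` delimiters used above; the variable names are only illustrative. It shows that the character-class version still finds something, but starts the match at the `*` instead of at the `/*` that opens the comment.

```python
import re

comment = "/* record 863.content.en */"

# [/*] is a character class: it consumes exactly ONE character, "/" or "*".
class_version = re.search(r"[/*]\s*record", comment)

# /[*] is a literal "/" followed by a literal "*" (the class only escapes the star).
literal_version = re.search(r"/[*]\s*record", comment)

# The class version cannot start at "/" (the next character is "*", not
# whitespace or "record"), so the engine moves on and matches from the "*".
print(repr(class_version.group(0)))
print(repr(literal_version.group(0)))
```

The first match leaves out the leading `/`, which is why the original pattern only appeared to cover the comment markers.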

Now .* is the remaining problem. Actually there are two problems with it: one is critical, the other might not be. The first is that . does not match line breaks by default. You can change this by using the s (singleline) modifier. The second is that * is greedy. Should a section appear twice in the string, you would get everything from the first corresponding /* record to the last corresponding /* record_end, even if there is unrelated stuff in between. Since your records seem to be very specific, I suppose this is not the case. Still, it is generally good practice to make the quantifier ungreedy, so that it consumes as little as possible. Here is your final regex string:

'#/[*]\s*record\s*specific_number[.]specific_string1[.]specific_string2\s*[*]/.*?/[*]\s*record_end\s*specific_number[.]specific_string1[.]specific_string2\s*[*]/#s'

For the example you presented, this is:

'#/[*]\s*record\s*863[.]content[.]en\s*[*]/.*?/[*]\s*record_end\s*863[.]content[.]en\s*[*]/#s'
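The effect of the two fixes can be checked with a short sketch. It is written in Python as an assumption for illustration: there the pattern is passed without the `#` delimiters, and `re.DOTALL` plays the role of the `s` modifier. The sample text follows the question's statement, and the repeated section and all variable names are made up for the demonstration.

```python
import re

# One section, modelled on the statement in the question.
sample = (
    "/* record 863.content.en */\n"
    "UPDATE language_def\n"
    "SET en='<html>blah blah markup</html>'\n"
    "WHERE page_id=863\n"
    "AND string_id='content';\n"
    "/* record_end 863.content.en */"
)
# Hypothetical input in which the same section appears twice.
doubled = sample + "\n-- unrelated statement\n" + sample

OPEN = r"/[*]\s*record\s*863[.]content[.]en\s*[*]/"
CLOSE = r"/[*]\s*record_end\s*863[.]content[.]en\s*[*]/"

no_dotall = re.compile(OPEN + r".*?" + CLOSE)          # "." stops at line breaks
lazy = re.compile(OPEN + r".*?" + CLOSE, re.DOTALL)    # the final pattern
greedy = re.compile(OPEN + r".*" + CLOSE, re.DOTALL)   # greedy variant

# Without DOTALL the multi-line body cannot be crossed, so nothing matches.
print(no_dotall.search(sample))

# Lazy stops at the first record_end; greedy runs on to the last one.
lazy_match = lazy.search(doubled).group(0)
greedy_match = greedy.search(doubled).group(0)
print(lazy_match == sample)
print(greedy_match == doubled)
print(len(lazy.findall(doubled)))
```

With the lazy quantifier each section is returned separately, whereas the greedy variant swallows the unrelated statement between the two sections.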

If you want to find all of these sections, then you can make 863, content and en variable, capture them (using parentheses) and use a backreference to make sure you get the corresponding record_end:

'#/[*]\s*record\s*(\d+)[.](\w+)[.](\w+)\s*[*]/.*?/[*]\s*record_end\s*\1[.]\2[.]\3\s*[*]/#s'
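As a rough illustration of the backreference version, the sketch below runs it over a small dump with two sections. It assumes Python's `re` module (no `#` delimiters, `re.DOTALL` instead of the `s` modifier); the second record, the mismatched example, and the variable names are invented for the demonstration.

```python
import re

# Hypothetical dump with two sections; only the first mirrors the question.
dump = (
    "/* record 863.content.en */\n"
    "UPDATE language_def SET en='<html>first</html>'\n"
    "WHERE page_id=863 AND string_id='content';\n"
    "/* record_end 863.content.en */\n"
    "/* record 864.title.de */\n"
    "UPDATE language_def SET de='Titel'\n"
    "WHERE page_id=864 AND string_id='title';\n"
    "/* record_end 864.title.de */\n"
)

pattern = re.compile(
    r"/[*]\s*record\s*(\d+)[.](\w+)[.](\w+)\s*[*]/"     # capture id, string, language
    r".*?"                                               # lazy body, may span lines
    r"/[*]\s*record_end\s*\1[.]\2[.]\3\s*[*]/",          # same three values again
    re.DOTALL,
)

# Each match yields the three captured parts of the record key.
keys = [m.groups() for m in pattern.finditer(dump)]
print(keys)

# A record_end with a different key does not close the section.
mismatched = "/* record 1.a.en */ x /* record_end 2.a.en */"
print(pattern.search(mismatched))
```

The backreferences `\1`, `\2` and `\3` are what tie each `record_end` comment to the `record` comment that opened it.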
