I have the following regex: (?i:^TPI$|^TIP$|^IPT$|^ITP$|^PIT$|^PTI$|^IP$|^PI$|^TI$|^IT$|^PT$|^TP$|^T$|^P$|^I$) How can I simplify it?

Question

Asked: May 25, 2026 (2026-05-25T11:28:22+00:00)

I have the following regex:

(?i:^TPI$|^TIP$|^IPT$|^ITP$|^PIT$|^PTI$|^IP$|^PI$|^TI$|^IT$|^PT$|^TP$|^T$|^P$|^I$)

How can I simplify it? My regular expression knowledge is rather limited.

My requirements are:

Acceptable inputs are “T”, “P”, and “I”
Values may come in any order
Only one of each value is accepted. “TTI” is invalid, but “TI” is valid
Case insensitive

I used

^(?i:[TPI]){1,3}$

in the past, and that mostly works. The only problem is that it accepts repeated values (“TTT” is accepted by that regex; I need that to fail).

1 Answer

Editorial Team · Answer 1 · 2026-05-25T11:28:23+00:00

We can approach this differently. Your earlier attempt lets through strings you don’t want, namely everything with repeated letters. Below I experiment a bit with PowerShell to build up a solution. First we need all the strings we can expect as input:

$tests = 'TPI'[0..2]|%{$a=$_;"$a"; 'TPI'[0..2]|%{$b=$_;"$a$b"; 'TPI'[0..2]|%{"$a$b$_"}}} | sort

This yields the following sequence of values (I format them on a single line, but they come out one per line usually):

$tests
I II III IIP IIT IP IPI IPP IPT IT ITI ITP ITT P PI PII PIP PIT PP PPI PPP PPT PT PTI PTP PTT T TI TII TIP TIT TP TPI TPP TPT TT TTI TTP TTT

This is of course also what the regular expression

^(?i:[TPI]){1,3}$

will match.
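For readers following along without PowerShell, the same candidate list can be sketched in Python. This is an illustrative equivalent rather than part of the experiment above; the names candidates and LOOSE are made up here.

```python
import re
from itertools import product

# All strings of length 1 to 3 over the letters T, P, I (3 + 9 + 27 = 39).
candidates = sorted(
    "".join(letters)
    for length in (1, 2, 3)
    for letters in product("TPI", repeat=length)
)

# The asker's earlier pattern: one to three letters from the set, any case.
LOOSE = re.compile(r"^(?i:[TPI]){1,3}$")

# Every candidate passes, including unwanted ones such as "TTT".
accepted = [s for s in candidates if LOOSE.match(s)]
print(len(candidates), len(accepted))
```

Both counts should come out as 39, which confirms that the loose pattern rejects nothing in the candidate list.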

We can restrict what we want to match by using a so-called negative lookahead assertion, which succeeds only if a given sub-expression does not match at the current position. It consumes no characters, so the text is still available to be matched by the pattern you have above. It is written (?!…), with the sub-expression placed after the !. Let’s try restricting to input that doesn’t start with two I, two P or two T:

$tests -match '^(?!II|PP|TT)(?i:[TPI]{1,3})$'
I IP IPI IPP IPT IT ITI ITP ITT P PI PII PIP PIT PT PTI PTP PTT T TI TII TIP TIT TP TPI TPP TPT

As you can see, those are gone from the results. We can simplify that with a capturing group and a backreference. Parentheses normally (unless they start with (?) capture what is matched inside them, and you can use that after matching to extract parts of the match or for replacements. But in many regex engines you can also use it in the pattern itself (in fact, I think there is no engine that allows negative lookahead but not backreferences in the pattern). So II|PP|TT can be written as (.)\1, which just says “a letter, followed by exactly the same letter”, since \1 is the backreference pointing to whatever was matched by (.).
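To check that the backreference form really excludes the same strings as the explicit II|PP|TT, here is a small Python sketch. It is an assumption-laden illustration: the (?i:…) wrapper is left out because the candidates are all upper case, and the names explicit, backref and survivors are invented for this example.

```python
import re
from itertools import product

# Same candidate set as before: every string of length 1 to 3 over T, P, I.
candidates = ["".join(p) for n in (1, 2, 3) for p in product("TPI", repeat=n)]

# Lookahead listing the doubled letters explicitly ...
explicit = re.compile(r"^(?!II|PP|TT)[TPI]{1,3}$")
# ... and the same idea expressed with a group and a backreference.
backref = re.compile(r"^(?!(.)\1)[TPI]{1,3}$")

# Compare the verdict of both patterns on every candidate.
same = all(bool(explicit.match(s)) == bool(backref.match(s)) for s in candidates)
survivors = [s for s in candidates if backref.match(s)]
print(same, len(survivors))
```

On this candidate set the two patterns agree everywhere, and 27 of the 39 strings survive. Strings such as “TIT” are still accepted at this stage, since only a doubled letter at the start is excluded.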

Now we still have a few values we don’t want, namely everything with the same letter in positions 2 and 3, and everything with the same letter in positions 1 and 3. We can get rid of the former with the following:

$tests -match '^(?!.?(.)\1)(?i:[TPI]{1,3})$'
I IP IPI IPT IT ITI ITP P PI PIP PIT PT PTI PTP T TI TIP TIT TP TPI TPT

The .? at the beginning says “match a character or not”, which extends what we had before to also exclude the matches with a repetition at the end. For the second set we just need to exclude matches that look like (.).\1, i.e. a letter, followed by another, and then a repetition of the first. We can extend the regex above by putting another .?, i.e. an optional character, between the capturing group and the backreference:

$tests -match '^(?!.?(.).?\1)(?i:[TPI]{1,3})$'
I IP IPT IT ITP P PI PIT PT PTI T TI TIP TP TPI

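This is now exactly the set you wanted to represent. The final regex is

^(?!.?(.).?\1)(?i:[TPI]{1,3})$

One caveat: PowerShell’s -match ignores case by default, so the backreference \1 ignores case here as well. In an engine that is case-sensitive by default, (?i:…) covers only the character class, and mixed-case repeats such as “Tt” would slip through. There, apply the flag to the whole pattern instead, for example with a leading (?i) where the engine supports it:

(?i)^(?!.?(.).?\1)[TPI]{1,3}$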
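As a sanity check, the lookahead pattern can be compared against the original 15-alternative pattern from the question. The sketch below uses Python rather than PowerShell and passes re.IGNORECASE for the whole pattern, on the assumption that this mirrors the case-insensitive default of -match. The names ORIGINAL, FINAL, inputs and disagreements are made up for the example.

```python
import re
from itertools import product

# The pattern from the question, split over two lines for readability.
ORIGINAL = re.compile(
    r"(?i:^TPI$|^TIP$|^IPT$|^ITP$|^PIT$|^PTI$|^IP$|^PI$|^TI$|^IT$"
    r"|^PT$|^TP$|^T$|^P$|^I$)"
)
# The lookahead version, case-insensitive as a whole (including \1).
FINAL = re.compile(r"^(?!.?(.).?\1)(?i:[TPI]{1,3})$", re.IGNORECASE)

# Every string of length 1 to 3 over both cases of the letters, plus one
# foreign letter (X) so that out-of-set input is covered as well.
alphabet = "TPItpiX"
inputs = ["".join(p) for n in (1, 2, 3) for p in product(alphabet, repeat=n)]

# Inputs on which the two patterns give different verdicts.
disagreements = [
    s for s in inputs if bool(ORIGINAL.match(s)) != bool(FINAL.match(s))
]
# Accepted inputs, folded to upper case and de-duplicated.
valid = sorted({s.upper() for s in inputs if FINAL.match(s)})
print(disagreements)
print(valid)
```

Under these assumptions the first line printed is an empty list, and the second is the 15 permitted combinations. This only covers inputs of up to three characters drawn from the small test alphabet; it is a spot check, not a proof of equivalence.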

It’s shorter than before, that’s for sure. Whether it’s simpler is up for debate, as it needs some explanation of what it does. That is probably even more the case for the more compressed approach in the other answer. It’s shorter, indeed, but since this is my answer and we’re competing for votes, I just have to say that I dislike it 😉 … just kidding. Still, for things like this I think separating the basic pattern from the exclusions does make sense for readability.

Another option is to validate only the basic pattern with a regex, i.e. your initial approach, and then use code to reject duplicates, which might look something like

($s.ToLowerInvariant().ToCharArray() | select -Unique).Count -eq $s.Length

depending on your language – provided it makes those things easy and readable.
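In Python, for instance, that two-step check might look like the sketch below. The function name is_valid and the constant BASIC are made up for illustration. The regex only checks the alphabet and the length, and a set comparison rejects repeated letters.

```python
import re

# Step 1: one to three letters from T, P, I in any case, nothing else.
BASIC = re.compile(r"[TPI]{1,3}", re.IGNORECASE)


def is_valid(s: str) -> bool:
    """Return True for one to three of T, P, I, each used at most once."""
    if not BASIC.fullmatch(s):
        return False
    # Step 2: a set drops duplicates, so equal sizes mean no letter repeats.
    # Upper-casing first makes "Tt" count as a repeat.
    return len(set(s.upper())) == len(s)


print(is_valid("TI"), is_valid("TTI"), is_valid("ipt"), is_valid("Tt"))
```

With this split, “TI” and “ipt” pass while “TTI” and “Tt” are rejected. The regex stays trivial to read, and the no-repeats rule lives in one clearly named line of code.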
