The text string I want to validate consists of what I call “segments”. A single segment might look like this:
[A-Z,S,3]
So far I have managed to build this regex pattern:
(?:\[(?<segment>[^,\]\[}' ]+?,[S|D],\d{1})\])+?
It works, but it returns matches even when the whole text string contains invalid text. I guess I need to use ^ and $ somewhere in my pattern, but I can’t figure out how.
I would like my pattern to produce the following results:
[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4] => OK (two segments)
[A-Z,S,3]aaaa[A-Za-z0-9åäöÅÄÖ,D,4] => No match
crap[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4] => No match
[A-Z,S,3][] => No match
[A-Z,S,3][klm,D,4][0-9,S,1] => OK (three segments)
Use ^ to anchor the start and $ to anchor the end. E.g.:
^(abc)*$ matches zero or more repetitions of the group (“abc” in this example), which must start at the start of the input string and end at the end of it.

^(?:\[(?<segment>[^,\]\[}' ]+?,[S|D],\d{1})\])+$
Using an ungreedy +? doesn’t matter, as you require it to match until the end anyway. However, your regex has a few issues.

^(?:\[[^,]+,[SD],\d\])+$
This seems more like what you want.

[^,]+, will match any sequence of non-commas followed by a comma, and in fact you should probably add ] to this negated character class.

[S|D] is a character class of three characters, as | doesn’t mean alternation here ((S|D) would mean the same as [SD] though).

{1} is the default for any atom; you don’t need to specify it.

Pseudocode (run it at codepad.org):
The big difference here is that the expression matches only the complete [...] part, but it is applied in succession, so each match must start where the previous one ended (or end at the end of the string).
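For illustration, here is a minimal Python sketch of both ideas from the answers: anchoring the repeated group to the whole string, and matching one segment at a time in succession. The question does not say which regex flavour is in use, so Python is only an assumption here (in Python a named group would be written `(?P<segment>...)`). The names `SEGMENT`, `is_valid` and `split_segments` are hypothetical, and the first field excludes `[` and `]` as well as `,`, following the suggestion above.

```python
import re

# One segment: "[", a non-empty field, ",", S or D, ",", one digit, "]".
# The field excludes ",", "[" and "]" so it cannot run into a neighbouring segment.
SEGMENT = r"\[[^,\[\]]+,[SD],\d\]"

# Approach 1: require one or more segments to cover the entire string.
# fullmatch() behaves like wrapping the pattern in ^...$ anchors.
WHOLE = re.compile(rf"(?:{SEGMENT})+")


def is_valid(text: str) -> bool:
    """Return True only if text consists of one or more segments and nothing else."""
    return WHOLE.fullmatch(text) is not None


# Approach 2: match a single segment repeatedly, each attempt starting
# exactly where the previous match ended.
ONE = re.compile(SEGMENT)


def split_segments(text: str):
    """Return the list of segments, or None if any part of text is not a segment."""
    segments = []
    pos = 0
    while pos < len(text):
        m = ONE.match(text, pos)  # match() only succeeds at position pos
        if m is None:
            return None
        segments.append(m.group())
        pos = m.end()
    return segments or None  # an empty string has no segments, so it is invalid


if __name__ == "__main__":
    samples = [
        "[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4]",
        "[A-Z,S,3]aaaa[A-Za-z0-9åäöÅÄÖ,D,4]",
        "crap[A-Z,S,3][A-Za-z0-9åäöÅÄÖ,D,4]",
        "[A-Z,S,3][]",
        "[A-Z,S,3][klm,D,4][0-9,S,1]",
    ]
    for s in samples:
        print(s, is_valid(s), split_segments(s))
```

The first approach is enough when you only need a yes/no answer. The second is handy when you also want the individual segments back, since a repeated group in a single match usually keeps only its last capture.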