Given a string of pipe-separated values (call it $psv), I want to be able to split by those pipes and populate an array. However, the string can also contain escaped pipes (\|) and escaped escapes (\\), both of which are to be considered mere literals. I have a couple solutions for this problem in mind:
- Replace both escape sequences with some random strings not otherwise found in $psv, split(/\|/, $psv), then replace the original characters back
- Loop through $psv, character by character
And I think both of those would work. But for a maximum dopamine flood, I’d like to just do this with a single split() call and nothing else. So is there a regular expression for this?
If Perl supported variable-width look-behind assertions, you might be able to do it with something like this:
That should match a pipe character which is not preceded by (an odd number of backslashes not preceded by a backslash). But only fixed-width look-behind assertions are allowed, so that’s not an option. It’s possible that some regex guru could come up with something that would actually work for you, but personally I’d say a finite state machine (looping through $psv a character at a time) might be a better option.
Something else I suppose you could try is to just split the string on the pipe character, and then check each element of the resulting list to see if it ends with an odd number of backslashes. If it does, join it back to the next element of the list with | between them. Basically you’d be doing the split ignoring the escape sequences, then going back and accounting for the escapes afterwards.
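For what it's worth, here is a rough sketch of both ideas from this answer: the character-at-a-time state machine and the split-then-rejoin pass. The sub names split_psv_fsm and split_psv_rejoin are made up for illustration. Both assume that the escape sequences should stay in the fields exactly as written (nothing is unescaped) and that empty fields, including trailing ones, should be kept; adjust if you want different behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: walk $psv one character at a time.
# Assumption: escape sequences are kept verbatim in the output fields.
sub split_psv_fsm {
    my ($psv) = @_;
    my @fields  = ('');
    my $escaped = 0;    # true when the previous character was an unpaired backslash

    for my $ch (split //, $psv) {
        if ($escaped) {
            # Whatever follows a backslash is a literal, even a pipe.
            $fields[-1] .= $ch;
            $escaped = 0;
        }
        elsif ($ch eq '\\') {
            $fields[-1] .= $ch;
            $escaped = 1;
        }
        elsif ($ch eq '|') {
            # Unescaped pipe: start a new field.
            push @fields, '';
        }
        else {
            $fields[-1] .= $ch;
        }
    }
    return @fields;
}

# Hypothetical helper: split on every pipe, then glue back the pieces
# that ended with an odd number of backslashes (i.e. an escaped pipe).
sub split_psv_rejoin {
    my ($psv) = @_;

    # split returns an empty list for an empty string; treat that as one
    # empty field so both helpers agree.
    return ('') if $psv eq '';

    my @fields;
    my $pending;    # piece still waiting for its continuation, if any

    # Limit -1 keeps trailing empty fields.
    for my $raw (split /\|/, $psv, -1) {
        my $piece = defined $pending ? "$pending|$raw" : $raw;
        undef $pending;

        my ($trailing) = $piece =~ /(\\*)\z/;
        if (length($trailing) % 2) {
            $pending = $piece;    # the pipe after this piece was escaped
        }
        else {
            push @fields, $piece;
        }
    }

    # A dangling backslash at the very end of the input has nothing to escape.
    push @fields, $pending if defined $pending;
    return @fields;
}

# Usage: in single quotes, '\\\\' is two literal backslashes.
my $example = 'one\|still one|two\\\\|three';
print "$_\n" for split_psv_fsm($example);
# prints:
#   one\|still one
#   two\\
#   three
```

The rejoin version only counts the backslashes at the very end of each piece, which is enough because a pipe always interrupts a run of backslashes. Neither version is the single split() call the question asks for; they are just the two fallbacks described above written out.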