Let’s assume I have a text file that acts as a simple database, using the | symbol to delimit columns:
|some text| 234| other field| bla| 1232|
I want to construct a regular expression that will
- check how many fields are present in each line (by counting the occurrences of the | symbol)
- check which fields are empty (no text between two | symbols)
- return each field's value
- strip white space from around each field value. But be careful, an empty field should not be stripped!
Here are two examples to illustrate what I want:
line = |some text| 234| other field| bla| 1232|
output = my_regexp(line)
disp(output)
'some text', '234', 'other field', 'bla', '1232'
Now the same, but this time field 3 is empty:
line2 = |some text| 234| | bla| 1232|
output = my_regexp(line2)
disp(output)
'some text', '234', '', 'bla', '1232'
I’ve tried the following:
values = regexp(regexprep(line, '[\s]', ''), '\|', 'split')
but unfortunately this solution does not
- check how many | are present
- preserve the field order of the returned values, because an empty field is ignored
- tell me which field is empty
I’ve never built a complex regexp rule and appreciate your input!
This can be done using line.split as follows:
To get a list of indices of the empty fields:
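The original snippets for these two steps did not survive, so here is a minimal sketch of what they might look like, assuming Python and the second sample line from the question. The names parse_line, field_count and empty_fields are illustrative, not taken from the original answer.

```python
def parse_line(line):
    """Split a |-delimited record into whitespace-stripped field values."""
    # The leading and trailing | each produce an empty string at the ends
    # of the split result, so the first and last pieces are dropped.
    parts = line.strip().split("|")[1:-1]
    # Stripping an empty (whitespace-only) field simply yields ''.
    return [part.strip() for part in parts]


line2 = "|some text| 234| | bla| 1232|"
values = parse_line(line2)

# Number of fields in the line.
field_count = len(values)

# Zero-based indices of the empty fields (field 3 has index 2).
empty_fields = [i for i, value in enumerate(values) if not value]

print(values)
print(field_count)
print(empty_fields)
```

Because empty fields are kept as empty strings, the order of the returned values matches the order of the columns in the line.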
To use regular expressions, replace the calculation of values with:
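The regular expression used in the original answer is not shown, so the pattern below is only one possible choice, assuming Python's re module. It matches a |, skips surrounding white space, captures the field text, and uses a lookahead so that the closing | is still available as the opening | of the next field.

```python
import re

line2 = "|some text| 234| | bla| 1232|"

# \|        opening delimiter
# \s*       leading white space (not captured)
# ([^|]*?)  field text, as short as possible
# \s*       trailing white space (not captured)
# (?=\|)    closing delimiter, checked but not consumed
field_pattern = re.compile(r"\|\s*([^|]*?)\s*(?=\|)")

values = field_pattern.findall(line2)
print(values)
```

The final | in the line has no | after it, so it does not start a spurious extra field.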