let’s assume I have a text file which acts as a simple database by

Question

Asked: June 10, 2026 (2026-06-10T05:36:17+00:00)

Let’s assume I have a text file that acts as a simple database, using the | symbol to delimit the columns:

|some text| 234| other field| bla| 1232|

I want to construct a regular expression that will

check how many fields are present in each line (by counting the occurrences of the | symbol)
check which fields are empty (no text between two | symbols)
return each field’s value
strip white space from around each field value. But be careful: an empty field must be kept, not stripped out!

Here are two examples to illustrate what I want:

line = |some text| 234| other field| bla| 1232|
output = my_regexp(line)
disp(output)
  'some text', '234', 'other field', 'bla', '1232'

Now the same, but this time field 3 is empty:

line2 = |some text| 234|  | bla| 1232|
output = my_regexp(line2)
disp(output)
  'some text', '234', '', 'bla', '1232'

I’ve tried the following:

values = regexp(regexprep(line, '[\s]', ''), '\|', 'split')

but unfortunately this solution does not

check how many | are present
preserve the field order of the returned values, because an empty field is ignored
tell me which field is empty

I’ve never built a complex regexp rule and appreciate your input!


Editorial Team · Answer 1 · 2026-06-10T05:36:19+00:00


This can be done using line.split as follows:

values = [v.strip() for v in line.split("|")[1:-1]]
num_fields = len(values)
num_empty_fields = values.count("")

To get a list of indices of the empty fields:

indices_empty_fields = [i for i, f in enumerate(values) if f == ""]
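The pieces above could be wrapped in one small function. This is only a sketch: the name parse_line is hypothetical, and it assumes every record starts and ends with a | symbol, as in the examples from the question. The returned indices are zero-based, so field 3 appears as index 2.

```python
def parse_line(line):
    """Split a pipe-delimited record into stripped field values.

    Hypothetical helper; assumes the record starts and ends with '|'.
    Returns the list of values and the zero-based indices of empty fields.
    """
    # split("|") yields an empty piece before the first and after the
    # last '|', so [1:-1] keeps only the real fields.
    values = [v.strip() for v in line.strip().split("|")[1:-1]]
    empty = [i for i, v in enumerate(values) if v == ""]
    return values, empty


values, empty = parse_line("|some text| 234|  | bla| 1232|")
print(values)       # ['some text', '234', '', 'bla', '1232']
print(len(values))  # 5
print(empty)        # [2]
```

Because the empty field is kept as '' rather than dropped, the position of every other value stays the same as in the line.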

To use regular expressions, replace the calculation of values with:

import re
# A raw string passes \s and \| to the regex engine unchanged.
values = re.split(r"\s*\|\s*", line)[1:-1]
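To also check the number of | symbols on each line, the regular-expression variant might be applied line by line as sketched below. The names parse_records and expected_fields are assumptions made for illustration, and the sketch again assumes each record starts and ends with a | symbol, so a record with n fields contains n + 1 pipes.

```python
import re

# Raw string so that \s and \| reach the regex engine unchanged.
FIELD_SEP = re.compile(r"\s*\|\s*")


def parse_records(lines, expected_fields):
    """Yield (line_number, values) for each record; hypothetical helper.

    Raises ValueError when a line does not contain expected_fields + 1 pipes.
    """
    for number, line in enumerate(lines, start=1):
        pipes = line.count("|")
        if pipes != expected_fields + 1:
            raise ValueError(
                f"line {number}: expected {expected_fields + 1} '|', found {pipes}"
            )
        # The separator pattern swallows the white space around each '|',
        # so the fields come back already stripped.
        yield number, FIELD_SEP.split(line.strip())[1:-1]


records = [
    "|some text| 234| other field| bla| 1232|",
    "|some text| 234|  | bla| 1232|",
]
for number, values in parse_records(records, expected_fields=5):
    print(number, values)
```

For a real file, the list could be replaced by an open file object, since iterating over it also yields one line at a time.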
