I’m reading from a government text file in which $ is used as the delimiter, but I don’t think the choice of delimiter character matters…
So this is expected:
'a$b$c$d'.split('$')
# => ["a", "b", "c", "d"]
In the data files I’m working with, the header row (the first line) is always fully filled in, i.e. there is never an empty header such as:
'a$b$$d'
# or:
'a$b$c$'
However, each row may have consecutive trailing delimiters such as:
"w$x$$\r\n"
Usually, I read each line and chomp it. But then String#split drops the trailing empty columns entirely:
"w$x$$\r\n".chomp.split('$')
# => ["w", "x"]
Not doing the chomp gets me the desired result, though I should chomp the last element:
"w$x$$\r\n".split('$')
# => ["w", "x", "", "\r\n"]
So I have to either:
- chomp the line if the final non-newline characters are NOT consecutive delimiters
- preserve the newline, do the split, and then chomp the final element IF the final characters are consecutive delimiters
This seems really awkward…am I missing something here?
You need to pass a negative value as the second parameter (the limit) to String#split. This prevents it from suppressing trailing null fields. See the docs on split.
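As a minimal sketch of how this could look for the rows in the question: the helper name parse_row and its default delimiter are made up for illustration and are not part of Ruby or the original post. It chomps the line first and then splits with a limit of -1, so no special-casing of trailing delimiters is needed.

```ruby
# Hypothetical helper (name and default delimiter are assumptions):
# split one raw line into fields, keeping trailing empty columns.
def parse_row(line, delimiter = '$')
  # chomp strips a trailing "\r\n", "\n", or "\r"; the negative limit
  # stops split from discarding trailing empty strings.
  line.chomp.split(delimiter, -1)
end

headers = parse_row("a$b$c$d\r\n")
row     = parse_row("w$x$$\r\n")

p row  # => ["w", "x", "", ""]

# Every row now has as many fields as the header, so the two can be paired up.
record = headers.zip(row).to_h
```

With the default limit, the same chomped line would give only ["w", "x"], which is why the column count no longer matched the header row.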