I have a string like this:
group1
Members:
m/a
m/b
group2
Members:
m/c
m/d
m/e
group3
No Members
I want a scan result like:
[["group1","a","b"],["group2","c","d","e"],["group3"]]
But all I can get is:
[["group1","a"],["group2","c"],["group3", nil]]
with this regexp:
text.scan(/([^\r\n]+)\r?\n[\s\t]*(?:No |)Members[\s:]*\r?\n(?:[\t\s]*m\/(\w+)+\r?\n)*/m)
Can I do what I want with only a regexp?
While it’s possible to do it in a regex, it gets unwieldy, so I’d do it this way:
Cleaning up the rest to meet the question’s requirements makes it:
The gist of the parsing is really in `slice_before`. Everything else is creating the array and cleanup.
Breaking it down:
- `gsub(%r{\bm/}, '')` strips the undesired `m/`.
- `split(/\n\s*/)` splits the string on line-ends into an array, while simultaneously removing the leading white-space.
- `reject{ |s| s[/\bMembers\b/] }` rejects any lines containing ‘Members’ as a separate word.
- `slice_before(/^group/)` breaks up the array into chunks starting with ‘group’ at the beginning of the string.
- `to_a` converts it all into an array again.
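Chained together, the steps in the breakdown might look like the sketch below. It is reconstructed from the breakdown and is not necessarily the exact code originally posted. The indented sample `text` and the `result` variable name are assumptions made for illustration, and the squiggly heredoc needs Ruby 2.3 or later.

```ruby
# Sample input, assumed to be indented the way the question's regexp implies.
text = <<~EOT
  group1
    Members:
      m/a
      m/b
  group2
    Members:
      m/c
      m/d
      m/e
  group3
    No Members
EOT

result = text
  .gsub(%r{\bm/}, '')               # strip the "m/" prefix from member lines
  .split(/\n\s*/)                   # split on line ends, dropping leading whitespace
  .reject { |s| s[/\bMembers\b/] }  # drop the "Members:" and "No Members" lines
  .slice_before(/^group/)           # start a new chunk at each "group..." line
  .to_a                             # turn the enumerator back into an array

p result
# => [["group1", "a", "b"], ["group2", "c", "d", "e"], ["group3"]]
```

Note that this relies on group names starting with "group" and on no member name starting with it; a different naming scheme would need a different `slice_before` pattern.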