I’m using Ruby’s regular expressions to process text such as:
${1:aaa|bbbb}
${233:aaa | bbbb | ccc ccccc }
${34: aaa | bbbb | cccccccc |d}
${343: aaa | bbbb | cccccccc |dddddd ddddddddd}
${3443:a aa|bbbb|cccccccc|d}
${353:aa a| b b b b | c c c c c c c c | dddddd}
I want to get the trimmed text between the pipe characters. For example, for the first line above, I want aaa and bbbb; for the second line, I want aaa, bbbb and ccc ccccc. I have written a regular expression and a bit of Ruby code to test it:
array = "${33:aaa|bbbb|cccccccc}".scan(/\$\{\s*(\d+)\s*:(\s*[^\|]+\s*)(?:\|(\s*[^\|]+\s*))+\}/)
puts array
My problem is that the (?:\|(\s*[^\|]+\s*))+ part doesn’t create multiple groups. I don’t know how to solve this, because the number of fields I need from each line varies. Can anyone help?
When you repeat a capturing group in a regular expression, the capturing group only stores the text matched by its last iteration. If you need to capture multiple iterations, you’ll need to use more than one regex. (.NET is the only exception to this. Its CaptureCollection provides the matches of all iterations of a capturing group.)

In your case, you could do a search-and-replace to replace ^\d+: with nothing. That strips off the number and colon at the start of your string. Then call split() using the regex \s*\|\s* to split the string into the elements delimited by vertical bars.
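Here is a minimal Ruby sketch of that strip-then-split approach. The helper name placeholder_fields is made up for illustration. Because the sample strings in the question still carry the ${ and } wrapper, the sketch assumes you want to strip those as well, so its first pattern is a widened variant of ^\d+: rather than that exact regex.

```ruby
# Hypothetical helper (the name is an assumption, not from the question):
# returns the trimmed fields between the vertical bars.
def placeholder_fields(text)
  # Step 1: strip the leading "${", the number and the colon, then the trailing "}".
  # Assumption: the input still has the ${...} wrapper shown in the question.
  body = text.sub(/\A\$\{\s*\d+\s*:/, "").sub(/\}\s*\z/, "")

  # Step 2: trim the outer whitespace, then split on bars plus any
  # whitespace around them, so each field comes back already trimmed.
  body.strip.split(/\s*\|\s*/)
end

p placeholder_fields("${1:aaa|bbbb}")
# => ["aaa", "bbbb"]
p placeholder_fields("${233:aaa | bbbb | ccc ccccc }")
# => ["aaa", "bbbb", "ccc ccccc"]
p placeholder_fields("${353:aa a| b b b b | c c c c c c c c | dddddd}")
# => ["aa a", "b b b b", "c c c c c c c c", "dddddd"]
```

If you also need the number, capture it with a separate match (for example text[/\d+/]) before stripping it. Two small regexes plus split are usually easier to maintain than one regex that tries to produce a variable number of groups.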