Given a string, what is the most efficient way to return an array of the character positions at which each line in the string begins?
text =<<_
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
_
Expected:
find_newlines(text) # => [0, 80, 155, 233, 313, 393]
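For reference, here is a minimal baseline sketch of what such a method could look like. The name find_newlines_sketch is hypothetical and it is not one of the benchmarked answers; it assumes lines are separated by "\n" and that positions are counted in characters, not bytes.

```ruby
# Hypothetical baseline, not one of the benchmarked answers.
# Returns the character offset at which each line of +text+ starts.
def find_newlines_sketch(text)
  positions = []
  offset = 0
  # each_line yields every line including its trailing "\n",
  # so summing the line lengths gives the start of the next line.
  text.each_line do |line|
    positions << offset
    offset += line.length
  end
  positions
end

sample = "ab\ncd\n\nef\n"
p find_newlines_sketch(sample) # => [0, 3, 6, 7]
```

It makes a single pass over the string, so it may serve as a correctness check for faster candidates rather than as a contender itself.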
I will post my own answers as well. I would like to accept the fastest approach as the accepted answer.
The benchmark results below will be updated whenever a new answer is added:
require "fruity"
compare do
padde1 {find_newlines_padde1(text)}
digitalross1 {find_newlines_digitalross1(text)}
sawa1 {find_newlines1(text)}
sawa2 {find_newlines2(text)}
end
# Running each test 512 times. Test will take about 1 second.
# digitalross1 is faster than sawa2 by 5x ± 0.1
# sawa2 is faster than sawa1 by 21.999999999999996% ± 1.0%
# sawa1 is faster than padde1 by 4.0000000000000036% ± 1.0%
As noted, use text.each_line.to_a for 1.9. Calling each_line also works in 1.8.7, but it’s 20% slower than calling only to_a.