I want to slice a Unicode string in Ruby.
The slicing should keep the invisible characters intact.
Here’s an example of the input:
Foo\r\n
\r\n
\r\n
Bär 1.234 Foo test\r\n
blub
Which should become:
Array=["Foo\r\n\r\n\r\n","Bär","1.234","Foo","test\r\n","blub"]
Basically I want to tokenize the string and keep the formatting intact.
When I do something like:
String.split(/ /)
I end up with something like:
Array=["Foo\r\n\r\n\r\nBär","1.234","Foo"]
And, something like:
String.split(/\W/)
kills the formatting.
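To make the two attempts concrete, here is a small sketch that runs both calls on the sample input. The variable names (text, by_space, by_non_word) are only illustrative.

```ruby
# The sample input from the question, written as a single Ruby string.
text = "Foo\r\n\r\n\r\nBär 1.234 Foo test\r\nblub"

# Splitting on single spaces only cuts at " ", so line breaks stay glued
# to whatever word follows them.
by_space = text.split(/ /)
# by_space == ["Foo\r\n\r\n\r\nBär", "1.234", "Foo", "test\r\nblub"]

# \W also matches "\r", "\n" and ".", so those characters are consumed
# as separators and disappear from the result.
by_non_word = text.split(/\W/)
lost_breaks = by_non_word.none? { |t| t.include?("\r") || t.include?("\n") }
# lost_breaks == true
```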
Instead of using split, use scan for the first part:
Then conditionally apply your split like this:
or:
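A minimal sketch of the scan idea, assuming a token is a run of non-whitespace characters plus any line breaks that directly follow it. The pattern, the TOKEN constant and the tokenize helper are illustrative assumptions, not necessarily what the answer originally showed.

```ruby
# The sample input from the question, written as a single Ruby string.
text = "Foo\r\n\r\n\r\nBär 1.234 Foo test\r\nblub"

# Hypothetical pattern: one or more non-whitespace characters, followed by
# zero or more "\r" / "\n" characters. Spaces between tokens match nothing
# and are skipped by scan.
TOKEN = /\S+[\r\n]*/

# Hypothetical helper: returns every match of TOKEN, in order.
def tokenize(str)
  str.scan(TOKEN)
end

tokens = tokenize(text)
# tokens == ["Foo\r\n\r\n\r\n", "Bär", "1.234", "Foo", "test\r\n", "blub"]
```

With this pattern the separating spaces are dropped, as in the desired output, while trailing line breaks stay attached to the preceding token. Line breaks at the very start of the string would not be captured, so input that begins with blank lines would need a different pattern.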