I have some text like:
DESC:manner How did serfdom develop in and then leave Russia ?
ENTY:cremat What films featured the character Popeye Doyle ?
DESC:manner How can I find a list of celebrities ' real names ?
I read them line by line, and I want to convert each line to a string array, word by word, like this:
Array = [DESC, :, manner, How, did, serfdom, develop, in, and, then, leave, Russia, ?]
The problem is that you want to keep some delimiters and not others (keep the ":" and lose the spaces). I think you need a regular expression to accomplish this. Splitting on a pattern that combines lookahead and lookbehind with a normal match on the space should do it: the lookahead and lookbehind find the ":" without consuming it, so it is kept as its own element, while the normal split on the space ( ) trashes the spaces. After this, arr should be the array shown in the question.
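Here is a minimal sketch of that split. It assumes Java, since the question does not name a language, and the class name SplitKeepColon and method name tokenize are made up for illustration. The pattern puts the whitespace alternative first and then filters out empty strings, because a ":" that sits next to a space would otherwise leave an empty element in the result.

```java
import java.util.Arrays;

// Hypothetical helper, not from the original post.
final class SplitKeepColon {

    // \s+     : plain split on whitespace, which is discarded
    // (?<=:)  : zero-width split just after a ':' (lookbehind)
    // (?=:)   : zero-width split just before a ':' (lookahead)
    private static final String DELIMITERS = "\\s+|(?<=:)|(?=:)";

    static String[] tokenize(String line) {
        return Arrays.stream(line.split(DELIMITERS))
                // Drop empty strings left by leading whitespace or by a ':'
                // that is directly next to a space.
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String line = "DESC:manner How did serfdom develop in and then leave Russia ?";
        String[] arr = tokenize(line);
        System.out.println(Arrays.toString(arr));
    }
}
```

Because the lookahead and lookbehind match positions rather than characters, the ":" is never consumed by the split and ends up as its own element between "DESC" and "manner".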