When I use scan(/\p{graph}+/) it works:
"ich gehe nach Köln am 12.09.2012".scan(/\p{graph}+/)
=> ["ich", "gehe", "nach", "Köln", "am", "12.09.2012"]
But if there is a typo like "Köln.am", the two words are not separated:
"ich gehe nach Köln.am 12.09.2012".scan(/\p{graph}+/)
=> ["ich", "gehe", "nach", "Köln.am", "12.09.2012"]
When I use scan(/\p{alnum}+/), the words are separated, but the date is broken into pieces:
"ich gehe nach Köln.am 12.09.2012".scan(/\p{alnum}+/)
=> ["ich", "gehe", "nach", "Köln", "am", "12", "09", "2012"]
Does anyone know another solution?
For this simple case you can use alternation and match either a run of letters or a run of digits and dots.
outputs:
Or, if you don't want to match the single dot:
output:
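As a rough sketch of the two variants described above: both patterns below are my own assumptions about what such an alternation could look like, not necessarily the exact regexes from the answer. The first treats any run of digits and dots as one token, so a stray dot between words also comes back as a token; the second only accepts a dot when it sits between digits.

```ruby
text = "ich gehe nach Köln.am 12.09.2012"

# Variant 1 (assumed pattern): a run of letters, or a run of digits and dots.
# The lone "." in "Köln.am" also matches [\d.]+ and is returned as its own token.
with_dot = /\p{alpha}+|[\d.]+/
tokens_with_dot = text.scan(with_dot)
# => ["ich", "gehe", "nach", "Köln", ".", "am", "12.09.2012"]

# Variant 2 (assumed pattern): a run of letters, or digit groups joined by dots.
# A dot is only consumed when digits follow it, so the stray "." is skipped.
without_dot = /\p{alpha}+|\d+(?:\.\d+)*/
tokens_without_dot = text.scan(without_dot)
# => ["ich", "gehe", "nach", "Köln", "am", "12.09.2012"]

p tokens_with_dot
p tokens_without_dot
```

The group in the second pattern is non-capturing on purpose: with a capturing group, scan would return the group contents instead of the whole match.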