I have a few text (SMS) messages and I want to segment them using the period ('.') as a delimiter. I am unable to handle the following types of messages. How can I segment these messages using a regex in Python?
Before segmentation:
'hyper count 16.8mmol/l.plz review b4 5pm.just to inform u.thank u'
'no of beds 8.please inform person in-charge.tq'
After segmentation:
'hyper count 16.8mmol/l' 'plz review b4 5pm' 'just to inform u' 'thank u' 'no of beds 8' 'please inform person in-charge' 'tq'
Each line is a separate message
Updated:
I am doing natural language processing, and I feel it's okay to treat '16.8mmol/l' and 'no of beds 8.2 cups of tea.' the same way, i.e. not to split on a period that sits between two digits. 80% accuracy is enough for me, but I want to reduce false positives as much as possible.
What about a regex with lookarounds around the period?
The lookarounds ensure that at least one side of the period is not a digit, so a period is a delimiter only in that case. This also covers the `16.8` case: the expression will not split when there are digits on both sides.
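A pattern matching that description could be sketched as follows. This is only one possible reading of it, using Python's `re` module; the names `PERIOD` and `segment` are made up for illustration, and dropping empty segments (for example after a trailing period) is an added assumption.

```python
import re

# Split on a period when at least one neighbour is not a digit:
#   (?<!\d)\.  -> period NOT preceded by a digit
#   \.(?!\d)   -> period NOT followed by a digit
# A period with digits on both sides (e.g. "16.8") matches neither branch.
PERIOD = re.compile(r"(?<!\d)\.|\.(?!\d)")

def segment(message):
    """Return the non-empty, whitespace-stripped segments of one message."""
    parts = PERIOD.split(message)
    return [p.strip() for p in parts if p.strip()]

messages = [
    "hyper count 16.8mmol/l.plz review b4 5pm.just to inform u.thank u",
    "no of beds 8.please inform person in-charge.tq",
]

for msg in messages:
    print(segment(msg))
```

Note that with this pattern 'no of beds 8.2 cups of tea.' stays in one piece, which is the trade-off accepted in the question's update.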