I am looking for an algorithm that, given a text, will cut it into sentences smartly; anything could help. For now I have an algorithm that cuts after a number of words per sentence that I specify. I could change it to cut at the first ‘.’ and so on, but what I need is an algorithm that does it somewhat logically (it shouldn’t leave sentences that end on ‘is’ or ‘and’, and it should maybe look for punctuation marks other than ‘.’).
Any ideas?
I am using PHP 5.
Use this code with preg_split:
It splits your text into an array of sentences; you have to choose which punctuation characters you want to split on (in the example above I used “,.:;”).
It uses regular expressions, which are very useful for this kind of thing 😉
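For illustration, here is a minimal sketch of how a preg_split-based splitter along these lines might look. It is not the snippet referred to above: the function names split_on_punctuation and merge_dangling, the default character set “,.:;” and the stop-word list ('is', 'and') are all assumptions made for this example. The second function is one possible way to avoid pieces that end on ‘is’ or ‘and’, as asked in the question.

```php
<?php
// Hypothetical helper: cut $text after any character listed in $marks.
function split_on_punctuation($text, $marks = ',.:;')
{
    // preg_quote() escapes the marks so they are safe inside the character class.
    // The lookbehind keeps each punctuation mark attached to the piece before it,
    // and the whitespace that follows the mark is consumed by the split.
    $pattern = '/(?<=[' . preg_quote($marks, '/') . '])\s+/';
    return preg_split($pattern, trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: glue a piece onto the next one when its last word
// is in $stopWords, so no piece ends on 'is' or 'and'.
function merge_dangling($pieces, $stopWords = array('is', 'and'))
{
    $result = array();
    $buffer = '';
    foreach ($pieces as $piece) {
        $buffer = ($buffer === '') ? $piece : $buffer . ' ' . $piece;
        // Find the last word of the buffer, ignoring trailing punctuation.
        $words = preg_split('/\s+/', rtrim($buffer, ',.:; '));
        $last = strtolower(end($words));
        if (!in_array($last, $stopWords, true)) {
            $result[] = $buffer;
            $buffer = '';
        }
    }
    // Keep whatever is left, even if it still ends on a stop word.
    if ($buffer !== '') {
        $result[] = $buffer;
    }
    return $result;
}

// Usage: split first, then merge the dangling pieces.
$pieces = merge_dangling(split_on_punctuation('The answer is: 42. Cats, and dogs.'));
print_r($pieces);
```

Splitting on “,” and “;” gives fairly short pieces; passing only '.!?' as the second argument of split_on_punctuation would be closer to real sentence boundaries, although abbreviations such as “e.g.” would still be cut in the wrong place.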