Possible Duplicate:
PHP – How to split a paragraph into sentences.
I have a block of text that I would like to separate into sentences. What would be the best way of doing this? I thought of looking for ‘.’, ‘!’, and ‘?’ characters, but I realized there are problems with this, such as when people use acronyms or end a sentence with something like “!?”. What would be the best way to handle these cases? I figured there would be some regex that could do it, but I’m open to a non-regex solution if that fits the problem better.
Regex isn’t the best solution for this problem. You’d be better served by writing a small parser, something where you can easily create logic blocks to distinguish one thing from another. You’ll need to come up with a set of rules for breaking the text into the chunks you’d like to see.
Acronyms and endings like “!?” are exactly the kind of thing that messes things up when using regex, aren’t they? With a parser, however, you can check each candidate boundary against simple rules and say, “that’s one sentence.”
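To make the idea concrete, here is a minimal sketch in Python of what such a rule set might look like, next to the naive regex split for comparison. The function names, the abbreviation list, and the three rules are all assumptions made for illustration, not a complete or authoritative implementation; real text will need more rules (initials, decimals, ellipses, and so on).

```python
import re

# Hypothetical abbreviation list for illustration; a real one would be much longer.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "e.g.", "i.e.", "etc.", "vs."}

# Characters that may trail the sentence-ending punctuation (closing quotes, brackets).
CLOSERS = "\"')]\u201d\u2019"
# Characters that may precede the first letter of the next sentence.
OPENERS = "\"'([\u201c\u2018"


def naive_split(text):
    """Regex approach: split on whitespace that follows '.', '!' or '?'."""
    return re.split(r"(?<=[.!?])\s+", text.strip())


def split_sentences(text):
    """Rule-based approach: walk the words and decide, one word at a time,
    whether that word closes a sentence."""
    sentences = []
    current = []
    words = text.split()
    for i, word in enumerate(words):
        current.append(word)

        # Rule 1: a sentence can only end on a word ending in '.', '!' or '?'
        # (ignoring any closing quote or bracket after the punctuation).
        core = word.rstrip(CLOSERS)
        if not core or core[-1] not in ".!?":
            continue

        # Rule 2: a known abbreviation does not end a sentence.
        if core.lower() in ABBREVIATIONS:
            continue

        # Rule 3: if the next word starts with a lowercase letter,
        # assume the sentence is still going.
        nxt = words[i + 1].lstrip(OPENERS) if i + 1 < len(words) else ""
        if nxt and nxt[0].islower():
            continue

        sentences.append(" ".join(current))
        current = []

    # Keep any trailing text that never hit a sentence boundary.
    if current:
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    sample = "Mr. Smith bought a car. Did he really!? Yes, e.g. a used one."
    print(naive_split(sample))
    # ['Mr.', 'Smith bought a car.', 'Did he really!?', 'Yes, e.g.', 'a used one.']
    print(split_sentences(sample))
    # ['Mr. Smith bought a car.', 'Did he really!?', 'Yes, e.g. a used one.']
```

The regex version cuts after “Mr.” and “e.g.” because it only sees punctuation followed by whitespace. The rule-based version treats “!?” as a single ending and skips the abbreviations, and each new edge case becomes one more rule rather than a more tangled pattern.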