Basically, I’m taking in a paragraph filled with all kinds of punctuation
such as ! ? . ; ” and splitting it into sentences.
The issue I’m facing is finding a way to split it into sentences with the punctuation intact while also accounting for quotations in dialogue.
For instance, this paragraph:
One morning, when Gregor Samsa woke from troubled dreams, he found
himself transformed in his bed into a horrible vermin. “What has
happened!?” he asked himself. “I… don’t know.” said Samsa, “Maybe
this is a bad dream.” He lay on his armour-like back, and if he lifted
his head a little he could see his brown belly, slightly domed and
divided by arches into stiff sections.
would need to be split up like this:
[0] One morning, when Gregor Samsa woke from troubled dreams, he found himself transformed in his bed into a horrible vermin.
[1] "What has happened!?" he asked himself.
[2] "I... don't know." said Samsa, "Maybe this is a bad dream."
And so on.
Currently I am just using explode:
$sentences = explode(".", $sourceWork);
This only splits on periods, and I append one to the end of each piece afterwards. I know this is far from what I want, but I’m not quite sure where to even start handling this. If someone could at least point me in the right direction for ideas, that would be amazing.
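One possible direction, offered as a rough sketch rather than a complete answer: instead of explode, preg_split can split on the whitespace between sentences, using a lookbehind for the closing punctuation and a lookahead for the start of the next sentence. The helper name splitSentences() is hypothetical, and the rule itself is an assumption: a sentence ends at . ! or ? (optionally followed by a closing double quote) only when the next word starts with an uppercase letter or an opening double quote. The input is assumed to be valid UTF-8.

```php
<?php
// Sketch only: splitSentences() is a hypothetical helper, not a built-in PHP function.
function splitSentences(string $text): array
{
    // Collapse line breaks and repeated spaces so each sentence ends up on one line.
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    // Split on whitespace that
    //   - follows . ! or ?, optionally with a closing quote (" or U+201D) after it, and
    //   - precedes an uppercase letter, optionally with an opening quote (" or U+201C) before it.
    // Only the whitespace is consumed, so the punctuation stays attached to its sentence.
    $pattern = '/(?<=[.!?]|[.!?]["\x{201D}])\s+(?=["\x{201C}]?\p{Lu})/u';

    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split returns false on failure (for example, invalid UTF-8); fall back to the whole text.
    return $parts === false ? [$text] : $parts;
}

$sourceWork = 'He woke up. “What has happened!?” he asked himself. “I… don’t know.” said Samsa, “Maybe this is a bad dream.” He lay on his back.';

foreach (splitSentences($sourceWork) as $i => $sentence) {
    echo "[$i] $sentence\n";
}
```

Because “he asked himself” and “said Samsa” start with lowercase letters, the quoted dialogue stays attached to its attribution. The heuristic has known gaps: abbreviations such as “Mr. Smith” would be split, and so would a quotation followed by a capitalized name (for example, ?” Gregor asked), so it is a starting point and not a full sentence tokenizer.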
Thanks in advance!
Here’s what I have: