How do I parse sentence-case phrases from a passage?
For example, from this passage:
Conan Doyle said that the character of Holmes was inspired by Dr. Joseph Bell, for whom Doyle had worked as a clerk at the Edinburgh Royal Infirmary. Like Holmes, Bell was noted for drawing large conclusions from the smallest observations.[1] Michael Harrison argued in a 1971 article in Ellery Queen’s Mystery Magazine that the character was inspired by Wendell Scherer, a “consulting detective” in a murder case that allegedly received a great deal of newspaper attention in England in 1882.
We need to extract phrases such as Conan Doyle, Holmes, Dr. Joseph Bell, Wendell Scherer, etc.
I would prefer a Pythonic solution if possible.
This kind of processing can be very tricky. This simple code does almost the right thing:
produces:
To include “Dr. Joseph Bell”, the pattern has to accept a period inside a phrase, but that also lets in “Edinburgh Royal Infirmary. Like Holmes”, which runs across a sentence boundary.
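To make that trade-off concrete, here is a minimal sketch using Python's standard `re` module. It is not necessarily the code from the answer above; the pattern names `STRICT` and `LOOSE` and the shortened passage are my own assumptions, chosen only to show how allowing a period changes the matches.

```python
import re

# Shortened version of the passage from the question.
passage = (
    "Conan Doyle said that the character of Holmes was inspired by "
    "Dr. Joseph Bell, for whom Doyle had worked as a clerk at the "
    "Edinburgh Royal Infirmary. Like Holmes, Bell was noted for drawing "
    "large conclusions from the smallest observations."
)

# Hypothetical pattern: runs of capitalized words separated by one space.
STRICT = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")

# Same pattern, but an optional period may precede the separating space.
LOOSE = re.compile(r"[A-Z][a-z]+(?:\.? [A-Z][a-z]+)*")

strict_phrases = STRICT.findall(passage)
loose_phrases = LOOSE.findall(passage)

print(strict_phrases)
print(loose_phrases)
```

With the strict pattern, “Dr. Joseph Bell” is split into “Dr” and “Joseph Bell”. With the loose pattern it stays whole, but “Edinburgh Royal Infirmary. Like Holmes” is returned as a single phrase. Both patterns also pick up ordinary sentence-initial words such as “Like”, so some post-filtering would still be needed.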
I had a similar problem: Separating Sentences.