I have to extract a summary from a newspaper article. The summary is extracted based on a given keyword, according to the rules below.
- The summary should be 200 characters long.
- Start printing from the beginning of the first sentence in which the keyword appears, and print up to 200 characters.
- If the matching sentence occurs so close to the end of the article that the summary would be shorter than 200 characters, move back from the matching sentence into the previous sentences until 200 characters containing the matching sentence are printed.
What I have done until now is:
    var regex = new Regex(keyword+@"(.{0,200})");
    foreach (Match match in regex.Matches(input))
    {
        var result = match.Groups[1].Value;
        Console.WriteLine(result);
        // work with the result
    }
The above code successfully finds the first matching sentence, but it starts printing AFTER the keyword (up to 200 characters) rather than from the beginning of the matching sentence.
Also, there is no backtracking if the end of the article is reached before 200 characters are printed.
Please guide me on how I should proceed. Even if you don't know the complete solution, please help me out with any sub-part of the question.
And if you want the search to be case insensitive, just add
StringComparison.OrdinalIgnoreCase as another parameter in both IndexOfs.
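For illustration, here is a minimal sketch of an index-based approach to the three rules in the question. It is not the code the answer above refers to: the class name SummaryExtractor, the method Extract, and its parameters are hypothetical. It assumes that sentences end with a '.' character and that "moving back" may begin in the middle of an earlier sentence, since the rules only ask for 200 characters.

```csharp
using System;

static class SummaryExtractor
{
    // Hypothetical helper: returns up to `length` characters, starting at the
    // sentence that contains the first occurrence of `keyword`.
    public static string Extract(
        string article,
        string keyword,
        int length = 200,
        StringComparison comparison = StringComparison.Ordinal)
    {
        // Find the first occurrence of the keyword. Pass
        // StringComparison.OrdinalIgnoreCase for a case-insensitive search.
        int hit = article.IndexOf(keyword, comparison);
        if (hit < 0)
        {
            return string.Empty; // keyword not present in the article
        }

        // Assumption: a sentence ends with '.', so the matching sentence
        // begins right after the closest '.' before the keyword.
        int start = hit == 0 ? 0 : article.LastIndexOf('.', hit - 1) + 1;

        // Skip the whitespace that follows the previous sentence's period.
        while (start < hit && char.IsWhiteSpace(article[start]))
        {
            start++;
        }

        // Backtracking rule: if fewer than `length` characters remain,
        // move the start back so the summary ends at the end of the article.
        if (article.Length - start < length)
        {
            start = Math.Max(0, article.Length - length);
        }

        // The article itself may be shorter than `length`.
        return article.Substring(start, Math.Min(length, article.Length - start));
    }
}

static class Program
{
    static void Main()
    {
        string article =
            "Rain fell all day. The mayor spoke at noon. Crowds left early.";

        // A short length is used here only to keep the example readable.
        Console.WriteLine(SummaryExtractor.Extract(article, "mayor", 30));

        // Case-insensitive variant.
        Console.WriteLine(SummaryExtractor.Extract(
            article, "MAYOR", 30, StringComparison.OrdinalIgnoreCase));
    }
}
```

Working with indexes instead of a regex keeps the start of the summary under your control, which is what the regex attempt was missing. If abbreviations or decimal numbers appear in the article, the '.' heuristic will split sentences incorrectly and would need to be replaced with a more careful sentence detector.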