Consider this text:
Paragraph 1: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

Paragraph 2 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.



Paragraph 3 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
When I read the above text in ObjC, there are two newlines (\n\n) between paragraph 1 and paragraph 2, but more than three (\n\n\n\n) between paragraph 2 and paragraph 3.
I want an NSRegularExpression pattern that reads and returns those paragraphs, regardless of how many newlines separate them. This is what I tried:
NSString *pattern = @"\n(*\n)\n";
NSRegularExpression* regex1 = [[NSRegularExpression alloc] initWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:nil];
NSArray *array = [regex1 matchesInString:p options:0 range:NSMakeRange(0, [p length])];
for(NSTextCheckingResult *tcr in array){
    NSTextCheckingResult *tcr = [regex1 firstMatchInString:p options:0 range:NSMakeRange(0, p.length)];
    NSRange matchRange = [tcr rangeAtIndex:1];
    NSString *amatch = [p substringWithRange:matchRange];
    NSLog(@"Found string: %@", amatch);
}
I'm new to NSRegularExpression, so a pointer to a good tutorial would be great. Also, is this the right way to approach the problem described above?
The following does the job. I also used enumerateMatchesInString to find matches. This returns not only the strings between two newline characters (ignoring any extra whitespace between the returns), but also the first one (i.e. between the beginning of the string and the first sequence of two newlines) and the last one (i.e. between the last sequence of two newlines and the end of the string).
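A minimal sketch of that approach, assuming the text is held in an NSString. The pattern and the helper name ParagraphsInString are illustrative assumptions, not necessarily the code originally posted. The idea is to match each paragraph directly (start at a non-whitespace character, stop before the next blank-line separator or the end of the string) rather than to match the newlines themselves.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name chosen for this example): returns every paragraph
// in `text`, treating two or more newlines, with optional whitespace between
// them, as a paragraph separator.
static NSArray<NSString *> *ParagraphsInString(NSString *text) {
    // (?s)    lets "." match newlines, so a paragraph may wrap over single line breaks
    // \S.*?   starts at a non-whitespace character and extends lazily
    // (?=...) stops before a blank-line separator, or before trailing
    //         whitespace at the very end of the string (\z)
    NSString *pattern = @"(?s)\\S.*?(?=\\s*\\n\\s*\\n|\\s*\\z)";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // Report a malformed pattern instead of silently passing error:nil.
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *paragraphs = [NSMutableArray array];
    [regex enumerateMatchesInString:text
                            options:0
                              range:NSMakeRange(0, text.length)
                         usingBlock:^(NSTextCheckingResult *match,
                                      NSMatchingFlags flags,
                                      BOOL *stop) {
        if (match != nil) {
            // The whole match is the paragraph, so no capture group is needed.
            [paragraphs addObject:[text substringWithRange:match.range]];
        }
    }];
    return paragraphs;
}
```

Calling ParagraphsInString(p) on the sample text should give three strings, one per paragraph, however many newlines sit between them. Because the pattern matches the paragraphs rather than the separators, the first and last paragraphs need no special handling.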