I’m using NSXMLParser to parse an RSS feed, but I’m getting some strange behavior that I believe I’ve narrowed down to stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet].
If I have a sentence like this:
Hello, my name is “Sonny.”
It will end up getting displayed like this:
Hello, my name is“Sonny.”
Here is my foundCharacters method:
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if(!currentNodeContent)
        currentNodeContent = [[NSMutableString alloc] initWithString:string];
    else
    {
        [currentNodeContent appendString:string];
        NSString *trimmedString = currentNodeContent;
        trimmedString = [trimmedString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [currentNodeContent setString:trimmedString];
    }
}
I tried changing whitespaceAndNewlineCharacterSet to newlineCharacterSet, which fixed the problem but caused all kinds of unwanted whitespace and carriage returns to show up. Any thoughts on why this is happening and what I can do to fix it?
UPDATE
I updated my code based on Dirk’s answer below, and this seems to have done the trick nicely.
- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementname namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementname isEqualToString:@"item"])
    {
        [comments addObject:currentComment];
        currentComment = nil;
    }
    NSString *trimmedString = [tempString stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    [currentNodeContent setString:trimmedString];
    tempString = nil;
    currentNodeContent = nil;
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    if(!currentNodeContent) {
        currentNodeContent = [[NSMutableString alloc] init];
        tempString = [[NSMutableString alloc] init];
    }
    // Append every chunk, including the first one, so no text is dropped.
    // Trimming happens once, in didEndElement.
    [tempString appendString:string];
}
In a situation like this:
<element>Some Content</element>
you should not rely on receiving exactly the following sequence of events:
startElement “element”
characterData “Some Content”
endElement “element”
It could just as well be (depending on internals of the parser, such as buffer size):
startElement “element”
characterData “So”
characterData “me Cont”
characterData “ent”
endElement “element”
To be safe, you should simply store the characters received until the end-of-element event is seen, and only then apply the trimming operation to the result.
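As a rough illustration of that advice, here is a minimal, self-contained sketch of a delegate that buffers every chunk and trims only once, at the end of the element. The class name TitleCollector, its buffer and titles properties, and the sample XML are all made up for this example and are not taken from the question's code. It also assumes leaf elements with plain text; mixed content would need a stack of buffers rather than a single one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration: collects the trimmed text of each <title>.
@interface TitleCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableString *buffer;            // raw chunks of the current element
@property (nonatomic, strong) NSMutableArray<NSString *> *titles; // trimmed results
@end

@implementation TitleCollector

- (instancetype)init {
    if ((self = [super init])) {
        _titles = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // Start a fresh buffer for each element (assumes leaf elements only).
    self.buffer = [NSMutableString string];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element: append, but do not trim here.
    [self.buffer appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"title"]) {
        // The whole text is available now, so trimming only touches the two ends.
        NSString *trimmed = [self.buffer stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        [self.titles addObject:trimmed];
    }
    self.buffer = nil;
}

@end

int main(void) {
    @autoreleasepool {
        // Sample input (an assumption): entities such as &quot; are a common
        // reason for the text to arrive in several foundCharacters calls.
        NSString *xml = @"<item><title>\n  Hello, my name is &quot;Sonny.&quot;\n</title></item>";
        NSXMLParser *parser = [[NSXMLParser alloc]
            initWithData:[xml dataUsingEncoding:NSUTF8StringEncoding]];
        TitleCollector *collector = [[TitleCollector alloc] init];
        parser.delegate = collector;
        if ([parser parse]) {
            NSLog(@"%@", collector.titles.firstObject);
        }
    }
    return 0;
}
```

Because trimming is applied to the complete string rather than after each chunk, a space that happens to fall at the end of one chunk (for example, just before the quote) is no longer stripped.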
From the NSXMLParser documentation: