I have this code:
- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock
{
    NSString *someString = [[NSString alloc] initWithData:CDATABlock encoding:NSUTF8StringEncoding];
    someString = [ someString stringByReplacingOccurrencesOfString:@"%" withString: @"&" ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"|" withString: @"|" ];
    someString = [ someString stringByReplacingOccurrencesOfString:@" " withString: @" " ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"–" withString:@"-"];
    someString = [ someString stringByReplacingOccurrencesOfString:@"—" withString:@"——"];
    someString = [ someString stringByReplacingOccurrencesOfString:@"‘" withString:@"'" ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"’" withString:@"'" ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"‚" withString:@"," ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"“" withString:@"\"" ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"”" withString:@"\"" ];
    someString = [ someString stringByReplacingOccurrencesOfString:@"…" withString:@"..."];
    someString = [ someString stringByReplacingOccurrencesOfString:@"&" withString:@"<"];
    someString = [ someString stringByReplacingOccurrencesOfString:@"'" withString:@">"];
    someString = [ someString stringByReplacingOccurrencesOfString:@"€" withString:@"€"];
    someString = [ someString stringByReplacingOccurrencesOfString:@"→" withString:@"→"];
    if(nil != self.currentItemValue){
        [self.currentItemValue appendString:someString];
    }
}
Is there a function that does this character conversion automatically?
There’s a better way than hardcoding each replacement like that.
These entities are of the form &# + decimal number + ;. The decimal number is the base-10 value of that character’s Unicode code point, so you could search for substrings in this format, extract the number, and convert it to a character directly. Here’s one way to do it, using RegexKitLite to find the strings:
On my machine, this logs:
This isn’t the most efficient method (in the worst case it’s an O(nm) algorithm), but it’s a start. 🙂
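For readers without RegexKitLite, the same idea (find each &#NNN; substring, extract the number, convert it to a character) can be sketched with Foundation's NSRegularExpression. This is not the code from the answer above, only a minimal illustration. The function name DecodeDecimalEntities is hypothetical, and the sketch assumes ARC and handles decimal entities only, not named ones like &amp;amp; or hexadecimal ones.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of any library): replaces every decimal
// numeric entity such as &#8211; with the character at that code point.
// Assumes ARC; decimal entities only.
static NSString *DecodeDecimalEntities(NSString *input)
{
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]+);"
                                                  options:0
                                                    error:NULL];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, input.length)];
    NSMutableString *result = [input mutableCopy];

    // Walk the matches back to front so the ranges of earlier matches
    // stay valid while the string is being edited.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *digits = [input substringWithRange:[match rangeAtIndex:1]];
        long long value = [digits longLongValue];

        // Leave the entity untouched if it is not a valid Unicode scalar value.
        if (value <= 0 || value > 0x10FFFF ||
            (value >= 0xD800 && value <= 0xDFFF)) {
            continue;
        }

        // Build a one-character string from the code point via UTF-32.
        uint32_t codePoint = CFSwapInt32HostToLittle((uint32_t)value);
        NSString *character =
            [[NSString alloc] initWithBytes:&codePoint
                                     length:sizeof(codePoint)
                                   encoding:NSUTF32LittleEndianStringEncoding];
        if (character == nil) {
            continue;
        }
        [result replaceCharactersInRange:match.range withString:character];
    }
    return result;
}
```

In the parser callback from the question, this would replace the whole chain of stringByReplacingOccurrencesOfString: calls with a single call such as someString = DecodeDecimalEntities(someString);. Because all matches are located in the original string before any replacement happens, decoded output is never scanned again, so text like &#38;#8211; becomes &#8211; rather than being decoded twice.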