I am looking for the most efficient way to match an NSString against about 200,000 other strings in the minimum amount of time. Basically I need to check, every once in a while, whether or not a word that is inputted is part of the English language.
What are the ways I could do this? I’ve heard about hashtables– is that the best way?
Here is the code I settled on:
EDIT:
benchmarks for dictionary initialization to memory:
NSDate *start = [NSDate date];
NSLog(@"initWords");
//temporary
NSString *_filePath = [[NSBundle mainBundle] pathForResource:kFILE_NAME_FOR_DICTIONARY ofType:@"txt"];
NSLog(@"%@",_filePath);
NSString *_fileContents = [[NSString alloc] init];
NSData *_binary = [NSData dataWithContentsOfFile:_filePath];
if (_binary) {
_fileContents = [[NSString alloc] initWithData:_binary encoding:NSUTF8StringEncoding];
} else {
NSLog(@"file parse error: did you forget to add the file? Please add file to:\n\n\n\n%@\n\n\n\n",[[NSBundle mainBundle] resourcePath]);
}
NSArray *_wordList = [_fileContents componentsSeparatedByString:kNEW_LINE_CHAR];
englishDictionary = [[NSMutableSet alloc] init];
[englishDictionary addObjectsFromArray:_wordList];
NSLog(@"Word count:\t%d",englishDictionary.count);
NSLog(@"Time to init dictionary:\t%f",[start timeIntervalSinceNow]*-1.);
iphone 5: 1.089725 (seconds)
ipad 1: 3.082753
iphone 4: 3.582853
Benchmark (time to test word if it is in the dictionary or not):
-(BOOL)checkWord:(NSString *)word{
NSDate *start = [NSDate date];
BOOL yesNoMaybeSo = [englishDictionary containsObject:word];
NSLog(@"Time to check word:\t%f",[start timeIntervalSinceNow]*-1.);
return yesNoMaybeSo;
}
iphone 5: 0.000021 (seconds)
ipad 1: 0.000037
iphone 4: 0.000043
The most efficient way is probably to use a
NSSetto store all the words you want to compare against your string.Then you simply check whether your word belongs to the set like follows
This will take constant time to perform.