Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 4265850
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 21, 20262026-05-21T06:40:49+00:00 2026-05-21T06:40:49+00:00

I got interested in this small example of an algorithm in Python for looping

  • 0

I got interested in this small example of an algorithm in Python for looping through a large word list. I am writing a few “tools” that will allow my to slice a Objective-C string or array in a similar fashion as Python.

Specifically, this elegant solution caught my attention for executing very quickly and it uses a string slice as a key element of the algorithm. Try and solve this without a slice!

I have reproduced my local version using the Moby word list below. You can use /usr/share/dict/words if you do not feel like downloading Moby. The source is just a large dictionary-like list of unique words.

#!/usr/bin/env python

count=0
words = set(line.strip() for line in   
           open("/Users/andrew/Downloads/Moby/mwords/354984si.ngl"))
for w in words:
    even, odd = w[::2], w[1::2]
    if even in words and odd in words:
        count+=1

print count      

This script will a) be interpreted by Python; b) read the 4.1 MB, 354,983 word Moby dictionary file; c) strip the lines; d) place the lines into a set, and; e) and find all the combinations where the evens and the odds of a given word are also words. This executes in about 0.73 seconds on a MacBook Pro.

I tried to rewrite the same program in Objective-C. I am a beginner at this language, so go easy please, but please do point out the errors.

#import <Foundation/Foundation.h>

NSString *sliceString(NSString *inString, NSUInteger start, NSUInteger stop, 
        NSUInteger step){
    NSUInteger strLength = [inString length];

    if(stop > strLength) {
        stop = strLength;
    }

    if(start > strLength) {
        start = strLength;
    }

    NSUInteger capacity = (stop-start)/step;
    NSMutableString *rtr=[NSMutableString stringWithCapacity:capacity];    

    for(NSUInteger i=start; i < stop; i+=step){
        [rtr appendFormat:@"%c",[inString characterAtIndex:i]];
    }
    return rtr;
}

NSSet * getDictWords(NSString *path){

    NSError *error = nil;
    NSString *words = [[NSString alloc] initWithContentsOfFile:path
                         encoding:NSUTF8StringEncoding error:&error];
    NSCharacterSet *sep=[NSCharacterSet newlineCharacterSet];
    NSPredicate *noEmptyStrings = 
                     [NSPredicate predicateWithFormat:@"SELF != ''"];

    if (words == nil) {
        // deal with error ...
    }
    // ...

    NSArray *temp=[words componentsSeparatedByCharactersInSet:sep];
    NSArray *lines = 
        [temp filteredArrayUsingPredicate:noEmptyStrings];

    NSSet *rtr=[NSSet setWithArray:lines];

    NSLog(@"lines: %lul, word set: %lul",[lines count],[rtr count]);
    [words release];

    return rtr;    
}

int main (int argc, const char * argv[])
{
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

    int count=0;

    NSSet *dict = 
       getDictWords(@"/Users/andrew/Downloads/Moby/mwords/354984si.ngl");

    NSLog(@"Start");

    for(NSString *element in dict){
        NSString *odd_char=sliceString(element, 1,[element length], 2);
        NSString *even_char=sliceString(element, 0, [element length], 2);
        if([dict member:even_char] && [dict member:odd_char]){
            count++;
        }

    }    
    NSLog(@"count=%i",count);

    [pool drain];
    return 0;
}

The Objective-C version produces the same result, (13,341 words), but takes almost 3 seconds to do it. I must be doing something atrociously wrong for a compiled language to be more than 3X slower than a scripted language, but I’ll be darned if I can see why.

The basic algorithm is the same: read the lines, strip them, and put them in a set.

My guess of what is slow is the processing of the NSString elements, but I do not know an alternative.

Edit

I edited the Python to be this:

#!/usr/bin/env python
import codecs
count=0
words = set(line.strip() for line in 
     codecs.open("/Users/andrew/Downloads/Moby/mwords/354984si.ngl",
          encoding='utf-8'))
for w in words:
    if w[::2] in words and w[1::2] in words:
        count+=1

print count 

For the utf-8 to be on the same plane as the utf-8 NSString. This slowed the Python down to 1.9 secs.

I also switch the slice test to short-circuit type as suggested for both the Python and obj-c version. Now they are close to the same speed. I also tried using C arrays rather than NSStrings, and this was much faster, but not as easy. You also loose utf-8 support doing that.

Python is really cool…

Edit 2

I found a bottleneck that sped things up considerably. Instead of using the [rtr appendFormat:@"%c",[inString characterAtIndex:i]]; method to append a character to the return string, I used this:

for(NSUInteger i=start; i < stop; i+=step){
    buf[0]=[inString characterAtIndex:i];
    [rtr appendString:[NSString stringWithCharacters:buf length:1]];
}

Now I can finally claim that the Objective-C version is faster than the Python version — but not by much.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-21T06:40:49+00:00Added an answer on May 21, 2026 at 6:40 am

    Keep in mind that the Python version has been written to move a lot of the heavy lifting down into highly optimised C code when executed on CPython (especially the file input buffering, string slicing and the hash table lookups to check whether even and odd are in words).

    That said, you seem to be decoding the file as UTF-8 in your Objective-C code, but leaving the file in binary in your Python code. Using Unicode NSString in the Objective-C version, but 8-bit byte strings in the Python version isn’t really a fair comparison – I would expect the performance of the Python version to drop noticeably if you used codecs.open() to open the file with a declared encoding of "utf-8".

    You’re also making a full second pass to strip the empty lines in your Objective-C, while no such step is present in the Python code.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

For a small project I'm working on I need to cycle through a list.
I already know some Python and got interested in extending Blender using Python scripts.
I saw this question and it got me interested as surface caused a big
So I've got a large amount of SQL data that looks basically like this:
I implemented a Lucene search solution awhile back, and it got me interested in
got a new blog at wordpress few days ago ( http://ghads.wordpress.com ) and I
I got this error today when trying to open a Visual Studio 2008 project
I've got a list of objects (probably not more than 100), where each object
The small software team I work on recently got approved to upgrade to Visual
I've got a Rails application in which a small number of actions require significant

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.