Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 6349337
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 24, 20262026-05-24T21:35:45+00:00 2026-05-24T21:35:45+00:00

(I’ve edited this for clarity, and changed the actual question a bit based on

  • 0

(I’ve edited this for clarity, and changed the actual question a bit based on EOL’s answer)
I’m trying to translate the following function in C to Python but failing miserably (see C code below). As I understand it, it takes four 1-byte chars starting from the memory location pointed to by from, treats them as unsigned long ints in order to give each one 4 bytes of space, and does some bitshifting to arrange them as a big-endian 32-bit integer. It’s then used in an algorithm of checking file validity. (from the Treaty of Babel)

static int32 read_alan_int(unsigned char *from)
{
 return ((unsigned long int) from[3])| ((unsigned long int)from[2] << 8) |
       ((unsigned long int) from[1]<<16)| ((unsigned long int)from[0] << 24);
}
/*
  The claim algorithm for Alan files is:
   * For Alan 3, check for the magic word
   * load the file length in blocks
   * check that the file length is correct
   * For alan 2, each word between byte address 24 and 81 is a
      word address within the file, so check that they're all within
      the file
   * Locate the checksum and verify that it is correct
*/
static int32 claim_story_file(void *story_file, int32 extent)
{
 unsigned char *sf = (unsigned char *) story_file;
 int32 bf, i, crc=0;
 if (extent < 160) return INVALID_STORY_FILE_RV;
 if (memcmp(sf,"ALAN",4))
 { /* Identify Alan 2.x */
 bf=read_alan_int(sf+4);
 if (bf > extent/4) return INVALID_STORY_FILE_RV;
 for (i=24;i<81;i+=4)
 if (read_alan_int(sf+i) > extent/4) return INVALID_STORY_FILE_RV;
 for (i=160;i<(bf*4);i++)
 crc+=sf[i];
 if (crc!=read_alan_int(sf+152)) return INVALID_STORY_FILE_RV;
 return VALID_STORY_FILE_RV;
 }
 else
 { /* Identify Alan 3 */
   bf=read_alan_int(sf+12);
   if (bf > (extent/4)) return INVALID_STORY_FILE_RV;
   for (i=184;i<(bf*4);i++)
    crc+=sf[i];
 if (crc!=read_alan_int(sf+176)) return INVALID_STORY_FILE_RV;

 }
 return INVALID_STORY_FILE_RV;
}

I’m trying to reimplement this in Python. For implementing the read_alan_int function, I would think that importing struct and doing struct.unpack_from('>L', data, offset) would work. However, on valid files, this always returns 24 for the value bf, which means that the for loop is skipped.

def read_alan_int(file_buffer, i):
    i0 = ord(file_buffer[i]) * (2 ** 24)
    i1 = ord(file_buffer[i + 1]) * (2 ** 16)
    i2 = ord(file_buffer[i + 2]) * (2 ** 8)
    i3 = ord(file_buffer[i + 3])
    return i0 + i1 + i2 + i3

def is_a(file_buffer):
    crc = 0
    if len(file_buffer) < 160:
        return False
    if file_buffer[0:4] == 'ALAN':
        # Identify Alan 2.x
        bf = read_alan_int(file_buffer, 4)
        if bf > len(file_buffer)/4:
            return False
        for i in range(24, 81, 4):
            if read_alan_int(file_buffer, i) > len(file_buffer)/4:
                return False
        for i in range(160, bf * 4):
            crc += ord(file_buffer[i])
        if crc != read_alan_int(file_buffer, 152):
            return False
        return True
    else:
        # Identify Alan 3.x
        #bf = read_long(file_buffer, 12, '>')
        bf = read_alan_int(file_buffer, 12)
        print bf
        if bf > len(file_buffer)/4:
            return False
        for i in range(184, bf * 4):
            crc += ord(file_buffer[i])
        if crc != read_alan_int(file_buffer, 176):
            return False
        return True
    return False


if __name__ == '__main__':
    import sys, struct
    data = open(sys.argv[1], 'rb').read()
    print is_a(data)

…but the damn thing still returns 24. Unfortunately, my C skills are non-existent so I’m having trouble getting the original program to print some debug output so I can know what bf is supposed to be.

What am I doing wrong?


Ok, so I’m apparently doing read_alan_int correctly. However, what’s failing for me is the check that the first 4 characters are “ALAN”. All of my test files fail this test. I’ve changed the code to remove this if/else statement and to instead just take advantage of early returns, and now all of my unit tests pass. So, on a practical level, I’m done. However, I’ll keep the question open to address the new problem: how can I possibly wrangle the bits to get “ALAN” out of the first 4 chars?

def is_a(file_buffer):
    crc = 0
    if len(file_buffer) < 160:
        return False
    #if file_buffer.startswith('ALAN'):
        # Identify Alan 2.x
    bf = read_long(file_buffer, 4)
    if bf > len(file_buffer)/4:
        return False
    for i in range(24, 81, 4):
        if read_long(file_buffer, i) > len(file_buffer)/4:
            return False
    for i in range(160, bf * 4):
        crc += ord(file_buffer[i])
    if crc == read_long(file_buffer, 152):
        return True
    # Identify Alan 3.x
    crc = 0
    bf = read_long(file_buffer, 12)
    if bf > len(file_buffer)/4:
        return False
    for i in range(184, bf * 4):
        crc += ord(file_buffer[i])
    if crc == read_long(file_buffer, 176):
        return True
    return False
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-24T21:35:46+00:00Added an answer on May 24, 2026 at 9:35 pm

    Ah, I think I’ve got it. Note that the description says

    /*
      The claim algorithm for Alan files is:
       * For Alan 3, check for the magic word
       * load the file length in blocks
       * check that the file length is correct
       * For alan 2, each word between byte address 24 and 81 is a
          word address within the file, so check that they're all within
          the file
       * Locate the checksum and verify that it is correct
    */
    

    which I read as saying that there’s a magic word in Alan 3, but not in Alan 2. However, your code goes the other way, even though the C code only assumes that the ALAN exists for Alan 3 files.

    Why? Because you don’t speak C, so you guessed — naturally enough! — that memcmp would return (the equivalent of a Python) True if the first four characters of sf and “ALAN” are equal.. but it doesn’t. memcmp returns 0 if the contents are equal, and nonzero if they differ.

    And that seems to be the way it works:

    >>> import urllib2
    >>> 
    >>> alan2 = urllib2.urlopen("http://ifarchive.plover.net/if-archive/games/competition2001/alan/chasing/chasing.acd").read(4)
    >>> alan3 = urllib2.urlopen("http://mirror.ifarchive.org/if-archive/games/competition2006/alan/enterthedark/EnterTheDark.a3c").read(4)
    >>> 
    >>> alan2
    '\x02\x08\x01\x00'
    >>> alan3
    'ALAN'
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I'm parsing an RSS feed that has an &#8217; in it. SimpleXML turns this
link Im having trouble converting the html entites into html characters, (&# 8217;) i
Does anyone know how can I replace this 2 symbol below from the string
this is what i have right now Drawing an RSS feed into the php,
I'm trying to decode HTML entries from here NYTimes.com and I cannot figure out
I ran into a problem. Wrote the following code snippet: teksti = teksti.Trim() teksti
I have this code: - (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock { NSString *someString = [[NSString
I am trying to loop through a bunch of documents I have to put
I have some data like this: 1 2 3 4 5 9 2 6
We are using XSLT to translate a RIXML file to XML. Our RIXML contains

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.