I’m reading a binary file made up of records that in C would look like this:
    typedef struct _rec_t
    {
        char text[20];
        unsigned char index[3];
    } rec_t;
Now I’m able to parse this into a tuple with 23 distinct values, but would prefer if I could use namedtuple to combine the first 20 bytes into text and the three remaining bytes into index. How can I achieve that? Basically instead of one tuple of 23 values I’d prefer to have two tuples of 20 and 3 values respectively and access these using a “natural name”, i.e. by means of namedtuple.
I am currently using the format "20c3B" for struct.unpack_from().
Note: There are many consecutive records in the string when I call parse_text.
My code (stripped down to the relevant parts):
    #!/usr/bin/env python
    import sys
    import os
    import struct
    from collections import namedtuple

    def parse_text(data):
        fmt = "20c3B"
        l = len(data)
        sz = struct.calcsize(fmt)
        num = l/sz
        if not num:
            print "ERROR: no records found."
            return
        print "Size of record %d - number %d" % (sz, num)
        #rec = namedtuple('rec', 'text index')
        empty = struct.unpack_from(fmt, data)
        # Loop through elements
        # ...

    def main():
        if len(sys.argv) < 2:
            print "ERROR: need to give file with texts as argument."
            sys.exit(1)
        s = os.path.getsize(sys.argv[1])
        f = open(sys.argv[1])
        try:
            data = f.read(s)
            parse_text(data)
        finally:
            f.close()

    if __name__ == "__main__":
        main()
According to the docs: http://docs.python.org/library/struct.html
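The struct docs note that unpacked fields can be named by assigning them to variables or by wrapping the result in a named tuple. A small sketch of that pattern, roughly following the example given there and written for Python 3 (so the record is a bytes literal):

```python
import struct
from collections import namedtuple

# Sample record: a 10-byte name, two unsigned shorts and one signed byte,
# little-endian with no padding.
record = b'raymond   \x32\x12\x08\x01\x08'

Student = namedtuple('Student', 'name serialnum school gradelevel')

# _make() builds the namedtuple from any iterable, here the unpacked tuple.
student = Student._make(struct.unpack('<10sHHb', record))
print(student)
```

This works directly because the format yields exactly one value per named field.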
so in your case
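A minimal sketch of how that could look for the record layout in the question, assuming Python 3 (so `data` is a bytes object). The names `Rec` and `parse_records` are made up for illustration and are not from the question. Each record is unpacked with the original "20c3B" format, and the 23-value result is sliced into two tuples:

```python
import struct
from collections import namedtuple

FMT = "20c3B"                      # format string from the question
REC_SIZE = struct.calcsize(FMT)    # 23 bytes; c and B fields need no padding

# Hypothetical names, chosen for illustration only.
Rec = namedtuple("Rec", "text index")

def parse_records(data):
    """Return a list of Rec tuples, one per complete record in data."""
    records = []
    for offset in range(0, len(data) - REC_SIZE + 1, REC_SIZE):
        values = struct.unpack_from(FMT, data, offset)
        # First 20 values are the text bytes, last 3 are the index bytes.
        records.append(Rec(text=values[:20], index=values[20:]))
    return records

# Two made-up records for demonstration.
sample = b"A" * 20 + bytes([1, 2, 3]) + b"B" * 20 + bytes([4, 5, 6])
recs = parse_records(sample)
print(recs[0].index)
print(recs[1].text[0])
```

With "20c", `text` ends up as a tuple of twenty one-byte values rather than a single string, which is why the slicing is needed.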
Slicing the unpacked values may be a problem. If the format were fmt = "20si" or something similar, where a field isn’t returned as a sequence of individual bytes, we wouldn’t need to do this.
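To illustrate that point, here is a sketch using the format "20s3s", which is an assumption on my part (the answer only mentions "20si"). Each field then comes back as a single bytes object, so the unpacked tuple already has exactly two items and can go straight into `_make`. It assumes Python 3.4 or later for `iter_unpack`, and the names `Rec` and `iter_records` are illustrative only:

```python
import struct
from collections import namedtuple

Rec = namedtuple("Rec", "text index")   # illustrative name
rec_struct = struct.Struct("20s3s")     # two fields: 20 bytes + 3 bytes

def iter_records(data):
    """Yield one Rec per record; len(data) must be a multiple of 23."""
    for fields in rec_struct.iter_unpack(data):
        yield Rec._make(fields)

# One made-up record: NUL-padded text followed by three index bytes.
sample = b"hello".ljust(20, b"\x00") + bytes([1, 2, 3])
rec = next(iter_records(sample))
print(rec.text.rstrip(b"\x00"))
print(tuple(rec.index))
```

The trade-off is that `index` is a 3-byte bytes object here instead of a tuple of integers; under Python 3, indexing it or calling `tuple()` on it gives the integer values.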