The Archive Base Latest Questions

Editorial Team
Asked: June 8, 2026 (2026-06-08T05:46:26+00:00)

I have a particularly nasty fixed-width file to work with. It’s not encoded in the formats I thought it would be. In a nutshell, I’m trying to do a variety of things:

  1. Skip all the whitespace before RPLY01.
  2. Remove those wacky \x00* characters. I’ve tried stripping them, but the character marked by the asterisk (*), the one right after each \x00, survives everything I have tried.
  3. Split at FDXSPD01.
  4. Split at CHKP01.
  5. Eventually, split each CSV record into a nice list and trap everything else (the RPLY and CHKP records) in a parse-friendly format.

Here is the original output:

�cRPLY01  IREQ    0000011                                                                         
N00    �9FDXSPD01"CASH","","",10219575.34,0.00,0,"000000000000773"
�EFDXSPD01"CAD","CANADA DOLLAR","CU",-14564.52,0.00,0,"000000000000773"    
�PFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",3644.00,0.00,0,"000000000000773"
�QFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",-3641.07,0.00,0,"000000000000773"
�PFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",1457.00,0.00,0,"000000000000773"
�QFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",-1456.43,0.00,0,"000000000000773"
�DFDXSPD01"CD5427","JOHN CD","CD",100000.00,197.95,0,"000000000000773"
�LFDXSPD01"CP5427","COMMERCIAL PAPER","CP",9925000.00,0.00,0,"000000000000773"
�JFDXSPD01"FS5427","JOHNFORSTOCK","FS",81000.00,10000.00,0,"000000000000773"
�FFDXSPD01"FUT5427","JOHNFUTURE","FT",264000.00,0.00,0,"000000000000773"
�BFDXSPD01"JKSTOCK","JK STOCK","S",31500.00,0.00,0,"000000000000773"
�LFDXSPD01"MB5427","JOHN MUNI BOND","M",255000.00,15611.92,0,"000000000000773"
�QFDXSPD01"MBS5427","JOHNMORTGAGEBACKED","G1",996500.00,2916.67,0,"000000000000773"
�EFDXSPD01"OPT5427","JOHNOPTION","O",464000.00,0.00,0,"000000000000773"
�CFDXSPD01"TB5427","TREASURY BILL","TI",0.00,0.00,0,"000000000000773"
�HFDXSPD01"UB5427","JOHN BOND","G",2994000.00,13281.26,0,"000000000000773"
�9FDXSPD01"UNITS","UNITS","S",0.00,0.00,0,"000000000000773"
�CHKP01  N0000000170

Here’s the same output but through repr:

\x00cRPLY01  IREQ    0000011
N00    \x009FDXSPD01"CASH","","",10219575.34,0.00,0,"000000000000773"
\x00EFDXSPD01"CAD","CANADA DOLLAR","CU",-14564.52,0.00,0,"000000000000773"
\x00PFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",3644.00,0.00,0,"000000000000773"
\x00QFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",-3641.07,0.00,0,"000000000000773"
\x00PFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",1457.00,0.00,0,"000000000000773"
\x00QFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",-1456.43,0.00,0,"000000000000773"
\x00DFDXSPD01"CD5427","JOHN CD","CD",100000.00,197.95,0,"000000000000773"
\x00LFDXSPD01"CP5427","COMMERCIAL PAPER","CP",9925000.00,0.00,0,"000000000000773"
\x00JFDXSPD01"FS5427","JOHNFORSTOCK","FS",81000.00,10000.00,0,"000000000000773"
\x00FFDXSPD01"FUT5427","JOHNFUTURE","FT",264000.00,0.00,0,"000000000000773"
\x00BFDXSPD01"JKSTOCK","JK STOCK","S",31500.00,0.00,0,"000000000000773"
\x00LFDXSPD01"MB5427","JOHN MUNI BOND","M",255000.00,15611.92,0,"000000000000773"
\x00QFDXSPD01"MBS5427","JOHNMORTGAGEBACKED","G1",996500.00,2916.67,0,"000000000000773"
\x00EFDXSPD01"OPT5427","JOHNOPTION","O",464000.00,0.00,0,"000000000000773"
\x00CFDXSPD01"TB5427","TREASURY BILL","TI",0.00,0.00,0,"000000000000773"
\x00HFDXSPD01"UB5427","JOHN BOND","G",2994000.00,13281.26,0,"000000000000773"
\x009FDXSPD01"UNITS","UNITS","S",0.00,0.00,0,"000000000000773"
\x00\x13CHKP01  N0000000170'

Here’s my stream of thought thus far:

#!/usr/local/bin/python
import string
import struct
import re
import array

cols = []
splitcols = []

def removeNonAscii(s): return "".join(i for i in s if ord(i) < 128)
rgx = r'[\x00-\x20\x22\x2F\x3A\x3C\x3E\x5C]'

fname2 = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/output.txt', 'r')
fname = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/output.txt', 'r')
ename2 = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/error_output.txt', 'r')
ename = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/error_output.txt', 'r')
losfile = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/error_output.txt', 'r')
losfile2 = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/error_output.txt', 'r')

filename = '/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/output.txt'
filename2 = '/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/error_output.txt'

almost = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/output.txt', 'r')
almost2 = open('/Users/abcd/Documents/SVN/scripts/Python/project/I1DL/output.txt', 'r')

for line in fname.read().split('\t'):
    print repr(line)
    filtered_string = filter(lambda x: x in string.printable, line)
    print "unfilter line: " + line
    print "filtered line: " + filtered_string
    removeNonAscii(line)

print "======================================================================================"

for line in ename.read().split('\t'):
    filtered_string = filter(lambda x: x in string.printable, line)
    print "error unfilter line: " + line
    print "error filtered line: " + filtered_string

print "======================================================================================"

for line in ename2.read().split('\t'):
    for chars in line:
        if chars in string.printable:
            print chars
        else:
            print "!!!!!!!!!!!!!!!"

print "======================================================================================"

#for line in fname2.read().split('\t'):
    #for chars in line:
        #if chars in string.printable:
            #print chars
        #else:
            #print "@@@@@@@@@@@@@@@"

print "======================================================================================"
print "======================================================================================"

testlist = []
testlist2 = []

with open(filename) as f:
    readarray = f.readline()
    for eachitem in readarray[6:]:
        if(all(ord(c) < 127 and c in string.printable for c in eachitem)):
            testlist.append(eachitem)
print repr(''.join(testlist))

with open(filename2) as f:
    readarray2 = f.readline()
    for eachitem in readarray2[6:]:
        testlist2.append(removeNonAscii(eachitem))
print repr(''.join(testlist2))

with open(filename) as f:
    readarray = f.readline()
    print repr(readarray[6:].split(' FDXSPD01'))
    for eachitem in readarray[6:]:
        if(all(ord(c) < 127 and c in string.printable for c in eachitem)):
            testlist.append(eachitem)
print repr(''.join(testlist))
#for eachitem in almost.read():
    #if(all(ord(c) < 127 and c in string.printable for c in eachitem)):
        #testlist.append(eachitem)
#print ''.join(testlist)

#for eachitem in losfile2.read():
    #testlist2.append(removeNonAscii(eachitem))
#print ''.join(testlist2)

Maybe it’s lack of sleep but I can’t seem to find the right answer. Perhaps someone well versed in Python can show me the way.

1 Answer

  1. Editorial Team
    Added an answer on June 8, 2026 at 5:46 am (2026-06-08T05:46:28+00:00)

    Okay, I’m also somewhat asleep, but I think coffee is starting to kick in.

    Going by the repr() I think this just about does what you want, albeit not necessarily efficiently or robustly.

    import csv, re
    
    data="""
    \x00cRPLY01  IREQ    0000011
    N00    \x009FDXSPD01"CASH","","",10219575.34,0.00,0,"000000000000773"
    \x00EFDXSPD01"CAD","CANADA DOLLAR","CU",-14564.52,0.00,0,"000000000000773"
    \x00PFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",3644.00,0.00,0,"000000000000773"
    \x00QFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",-3641.07,0.00,0,"000000000000773"
    \x00PFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",1457.00,0.00,0,"000000000000773"
    \x00QFDXSPD01"CCTUSD","CURRENCY CONTRACT - USD","CC",-1456.43,0.00,0,"000000000000773"
    \x00DFDXSPD01"CD5427","JOHN CD","CD",100000.00,197.95,0,"000000000000773"
    \x00LFDXSPD01"CP5427","COMMERCIAL PAPER","CP",9925000.00,0.00,0,"000000000000773"
    \x00JFDXSPD01"FS5427","JOHNFORSTOCK","FS",81000.00,10000.00,0,"000000000000773"
    \x00FFDXSPD01"FUT5427","JOHNFUTURE","FT",264000.00,0.00,0,"000000000000773"
    \x00BFDXSPD01"JKSTOCK","JK STOCK","S",31500.00,0.00,0,"000000000000773"
    \x00LFDXSPD01"MB5427","JOHN MUNI BOND","M",255000.00,15611.92,0,"000000000000773"
    \x00QFDXSPD01"MBS5427","JOHNMORTGAGEBACKED","G1",996500.00,2916.67,0,"000000000000773"
    \x00EFDXSPD01"OPT5427","JOHNOPTION","O",464000.00,0.00,0,"000000000000773"
    \x00CFDXSPD01"TB5427","TREASURY BILL","TI",0.00,0.00,0,"000000000000773"
    \x00HFDXSPD01"UB5427","JOHN BOND","G",2994000.00,13281.26,0,"000000000000773"
    \x009FDXSPD01"UNITS","UNITS","S",0.00,0.00,0,"000000000000773"
    \x00\x13CHKP01  N0000000170'"""
    
    lines = data.split('\x00')
    for line in lines:
        try:
            # Records that contain a double quote carry CSV fields after the FDXSPD01 tag.
            pos = line.index('"')
            print('ROW:', next(csv.reader([line[pos:]])))
        except ValueError:
            # No quote found: look for a control record (RPLY or CHKP) instead.
            match = re.search('RPLY|CHKP', line)
            if match:
                print('OTH:', line[match.start():].split())
            else:
                print('XXX:', line)
    

    prints

    XXX: 
    
    OTH: ['RPLY01', 'IREQ', '0000011', 'N00']
    ROW: ['CASH', '', '', '10219575.34', '0.00', '0', '000000000000773']
    ROW: ['CAD', 'CANADA DOLLAR', 'CU', '-14564.52', '0.00', '0', '000000000000773']
    ROW: ['CCTUSD', 'CURRENCY CONTRACT - USD', 'CC', '3644.00', '0.00', '0', '000000000000773']
    ...
    ROW: ['TB5427', 'TREASURY BILL', 'TI', '0.00', '0.00', '0', '000000000000773']
    ROW: ['UB5427', 'JOHN BOND', 'G', '2994000.00', '13281.26', '0', '000000000000773']
    ROW: ['UNITS', 'UNITS', 'S', '0.00', '0.00', '0', '000000000000773']
    OTH: ['CHKP01', "N0000000170'"]
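
A further observation, offered as a guess from the sample rather than anything documented: the byte after each \x00 looks like a length. '9' is 0x39 = 57, which is exactly the length of the FDXSPD01"CASH",... record, 'E' is 0x45 = 69 for the CAD record, and \x13 = 19 matches CHKP01  N0000000170. If that holds, the two bytes form a big-endian length prefix, which would explain why the second character never goes away when you strip non-printables, and the file can be walked record by record instead of split on \x00. Below is a minimal Python 3 sketch under that assumption. The names parse_records and frame are hypothetical, and the sample bytes are rebuilt from records shown in the question (with the RPLY padding left out), not read from the real file.

```python
import csv
import struct

TAG = 'FDXSPD01'


def parse_records(raw):
    """Walk a byte string of length-prefixed records.

    Assumption (inferred from the sample, not from a spec): every record
    is preceded by a 2-byte big-endian length.
    Returns (rows, controls): parsed CSV rows and whitespace-split
    control records such as RPLY01 and CHKP01.
    """
    rows, controls = [], []
    offset = 0
    while offset + 2 <= len(raw):
        (length,) = struct.unpack_from('>H', raw, offset)
        offset += 2
        record = raw[offset:offset + length].decode('ascii')
        offset += length
        if record.startswith(TAG):
            # Everything after the tag is one line of quoted CSV.
            rows.append(next(csv.reader([record[len(TAG):]])))
        else:
            controls.append(record.split())
    return rows, controls


def frame(text):
    """Hypothetical helper: prefix text with its 2-byte length (test data only)."""
    payload = text.encode('ascii')
    return struct.pack('>H', len(payload)) + payload


sample = b''.join(frame(t) for t in [
    'RPLY01  IREQ    0000011',
    'FDXSPD01"CASH","","",10219575.34,0.00,0,"000000000000773"',
    'FDXSPD01"CAD","CANADA DOLLAR","CU",-14564.52,0.00,0,"000000000000773"',
    'CHKP01  N0000000170',
])

rows, controls = parse_records(sample)
print(controls)
for row in rows:
    print(row)
```

For the real file, the bytes would come from reading it in binary mode ('rb'); in Python 3, text mode decodes the data and translates newlines, which can throw the offsets off. If the lengths turn out not to line up on the full file, the split-on-\x00 approach above is the safer fallback.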
    