I’ve been at this a while now, and I think it in my best

Question

Asked: June 18, 2026

I’ve been at this a while now, and I think it is in my best interest to ask the experts for advice. I know I’m not writing this the best way possible, and I’ve gone down a rabbit hole and confused myself.

I have a csv. A bunch, actually. That part is not the problem.

The lines at the top of each CSV are not really CSV data, but they contain an important piece of information: the date for which the data is valid. For some kinds of report it is on one line, and for others it is on another.

My data starts on some line down from the top, usually 10 or 11, but I can’t always be certain. I do know that the first column always has the same info (the header of the table of data).

I want to pull the report date from the preceding lines and, for file type A, do stuffA, and for file type B, do stuffB, then write each row out to a new file. I’m having a problem incrementing the row, and I have no idea what I’m doing wrong.

Sample data:

"Attribute ""OPSURVEYLEVEL2_O"" [Category = ""Retail v1""]"
Date exported: 2/16/13
Exported by user: William
Project: 
Classification: Online Retail v1
Report type: Attributes
Date range: from 12/14/12 to 12/14/12
"Filter OpSurvey Level 2(mine):  [ LEVEL:SENTENCE TYPE:KEYWORD {OPSURVEYLEVEL2_O:""gift certificate redemption"", OPSURVEYLEVEL2_O:""combine accounts"", OPSURVEYLEVEL2_O:""cancel account"", OPSURVEYLEVEL2_O:""saved project moved to purchased project"", OPSURVEYLEVEL2_O:""unlock account"", OPSURVEYLEVEL2_O:""affiliate promotions"", OPSURVEYLEVEL2_O:""print to store coupons"", OPSURVEYLEVEL2_O:""disclaimer not clear"", OPSURVEYLEVEL2_O:""prepaid issue"", OPSURVEYLEVEL2_O:""customer wants to use coupons for print to store"", OPSURVEYLEVEL2_O:""customer received someone else's order"", OPSURVEYLEVEL2_O:""hi-res images unavailable"", OPSURVEYLEVEL2_O:""how to re-order"", OPSURVEYLEVEL2_O:""missing items"", OPSURVEYLEVEL2_O:""missing envelopes: print to store"", OPSURVEYLEVEL2_O:""missing envelopes: mail order"", OPSURVEYLEVEL2_O:""group rooms"", OPSURVEYLEVEL2_O:""print to store"", OPSURVEYLEVEL2_O:""print to store coupons"", OPSURVEYLEVEL2_O:""publisher: card not available for print to store"", OPSURVEYLEVEL2_O:publisher}]"
Total: 905
OPSURVEYLEVEL2_O,Distinct Document,% of Document,Sentiment Score
PRINT TO STORE,297,32.82,-0.1
...

Sample Code

#!/usr/bin/python

import csv, os, glob, sys, errno

path = '/path/to/Downloads'
for infile in glob.glob(os.path.join(path,'report_ATTRIBUTE_OP*.csv')):
    if 'OPSURVEYLEVEL2' in infile:
        prime_column = 'ops2'
    elif 'OPSURVEYLEVEL3' in infile:
        prime_column = 'ops3'
    else:
        sys.exit(errno.ENOENT)
    with open(infile, "r") as csvfile:
        reader = csv.reader(csvfile)
        report_date = 'DATE NOT FOUND'
        # import pdb; pdb.set_trace()
        for row in reader:
            foo = 0
            while foo < 1: 
                if row[0][0:].find('OPSURVEYLEVEL') == 0:
                    foo = 1
                if "Date range" in row:
                    report_date = row[0][-8:]
                break
            if foo >= 1:
                if row[0][0:].find('OPSURVEYLEVEL') == 0:
                    break
                if 'ops2' in prime_column:
                    dup_col = row[0]
                    row.insert(0,dup_col)
                    row.append(report_date)
                elif 'ops3' in prime_column:
                    row.append(report_date)
                with open('report_merge.csv', 'a') as outfile:
                    outfile.write(row)
            reader.next()

1 Answer

Editorial Team · Answer 1 · 2026-06-18T23:52:31+00:00

There are two problems that I can see in this code.

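The first is that the code won’t find the date range in row. Here row is a list of fields, so the in operator looks for a field that is exactly equal to "Date range" rather than for a substring. The line: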

if "Date range" in row:

… should be:

if "Date range" in row[0]:
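
To see why the original test never matches, here is a small sketch using one preamble line from the sample data. The names line, in_list, and in_field are only for illustration, and io.StringIO stands in for an open file.

```python
import csv
import io

# One preamble line from the sample report. It contains no commas,
# so csv.reader returns it as a list with a single field.
line = "Date range: from 12/14/12 to 12/14/12\n"
row = next(csv.reader(io.StringIO(line)))

in_list = "Date range" in row       # list membership: compares whole fields
in_field = "Date range" in row[0]   # substring search inside the first field
report_date = row[0][-8:]           # last 8 characters of the first field

print(row, in_list, in_field, report_date)
```

The slice row[0][-8:] relies on the end date always being written as eight characters, as it is in the sample.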

The second is that the code:

if row[0][0:].find('OPSURVEYLEVEL') == 0:
    break

… is breaking out of the for loop at the header line of the data table, before any data row is processed, because the for loop is the closest enclosing loop at that point (the earlier break only leaves the while). I suspect that there was another while in there somewhere in a previous version of this code.
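
The scoping of break can be seen in isolation with a toy loop. The row strings below are placeholders, not the real file contents.

```python
visited = []
for row in ["preamble", "HEADER", "data 1", "data 2"]:
    while True:
        break  # leaves only the while loop; the for loop carries on
    if row == "HEADER":
        break  # no inner loop encloses this, so it ends the for loop
    visited.append(row)

print(visited)
```

The data rows after the header are never visited, which mirrors what happens in the question’s code.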

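The code is simpler with an if statement instead of the while and if, provided foo is set to 0 once, before the loop, rather than on every row. The version below also drops the trailing reader.next(), which is unnecessary because the for loop already advances the reader, and it writes a joined string instead of passing the list itself to outfile.write: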

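    foo = 0  # set once, before the loop; becomes 1 at the table header row
    for row in reader:
        if foo < 1:
            # Still in the preamble: look for the header row and the report date.
            if row[0][0:].find('OPSURVEYLEVEL') == 0:
                foo = 1
            if "Date range" in row[0]:  # Changed this line
                print("found report date")
                report_date = row[0][-8:]
        else:
            print(row)
            # Another header row would mark the end of this table.
            if row[0][0:].find('OPSURVEYLEVEL') == 0:
                break
            if 'ops2' in prime_column:
                dup_col = row[0]
                row.insert(0,dup_col)
                row.append(report_date)
            elif 'ops3' in prime_column:
                row.append(report_date)
            # Join the fields into one line; fields are not quoted here.
            with open('report_merge.csv', 'a') as outfile:
                outfile.write(','.join(row)+'\n')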
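
As a possible variation (not the code above), the same idea can be sketched with csv.writer, which quotes fields that contain commas, and with the output opened once rather than once per row. The function name merge_rows, the constant HEADER_PREFIX, and the in-memory sample are assumptions made for illustration, and this sketch skips blank lines and does not stop at a second header row.

```python
import csv
import io

HEADER_PREFIX = "OPSURVEYLEVEL"  # first cell of the table header, per the question


def merge_rows(reader, writer, prime_column):
    """Copy data rows from reader to writer, appending the report date."""
    report_date = "DATE NOT FOUND"
    in_table = False
    for row in reader:
        if not row:
            continue  # skip blank lines, which csv.reader yields as []
        if not in_table:
            if row[0].startswith("Date range"):
                report_date = row[0][-8:]
            elif row[0].startswith(HEADER_PREFIX):
                in_table = True  # data rows start on the next line
            continue
        if prime_column == "ops2":
            row.insert(0, row[0])  # duplicate the first column, as in the question
        row.append(report_date)
        writer.writerow(row)


# A shortened stand-in for one report file, built from the sample data.
sample = io.StringIO(
    "Report type: Attributes\n"
    "Date range: from 12/14/12 to 12/14/12\n"
    "Total: 905\n"
    "OPSURVEYLEVEL2_O,Distinct Document,% of Document,Sentiment Score\n"
    "PRINT TO STORE,297,32.82,-0.1\n"
)
out = io.StringIO()
merge_rows(csv.reader(sample), csv.writer(out, lineterminator="\n"), "ops2")
print(out.getvalue())
```

With real files, the input and output would be opened with newline='' as the csv module documentation recommends, and the same writer would be reused across all the files matched by glob.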
