I’m trying to process a bunch of csv files in a given directory. Each

Question

Asked: May 21, 2026

I’m trying to process a bunch of csv files in a given directory. Each time I run the script, it goes through each file in the directory (in case I’ve added new ones), and then checks against the database to see if the file has been processed, and if so, what line to start processing from.

Problem is, the script seems to skip any file I have listed in the database table, regardless of what the status is. I’m sure I’m missing something obvious, but can’t quite piece together where my tests are going wrong.

Here’s the structure of the table:

file_processed_id | file_type | file_name | file_line | file_lines_processed | file_lines_skipped | file_status

Here’s the pertinent code:

for filename in os.listdir(path):
    status = check_process_status(filename,conn)
    if status != None:
        if status[7] == 'completed':
            pass
        else:
            start_line = status[3]
            file_to_processed = filename
            break
    else:
        start_line = 0
        file_to_be_processed = filename

And here’s the function checking the db:

def check_process_status(f,conn):

    # retrieve process status of file

    cursor = conn.cursor()

    cursor.execute("""SELECT *
                FROM files_processed
                WHERE file_type = 'faca'
                AND file_name = %s
                """,(f,))

    row = cursor.fetchone()
    if row == None:
        return None # if no entry, returns null
    else:
        return row # returns row information

I’ve tested the db connection and everything, and if the file actually exists in the table, it returns the row information just fine. The thing I don’t get is why it’s skipping to the next file each time I run the script, no matter what the “file_status” field is set to.

Any thoughts?

1 Answer

Editorial Team · Answer 1 · 2026-05-21T14:28:53+00:00

Based on your comments, oughtn’t there be a break statement in the else clause after file_to_be_processed = filename too? Without it, the loop keeps going, and a later file can overwrite the one that was just selected.

Also note that this variable is misnamed file_to_processed a few lines above.
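
To make the two points above concrete, here is a minimal sketch of the selection loop with a single variable name and an early exit in every branch that picks a file. It is written as a function, so `return` plays the role of `break`. The names `pick_next_file` and `lookup` are hypothetical; `lookup` stands in for a call such as `check_process_status(filename, conn)`. `STATUS_INDEX = 6` assumes `file_status` is the seventh column of the table shown in the question.

```python
STATUS_INDEX = 6      # assumption: file_status is the 7th column (zero-based index 6)
START_LINE_INDEX = 3  # file_line, matching status[3] in the question


def pick_next_file(filenames, lookup):
    """Return (filename, start_line) for the first file that still needs work.

    Returns None when every file is already marked 'completed'.
    """
    for filename in filenames:
        status = lookup(filename)
        if status is None:
            # No row in the table: a new file, so start from line 0.
            return filename, 0
        if status[STATUS_INDEX] != 'completed':
            # Partially processed: resume from the recorded line.
            return filename, status[START_LINE_INDEX]
        # Completed: fall through and look at the next file.
    return None
```

Under those assumptions it could be called as `pick_next_file(os.listdir(path), lambda f: check_process_status(f, conn))`.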

Also note that status[7] will probably raise an IndexError, since there only appear to be 7 fields in your table and valid indices therefore run from 0 to 6. I’d guess it should be status[6].
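
One way to avoid this kind of off-by-one is to read columns by name instead of position. The sketch below pairs each value with its column name taken from `cursor.description`, which is part of the Python DB-API. The helper `row_as_dict` is hypothetical, and the demo uses an in-memory SQLite table with made-up data purely so it runs on its own; the question's driver is unknown and uses `%s` placeholders where SQLite uses `?`.

```python
import sqlite3


def row_as_dict(cursor, row):
    """Pair each value in row with its column name from cursor.description."""
    if row is None:
        return None
    names = [col[0] for col in cursor.description]
    return dict(zip(names, row))


# Demo only: an in-memory table mirroring the columns listed in the question.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE files_processed (
    file_processed_id INTEGER PRIMARY KEY,
    file_type TEXT,
    file_name TEXT,
    file_line INTEGER,
    file_lines_processed INTEGER,
    file_lines_skipped INTEGER,
    file_status TEXT)""")
# Made-up sample row for illustration.
conn.execute(
    "INSERT INTO files_processed VALUES (1, 'faca', 'a.csv', 42, 40, 2, 'partial')"
)

cursor = conn.cursor()
cursor.execute(
    "SELECT * FROM files_processed WHERE file_type = 'faca' AND file_name = ?",
    ("a.csv",),
)
status = row_as_dict(cursor, cursor.fetchone())
print(status["file_status"], status["file_line"])
```

With a dictionary, the check becomes `status["file_status"] == 'completed'`, which keeps working if columns are added or reordered.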
