I have some JSON formatted log files that I am copying to S3 so

Question

Asked: May 23, 2026 (2026-05-23T18:38:35+00:00)

I have some JSON formatted log files that I am copying to S3 so I can run Hive queries on them using Elastic Map Reduce. The script I use to copy the log files to S3 is written in Python.

Every once in a while I encounter a file with an incomplete line, typically at the end of the file. This causes any Hive queries that need that file to fail. I’ve been manually fixing the files by removing the bad line, but I’d like to integrate this step into my Python script to prevent these failures.

Here’s an example of the type of file I’m working with:

{"logLine":{"browserName":"FireFox","userAgent":"Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0"}}
{"logLine":{"browserName":"Pre","userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.24 (KHTML, like Gecko; Google Web Preview) Chrome/11.0.696 Safari/534.24"}}
{"logLine":{"browserName":"Internet Explorer","userAgent":"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1

In that case I want to remove the last line since it’s incomplete. I know it’s incomplete because it’s missing the end of line character(s), and also because it’s not valid JSON due to the missing end quote and curly braces.

Is there an easy way to identify and remove that line from the file using Python?

1 Answer

Editorial Team · Answer 1 · 2026-05-23T18:38:36+00:00

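Python has a json module in its standard library. Its parser raises an exception if the input isn't valid JSON (json.JSONDecodeError, which is a subclass of ValueError). To check the last line and drop it if it fails to parse, you could do something like:

import json

with open('log.txt') as file:
    lines = file.readlines()

# An empty file has no last line to check.
if lines:
    try:
        json.loads(lines[-1])
    except ValueError:
        # The last line is not valid JSON: rewrite the file without it.
        with open('log.txt', 'w') as file:
            file.writelines(lines[:-1])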
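
If you want to call this from the copy script, the same idea can be wrapped in a function. The following is only a sketch under a few assumptions: the function name drop_incomplete_last_line is made up for illustration, the logs are assumed to be UTF-8 with one JSON document per line, and "incomplete" is taken to mean "does not parse as JSON" (a final line that parses but lacks a trailing newline is left alone). It truncates the file in place instead of rewriting every line.

```python
import json


def drop_incomplete_last_line(path):
    """Truncate `path` if its last line is not valid JSON.

    Hypothetical helper. Returns True if the file was changed.
    """
    with open(path, "rb+") as f:
        data = f.read()
        # Ignore trailing newline characters so that a complete final
        # line followed by "\n" is still treated as the last line.
        stripped = data.rstrip(b"\r\n")
        if not stripped:
            return False  # empty file, or nothing but blank lines
        # The last line starts right after the previous newline, or at
        # offset 0 when the file holds a single line.
        start = stripped.rfind(b"\n") + 1
        last_line = stripped[start:]
        try:
            json.loads(last_line.decode("utf-8"))
        except ValueError:
            # ValueError covers both json.JSONDecodeError and
            # UnicodeDecodeError. Cut the file off where the bad line begins.
            f.truncate(start)
            return True
        return False  # last line parses, leave the file untouched
```

Calling drop_incomplete_last_line('log.txt') just before the upload step would then clean the file only when needed. Note that this still reads the whole file into memory, like the snippet above; for very large logs you would want to read only the tail of the file.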
