The Archive Base


Editorial Team
Asked: June 11, 2026

Further to this question: Handling and working with binary data HEX with python (and thanks to the awesome pointers I received), I’m stuck on one last aspect of the tool.

I am basically writing a cleaner for files that have data past the EOF marker. This extra data means they fail some validation tools. I need to strip the extra data so the files can be presented to the validator, but I don’t want to throw this data away (in fact I have to keep it…)

I’ve written an XML container to hold the data, along with a few other provenance/audit type values, but I’m (still) stuck on elegantly moving between raw binary and something I can “bake” into a file.

example:

A jpg file ends with (hex editor view)
96 1a 9c fd ab 4f 9e 69 27 ad fd da 0a db 76 bb
ee d2 6a fd ff 00 ff d9 ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff ff ff ff

The EOF marker for jpg is ff d9, so the cleaner works backwards through the file until it finds a match against the EOF marker. In this case it would create a new jpg file stopping at the ff d9 and then attempt to write the stripped data to the XML (via the ElementTree lib): changeString.text = str(excessData)
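The splitting step described above could be sketched roughly as follows. This is not the asker's actual code: the name split_at_eoi is hypothetical, the sketch is written for Python 3 (the traceback further down comes from Python 2.7), and it assumes the last ff d9 in the file is the real end-of-image marker, which would not hold if the excess data itself happened to contain ff d9.

```python
JPEG_EOI = b"\xff\xd9"  # JPEG end-of-image marker, as given in the question


def split_at_eoi(data, marker=JPEG_EOI):
    """Split data just after the LAST occurrence of marker.

    Returns (clean, excess). Assumes the last marker found is the
    genuine end of the image (hypothetical helper, not from the post).
    """
    pos = data.rfind(marker)  # search backwards from the end of the data
    if pos == -1:
        raise ValueError("EOF marker not found")
    end = pos + len(marker)
    return data[:end], data[end:]


# A shortened version of the tail shown in the hex dump above.
tail = bytes.fromhex("ff 00 ff d9 ff ff ff")
clean, excess = split_at_eoi(tail)
print(clean.hex(), excess.hex())
```

Keeping both halves as bytes objects means nothing is lost: clean + excess is always identical to the original data.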

Of course this won’t work, as the XML writer is looking to write ASCII, not binary dumps.

In the above case, the error is UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128), which I can see is because 0xff is not a valid ASCII character.

My question, therefore, is how do I elegantly deal with this raw data so that it can be stored and used in the future? (I plan to write an ‘uncleaner’ next that can take the clean file and the XML and reconstruct the original file…)

______EDIT_______

Using the suggestions from below, this is the traceback:

Traceback (most recent call last):
  File "C:\...\EOF_cleaner\scripts\test6.py", line 87, in <module>
    main()
  File "C:\...\EOF_cleaner\scripts\test6.py", line 73, in main
    splitFile(f_data, offset)
  File "C:\...EOF_cleaner\scripts\test6.py", line 60, in splitFile
    makeXML(excessData)
  File "C:\...\EOF_cleaner\scripts\test6.py", line 53, in makeXML
    ET.ElementTree(root).write(noteFile)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 815, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 934, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 934, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 934, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "c:\python27\lib\xml\etree\ElementTree.py", line 932, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "c:\python27\lib\xml\etree\ElementTree.py", line 1068, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)

The lines involved are changeString.text = excessData.encode('base64') (line 45) and ET.ElementTree(root).write(noteFile) (line 53), which is where the error is raised.


1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 11:52 pm

    Use Base64:

    excessData.encode('base64')

    It’ll be easy to turn that back into binary data later on with a simple .decode('base64') call.

    Base64 encodes to ASCII data safe for inclusion in XML, in a reasonably compact format; every 3 bytes of binary information become 4 Base64 characters.
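    As a rough illustration of the round trip, here is a minimal sketch using the standard base64 module, which is available in both Python 2 and Python 3 (the str.encode('base64') shortcut above only exists in Python 2). The element names cleanerNote and excessData and both helper functions are hypothetical and not taken from the question's code.

```python
import base64
import xml.etree.ElementTree as ET


def excess_to_xml(excess):
    """Wrap raw bytes in a small XML note as Base64 text (hypothetical layout)."""
    root = ET.Element("cleanerNote")
    node = ET.SubElement(root, "excessData", {"encoding": "base64"})
    # b64encode returns ASCII-only bytes; decode them so the XML text is a str.
    node.text = base64.b64encode(excess).decode("ascii")
    return ET.tostring(root)


def excess_from_xml(xml_bytes):
    """Recover the original raw bytes from the XML note."""
    root = ET.fromstring(xml_bytes)
    return base64.b64decode(root.find("excessData").text)


excess = b"\xff" * 6                 # six trailing bytes, as in the hex dump
xml_bytes = excess_to_xml(excess)
restored = excess_from_xml(xml_bytes)
assert restored == excess            # lossless round trip
# 6 input bytes become 8 Base64 characters (3 bytes -> 4 characters).
assert len(base64.b64encode(excess)) == 8
```

    An 'uncleaner' along these lines would only need to append the decoded bytes to the cleaned file's contents to rebuild the original.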

© 2021 The Archive Base. All Rights Reserved