The Archive Base

Editorial Team
Asked: June 9, 2026

I’m trying to figure out the lowest data-overhead way to upload/download binary data to Google AppEngine’s Blobstore from a JavaScript-initiated HTTP request. Ideally, I would like to submit the binary data directly, i.e. as unencoded 8-bit values, perhaps in a POST request that looks something like this:

...
Content-Type: multipart/form-data; boundary=boundary;

--boundary
Content-Disposition: form-data; name="a"; filename="b"
Content-Type: application/octet-stream

@#^%(^Qtr...
--boundary--

Here, @#^%(^Qtr... ideally represents arbitrary 8-bit binary data.

Specifically, I am trying to understand the following:

  • Is it possible to directly upload 8-bit binary data, or would I need to encode the data somehow, like a base-64 MIME encoding?
  • If I use a different encoding, would Blobstore save the data as 8-bit binary internally or in the encoded format? I.e. would a base-64 encoding increase my storage cost by 33%?
  • Along the same lines: Does encoding overhead increase outgoing bandwidth cost?
  • Is there a better way to format the POST request so I don’t need to come up with a boundary that doesn’t appear in my binary data? E.g. is there a way to specify a Content-Length rather than a boundary?
  • In the GET request to retrieve the data, can I simply expect to have binary data end up in the return string, or is the server going to automatically encode the data somehow?
  • If I need to use some encoding, which one would be the best choice among the supported options for essentially random 8-bit data? (base-64, UTF-8, something else?)
1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 12:39 am

    Even though I received the Tumbleweed Badge for this question, let me report on my progress anyway, in case somebody out there does care:

    This question turned out to pose 3 independent problems:

    1. Uploading data to BlobStore efficiently
    2. Making sure BlobStore saves it in the smallest possible format
    3. Finding a way to reliably download the data

    Let’s start with (3), because this ends up posing the biggest issue:

    So far I have not been able to find a way to download true 8-bit data to the browser via XHR. Using MIME types like application/octet-stream leads to only 7 bits reaching the client reliably, unless the data is downloaded to a file. The best solution I found is to use the following MIME type for the data:

    text/plain; charset=ISO-8859-1
    

    This seems to be supported in all browsers that I’ve tested: IE 8, Chrome 21, FF 12.0, Opera 11.61, Safari 5.1.2 under Windows, and Android 2.3.3.

    With this, it is possible to transfer almost any 8-bit value, with the following restrictions/caveats:

    • Character 0x00 is interpreted as the end of the input string in IE8 and must therefore be avoided.
    • Most browsers interpret charset ISO-8859-1 as Windows-1252 instead, leading to characters 0x80 through 0x9F being changed accordingly. This can be fixed, though, as the changes are unambiguous. (see http://en.wikipedia.org/wiki/Windows-1252#Codepage_layout)
    • Characters 0x81, 0x8D, 0x8F, 0x90, 0x9D are reserved in the Windows-1252 charset and Opera returns an error code for these, therefore these need to be avoided as well.
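
    The Windows-1252 remapping described above can be undone with a small lookup table. The following is a minimal Python sketch of that idea, not the code used in the answer; the names REVERSE and text_to_bytes are made up for illustration, and in the browser the same table would be applied to the character codes of the XHR response text.

```python
# Build the reverse table from Python's own cp1252 codec.
# The codec leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined, so
# decoding them raises UnicodeDecodeError and they are skipped.
REVERSE = {}
for byte in range(0x80, 0xA0):
    try:
        REVERSE[ord(bytes([byte]).decode("cp1252"))] = byte
    except UnicodeDecodeError:
        pass


def text_to_bytes(text: str) -> bytes:
    """Map text decoded as Windows-1252 back to the original byte values."""
    out = bytearray()
    for ch in text:
        code = REVERSE.get(ord(ch), ord(ch))
        if code > 0xFF:
            raise ValueError(f"unexpected character U+{code:04X}")
        out.append(code)
    return bytes(out)
```

    Because every remapped character corresponds to exactly one byte, the table lookup is unambiguous, which is what makes the fix possible.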

    Overall, this leaves us with 250 out of the 256 characters which we can use. Re-expressing the data in base 250 instead of base 256 costs a factor of log(256)/log(250) ≈ 1.0043, i.e. an outgoing-data overhead of under 0.5%, which I guess I’m OK with.
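
    One way to picture the basis change is to treat the data as one large integer and write it out in base 250, using only the allowed byte values as digits. The Python sketch below is an illustration under that assumption, not the answer's actual implementation; encode, decode and the 0x01 sentinel byte are hypothetical choices.

```python
import math

# The six byte values the answer says must be avoided.
FORBIDDEN = {0x00, 0x81, 0x8D, 0x8F, 0x90, 0x9D}
ALPHABET = [b for b in range(256) if b not in FORBIDDEN]  # 250 safe values
BASE = len(ALPHABET)
INDEX = {b: i for i, b in enumerate(ALPHABET)}


def encode(data: bytes) -> bytes:
    """Re-express data in base 250 using only the safe byte values."""
    # Prefix a 0x01 sentinel so leading zero bytes survive the round trip.
    n = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while n:
        n, d = divmod(n, BASE)
        digits.append(ALPHABET[d])
    return bytes(reversed(digits))


def decode(encoded: bytes) -> bytes:
    """Inverse of encode()."""
    n = 0
    for b in encoded:
        n = n * BASE + INDEX[b]
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the sentinel


# Theoretical size ratio of base 250 versus base 256.
RATIO = math.log(256) / math.log(250)
```

    Big-integer conversion takes quadratic time in the input length, so a real implementation would more likely convert fixed-size chunks at a slightly higher overhead.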

    So, now to problem (1) and (2):

    As incoming bandwidth is free, I’ve decided to reduce the priority of solving problem (1) in favor of problems (2) and (3). Turns out, using the following POST request does the trick then:

    ...
    Content-Type: multipart/form-data; boundary=-
    
    ---
    Content-Disposition: form-data; name="a"; filename="b"
    Content-Type: text/plain; charset=ISO-8859-1
    Content-Transfer-Encoding: base64
    
    abcd==
    -----
    

    Here, abcd== is the base64-MIME-encoded data consisting of the above described 250 allowed characters (see http://en.wikipedia.org/wiki/Base64#Examples, GAE uses + and / as the last 2 characters). The encoding is necessary (correct me if I’m wrong) as calling the XHR send() function with String data will result in UTF-8 encoding of the string, which screws up the data received by the server. Unfortunately passing ArrayBuffers and Blobs to the send() function isn’t available in all browsers yet to circumvent this issue more elegantly.
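
    For illustration, the request body shown above could be assembled as in the following Python sketch. The function name build_multipart_body and its parameters are hypothetical; the header lines are copied from the request above, and whether an upload endpoint accepts exactly this body rests on the behaviour reported in this answer.

```python
import base64


def build_multipart_body(payload: bytes, field: str = "a",
                         filename: str = "b", boundary: str = "-") -> str:
    """Assemble a multipart/form-data body with one base64-encoded part."""
    encoded = base64.b64encode(payload).decode("ascii")
    lines = [
        "--" + boundary,  # opening delimiter: "--" followed by the boundary
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"',
        "Content-Type: text/plain; charset=ISO-8859-1",
        "Content-Transfer-Encoding: base64",
        "",               # blank line separates part headers from part data
        encoded,
        "--" + boundary + "--",  # closing delimiter
        "",
    ]
    return "\r\n".join(lines)
```

    With the boundary "-", the delimiters become "---" and "-----" as in the request above. The base64 alphabet used here (A-Z, a-z, 0-9, + and /, with = as padding) contains no "-", so the boundary cannot occur inside the encoded data, and the whole body is plain ASCII, which sidesteps the UTF-8 conversion in send().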

    Now the good news: The AppEngine BlobStore decodes this data automatically and correctly and stores it without overhead! Therefore, using the base64 encoding only leads to slower data uploads from the client, but does not result in additional hosting cost (apart from perhaps a few CPU cycles for the decoding).

    Aside: The AppEngine development-server will report the encoded size (i.e. 33% larger) for the stored blob, both in the admin console and in a retrieved BlobInfo record. The production servers do not have this issue, though, and report the correct blob size.

    Conclusion:

    Using Content-Transfer-Encoding base64 for uploading binary data of Content-Type text/plain; charset=ISO-8859-1, which may not contain characters 0x00, 0x81, 0x8D, 0x8F, 0x90, and 0x9D, leads to reliable data transfer for many tested browsers with a storage/outgoing-bandwidth overhead of less than half a percent. The upload-overhead of the base64-encoded data is 33%, which is better than the expected 50% for UTF-8 (for random 8-bit data), but still far from desirable.
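
    The 33% and 50% upload figures can be checked with a few lines of Python. This is only an arithmetic check, assuming every byte value occurs equally often and that each byte is sent as the character with the same code point, as happens when a string is passed to send().

```python
import base64

# 768 bytes in which every value 0x00..0xFF occurs exactly three times.
data = bytes(range(256)) * 3

# base64 turns every 3 input bytes into 4 output characters.
b64_ratio = len(base64.b64encode(data)) / len(data)

# UTF-8 needs 1 byte for code points below 0x80 and 2 bytes for 0x80..0xFF.
utf8_ratio = len(data.decode("latin-1").encode("utf-8")) / len(data)

print(f"base64: {b64_ratio:.3f}x, UTF-8: {utf8_ratio:.3f}x")
```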

    What I don’t know is: Is this the optimal solution, or could one do better? Anyone up for the challenge?
