I am using the Python avro library. I want to send an avro file over http, but I don’t particularly want to save that file to disk first, so I thought I’d use StringIO to house the file contents until I’m ready to send. But avro.datafile.DataFileWriter thoughtfully takes care of closing the file handle for me, which makes it difficult for me to get the data back out of the StringIO. Here’s what I mean in code:
from StringIO import StringIO
from avro.datafile import DataFileWriter
from avro import schema, io
from testdata import BEARER, PUBLISHURL, SERVER, TESTDATA
from httplib2 import Http

HTTP = Http()

##
# Write the message data to a StringIO
#
# @return StringIO
#
def write_data():
    message = TESTDATA
    schema = getSchema()
    datum_writer = io.DatumWriter(schema)
    data = StringIO()
    with DataFileWriter(data, datum_writer, writers_schema=schema, codec='deflate') as datafile_writer:
        datafile_writer.append(message)
    # If I return data inside the with block, the DFW buffer isn't flushed
    # and I may get an incomplete file
    return data

##
# Make the POST and dump its response
#
def main():
    headers = {
        "Content-Type": "avro/binary",
        "Authorization": "Bearer %s" % BEARER,
        "X-XC-SCHEMA-VERSION": "1.0.0",
    }
    body = write_data().getvalue()  # AttributeError: StringIO instance has no attribute 'buf'
    # the StringIO instance returned by write_data() is already closed. :(
    resp, content = HTTP.request(
        uri=PUBLISHURL,
        method='POST',
        body=body,
        headers=headers,
    )
    print resp, content
I do have some workarounds I can use, but none of them are terribly elegant. Is there any way to get the data from the StringIO after it’s closed?
Not really.
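The docs are very clear on this: close() frees the memory buffer, so once DataFileWriter has closed the StringIO its contents are gone.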
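A quick way to see the effect of close() on an in-memory buffer is the sketch below. It uses io.BytesIO rather than the Python 2 StringIO module from the question (an assumption made so the snippet runs on current Python; the older module reports the failure differently, as the AttributeError in the question shows).

```python
from io import BytesIO

buf = BytesIO()
buf.write(b"some bytes")
buf.close()  # the buffer is released here

try:
    buf.getvalue()  # reading after close() is not allowed
    error = None
except ValueError as exc:
    error = exc

print(repr(error))
```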
The cleanest way of doing it would be to inherit from StringIO and override the close method to do nothing, keeping the real close available under another name such as _close(). Then call _close() when you’re ready.
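A minimal sketch of that subclass might look like the following. The class name KeepOpenBytesIO and the helper write_and_close are hypothetical, and io.BytesIO stands in for StringIO on the assumption that the Avro payload is binary; write_and_close only imitates a writer that closes its handle and is not the avro API.

```python
from io import BytesIO


class KeepOpenBytesIO(BytesIO):
    """In-memory buffer whose close() does nothing (hypothetical name)."""

    def close(self):
        # Ignore close() so a writer that closes its handle leaves the data intact.
        pass

    def _close(self):
        # Really release the buffer once the contents have been read.
        BytesIO.close(self)


def write_and_close(handle, payload):
    # Hypothetical stand-in for DataFileWriter: write, then close the handle.
    handle.write(payload)
    handle.close()


buf = KeepOpenBytesIO()
write_and_close(buf, b"avro bytes")
body = buf.getvalue()  # still readable, because close() was a no-op
buf._close()           # now the buffer is actually closed
```

In write_data() from the question, the same idea would mean creating the subclass instance in place of StringIO() and calling _close() only after getvalue() has been read for the request body.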