I’m using the cloudfiles module to upload files to Rackspace Cloud Files, using something like this pseudocode:
import cloudfiles
username = '---'
api_key = '---'
conn = cloudfiles.get_connection(username, api_key)
testcontainer = conn.create_container('test')
for f in get_filenames():
    obj = testcontainer.create_object(f)
    obj.load_from_filename(f)
My problem is that I have a lot of small files to upload, and it takes too long this way.
Buried in the documentation, I see that there is a class ConnectionPool, which supposedly can be used to upload files in parallel.
Could someone please show how I can make this piece of code upload more than one file at a time?
The ConnectionPool class is meant for a multithreaded application that occasionally has to send something to Rackspace. That way you can reuse your connections, but you don’t have to keep 100 connections open if you have 100 threads.
You are simply looking for a multithreading/multiprocessing uploader.
Here’s an example using the multiprocessing library:
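A minimal sketch of such an uploader, reusing only the cloudfiles calls from the question. It uses the thread-backed `ThreadPool` from the multiprocessing package, on the assumption that uploads are I/O-bound; `multiprocessing.Pool` could be swapped in to use processes instead. The names `upload`, `upload_all`, the `workers` and `upload_one` parameters, and the worker count of 8 are illustrative assumptions, not part of cloudfiles. Each worker thread opens its own connection, since sharing one connection across threads is assumed here to be unsafe.

```python
import threading
from multiprocessing.pool import ThreadPool

USERNAME = '---'
API_KEY = '---'
CONTAINER = 'test'

# Holds one connection/container per worker thread (assumption: a single
# cloudfiles connection should not be shared between threads).
_local = threading.local()


def _get_container():
    """Return this thread's container, connecting on first use."""
    if not hasattr(_local, 'container'):
        import cloudfiles  # imported lazily so the sketch loads without it
        conn = cloudfiles.get_connection(USERNAME, API_KEY)
        _local.container = conn.create_container(CONTAINER)
    return _local.container


def upload(filename):
    """Upload a single file, as in the question's loop body."""
    obj = _get_container().create_object(filename)
    obj.load_from_filename(filename)
    return filename


def upload_all(filenames, workers=8, upload_one=upload):
    """Upload files concurrently; results come back in input order."""
    pool = ThreadPool(workers)
    try:
        return pool.map(upload_one, filenames)
    finally:
        pool.close()
        pool.join()
```

With this in place, the loop in the question would become `upload_all(get_filenames())`. The best worker count depends on file sizes and network latency, so it is worth trying a few values.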