I have the following function, which returns a filesize of a file over HTTP:

Question

0

Asked: May 28, 20262026-05-28T15:22:20+00:00 2026-05-28T15:22:20+00:00

I have the following function, which returns a filesize of a file over HTTP:

0

I have the following function, which returns a filesize of a file over HTTP:

def GetFileSize(url):
    " Function gets a url and returns it's filesize in bytes "
    url = url.replace(' ', '%20')
    u = urllib2.urlopen(url)
    meta = u.info()
    file_size = int(meta.getheaders("Content-Length")[0])
    return file_size

I would like to get the biggest file from a given links, and I wrote the following function for it:

def GetBiggestFile(links):
    " Function gets a list of links and returns the biggest file and his size in bytes "
    dic = {}
    for link in links:
        filename = link.split('/')[-1]
        filesize = GetFileSize(link)
        dic[link] = filesize
        print "%s | %.2f MB" % (filename, filesize / 1024.0 / 1024.0)

    biggest_file = max(dic, key=dic.get)
    return biggest_file, dic[biggest_file]

My lists have dozens of links, therefore this scripts takes some time to complete. Using threading I can fetch the different filesizes synchronously and shorten the running time of the code.

I’m not so sure how to do it – I’ve tried using a decorator that makes the function run asynchronously:

def run_async(func):
    " Decorator for running functions asynchronously. "
    from threading import Thread
    from functools import wraps

    @wraps(func)
    def async_func(*args, **kwargs):
        func_hl = Thread(target = func, args = args, kwargs = kwargs)
        func_hl.start()
        return func_hl

    return async_func

But I’m not sure how to make my code wait for all the answers before trying to determine who is the biggest file.

Thanks.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-28T15:22:22+00:00

Editorial Team

2026-05-28T15:22:22+00:00Added an answer on May 28, 2026 at 3:22 pm

You’ll be happier with multiprocessing.

Start with this example: http://docs.python.org/library/multiprocessing.html#using-a-pool-of-workers

Your GetFileSize function can be run in a process pool.

Since each process is separate, you should also have an “output Queue” into which the results are put. A separate process does a simple “get” to retrieve all the answers from the Queue.

0

Reply
Share
Share

- Report

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have the following function, which returns a filesize of a file over HTTP:

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply