I’m creating a Python script which accepts a path to a remote file and a number *n* of threads. The file’s size will be divided by the number of threads, and as each thread completes I want it to append the fetched data to a local file.
How do I manage it so that the threads append to the local file in the order they were generated, so the bytes don’t get scrambled?
Also, what if I want to download several files simultaneously?
You could coordinate the work with locks &c, but I recommend instead using `Queue`: usually the best way to coordinate multithreading (and multiprocessing) in Python.
I would have the main thread spawn as many worker threads as you think appropriate (you may want to calibrate between performance and load on the remote server by experimenting); every worker thread waits at the same global `Queue.Queue` instance, call it `workQ` for example, for “work requests” (`wr = workQ.get()` will do it properly: each work request is obtained by a single worker thread, no fuss, no muss). A “work request” can in this case simply be a triple (tuple with three items): identification of the remote file (URL or whatever), the offset from which to get data, and the number of bytes to get (note that this works just as well for one or for multiple files to fetch).
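A worker’s main loop under this scheme can be sketched as follows (a hedged sketch: `fetch_range` is a hypothetical stand-in for the actual remote read, e.g. an HTTP `Range` request, and simply fabricates bytes; note that in Python 3 the `Queue` module is spelled `queue`):

```python
import queue
import threading

workQ = queue.Queue()     # the Queue module is named queue in Python 3
resultQ = queue.Queue()

def fetch_range(url, offset, numbytes):
    # Hypothetical stand-in for the real remote read (e.g. an HTTP
    # Range request); here it just fabricates numbytes of data.
    return b"x" * numbytes

def worker():
    while True:
        wr = workQ.get()      # blocks until a work request is available
        if wr is None:        # None as a "please terminate" sentinel
            break
        url, offset, numbytes = wr
        resultQ.put((url, offset, fetch_range(url, offset, numbytes)))

# minimal usage: one worker, one work request
t = threading.Thread(target=worker)
t.start()
workQ.put(("http://example.com/f", 0, 4))
res = resultQ.get()           # ('http://example.com/f', 0, b'xxxx')
workQ.put(None)
t.join()
```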
The main thread pushes all work requests onto
`workQ` (just `workQ.put((url, offset, numbytes))` for each request; note that `from` is a Python keyword, so call the offset something else) and waits for results to arrive on another `Queue` instance, call it `resultQ` (each result will also be a triple: identifier of the file, starting offset, and the string of bytes that is the result from that file at that offset). As each worker thread satisfies the request it’s working on, it puts the results into
`resultQ` and goes back to fetch another work request (or wait for one). Meanwhile the main thread (or a separate dedicated “writing thread” if needed, i.e. if the main thread has other work to do, for example on the GUI) gets results from `resultQ` and performs the needed `open`, `seek`, and `write` operations to place the data at the right spot. There are several ways to terminate the operation: for example, a special work request may ask the thread receiving it to terminate. The main thread puts on
`workQ` just as many of those as there are worker threads, after all the actual work requests, then joins all the worker threads when all data have been received and written (many alternatives exist, such as joining the queue directly, making the worker threads daemonic so they just go away when the main thread terminates, and so forth).
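Putting the pieces together, here is a minimal runnable sketch of the whole scheme. All names are illustrative assumptions: `fetch_range` fakes the remote fetch by slicing a hard-coded bytes object, and the writer assembles chunks into a `bytearray` in memory, standing in for the `open`/`seek`/`write` step on a local file.

```python
import queue
import threading

# Illustrative "remote" data; a real fetch_range would issue e.g. an
# HTTP Range request against the URL.
REMOTE_FILES = {"http://example.com/a": b"abcdefghijklmnopqrstuvwxyz"}

def fetch_range(url, offset, numbytes):
    return REMOTE_FILES[url][offset:offset + numbytes]

STOP = None  # sentinel "work request" asking a worker to terminate

def worker(workQ, resultQ):
    while True:
        wr = workQ.get()
        if wr is STOP:
            break
        url, offset, numbytes = wr
        resultQ.put((url, offset, fetch_range(url, offset, numbytes)))

def download(url, total_size, numthreads, chunksize):
    workQ, resultQ = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(workQ, resultQ))
               for _ in range(numthreads)]
    for t in threads:
        t.start()
    # push all the actual work requests first...
    nrequests = 0
    for offset in range(0, total_size, chunksize):
        workQ.put((url, offset, chunksize))
        nrequests += 1
    # ...then collect results, placing each chunk at the right spot
    # (the in-memory equivalent of open/seek/write on a local file)
    out = bytearray(total_size)
    for _ in range(nrequests):
        _, offset, data = resultQ.get()
        out[offset:offset + len(data)] = data
    # one termination request per worker, after all real work, then join
    for _ in threads:
        workQ.put(STOP)
    for t in threads:
        t.join()
    return bytes(out)
```

Because each result carries its own offset, the workers may finish in any order and the writer still places every chunk correctly, which answers the original “scrambled bytes” concern; extending the work-request triples to several URLs covers the multiple-files case with the same pool of workers.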