
The Archive Base


Editorial Team
Asked: May 18, 2026

Often a web service needs to zip up several large files for download by


Often a web service needs to zip up several large files for download by the client. The most obvious way to do this is to create a temporary zip file, then either echo it to the user or save it to disk and redirect (deleting it some time in the future).

However, doing things that way has drawbacks:

  • an initial phase of intensive CPU and disk thrashing, resulting in…
  • a considerable initial delay to the user while the archive is prepared
  • a very high memory footprint per request
  • use of substantial temporary disk space
  • if the user cancels the download halfway through, all resources used in the initial phase (CPU, memory, disk) will have been wasted

Solutions like ZipStream-PHP improve on this by shovelling the data into Apache file by file. However, the result is still high memory usage (files are loaded entirely into memory), and large, thrashy spikes in disk and CPU usage.

In contrast, consider the following bash snippet:

ls -1 | zip -@ - | cat > file.zip
  # Note -@ is not supported on MacOS

Here, zip operates in streaming mode, resulting in a low memory footprint. A pipe has an integral buffer: when the buffer is full, the OS suspends the writing program (the program on the left of the pipe). This ensures that zip works only as fast as its output can be written by cat.

The optimal way, then, would be to do the same: replace cat with a web server process, streaming the zip file to the user as it is created on the fly. This would add little overhead compared to just streaming the files, and would have an unproblematic, non-spiky resource profile.

How can you achieve this on a LAMP stack?


1 Answer

  1. Editorial Team
    Added an answer on May 18, 2026 at 10:26 am

    You can use popen() or proc_open() to execute a Unix command (e.g. zip or gzip) and get back its stdout as a PHP stream. flush() will do its very best to push the contents of PHP’s output buffer to the browser.

    Combining all of this will give you what you want (provided that nothing else gets in the way; see especially the caveats on the docs page for flush()).

    (Note: don’t use flush(). See the update below for details.)

    Something like the following can do the trick:

    <?php
    // Start the pipeline first, so that a failure can still be reported
    // with an error status before any download headers are sent.
    // popen() executes a unix command pipeline and returns its stdout
    // as a php stream (use proc_open() instead if you also need to
    // control the input of the pipeline).
    $fp = popen('tar cf - file1 file2 file3 | gzip -c', 'r');
    if ($fp === false) {
        http_response_code(500);
        exit;
    }
    
    // make sure to send all headers before any output
    // Content-Type is the most important one (probably)
    header('Content-Type: application/x-gzip');
    
    // pick a bufsize that makes you happy (64k may be a bit too big).
    // fread() treats it as a maximum and may return less.
    $bufsize = 65535;
    while (!feof($fp)) {
        echo fread($fp, $bufsize);
    }
    pclose($fp);
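
If the list of files is only known at run time, proc_open() lets you feed it to the pipeline on stdin, much like the `ls -1 | zip -@ -` snippet in the question. The following is a minimal sketch, not a definitive implementation: streamCommand() and its callback parameter are names made up for illustration, and it assumes the input is small enough to be written in one go before reading starts.

```php
<?php
/**
 * Run $command, write $stdinData to its stdin, and hand its stdout to
 * $sink chunk by chunk. Returns the exit code reported by proc_close().
 * (Hypothetical helper: the name and signature are illustrative only.)
 */
function streamCommand(string $command, string $stdinData, callable $sink, int $bufsize = 8192): int
{
    $spec = [
        0 => ['pipe', 'r'],  // the child reads its stdin from us
        1 => ['pipe', 'w'],  // the child writes its stdout to us
    ];
    $proc = proc_open($command, $spec, $pipes);
    if (!is_resource($proc)) {
        throw new RuntimeException("could not start: $command");
    }

    // Assumption: the input (a list of file names) is small. Large inputs
    // would need interleaved reads and writes to avoid a deadlock.
    fwrite($pipes[0], $stdinData);
    fclose($pipes[0]);

    while (!feof($pipes[1])) {
        $chunk = fread($pipes[1], $bufsize);
        if ($chunk === false) {
            break;
        }
        if ($chunk !== '') {
            $sink($chunk);
        }
    }
    fclose($pipes[1]);

    return proc_close($proc);
}
```

For a download you would send the headers first and then call something like `streamCommand('zip -@ -', implode("\n", $files) . "\n", function ($chunk) { echo $chunk; });`, where `$files` is your own array of file names.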
    

    You asked about “other technologies”, to which I’ll say: “anything that supports non-blocking I/O for the entire lifecycle of the request”. You could build such a component as a stand-alone server in Java or C/C++ (or any of many other available languages), if you were willing to get into the “down and dirty” of non-blocking file access and whatnot.

    If you want a non-blocking implementation, but would rather avoid the “down and dirty”, the easiest path (IMHO) would be to use Node.js. The existing release already supports all the features you need: use the http module (of course) for the HTTP server, and the child_process module to spawn the tar/zip/whatever pipeline.

    Finally, if (and only if) you’re running a multi-processor (or multi-core) server, and you want the most from Node.js, you can use Spark2 to run multiple instances on the same port. Don’t run more than one Node.js instance per processor core.


    Update (from Benji’s excellent feedback in the comments section on this answer)

    1. The docs for fread() indicate that the function will read only up to 8192 bytes of data at a time from anything that is not a regular file. Therefore, 8192 may be a good choice of buffer size.

    [editorial note] 8192 is not a hard guarantee. The PHP manual describes it as the usual stream chunk size, and in practice fread() on a pipe or socket returns whatever data is currently available (up to that limit) instead of waiting, leaving the OS to refill its buffer asynchronously.

    There are other circumstances that can cause fread() to return even less than 8192 bytes. For example, if the “remote” client (or process) is slow to fill the buffer, fread() will in most cases return the contents of the input buffer as-is without waiting for it to get full. This could mean anywhere from 0..os_buffer_size bytes get returned.

    The moral is: the value you pass to fread() as $bufsize should be considered a “maximum” size. Never assume that you’ve received the number of bytes you asked for (or any other number for that matter).

    2. According to comments on the fread() docs, magic quotes may interfere and must be turned off. (This only matters on PHP versions before 5.4; magic quotes were removed in 5.4.)

    3. Setting mb_http_output('pass') may be a good idea. Though 'pass' is already the default setting, you may need to specify it explicitly if your code or config has previously changed it to something else.

    4. If you’re creating a zip (as opposed to gzip), you’d want to use the content type header:

    Content-type: application/zip
    

    Alternatively, ‘application/octet-stream’ can be used instead (it’s a generic content type used for binary downloads of all different kinds):

    Content-type: application/octet-stream
    

    and if you want the user to be prompted to download and save the file to disk (rather than potentially having the browser try to display the file as text), then you’ll need the content-disposition header. (where filename indicates the name that should be suggested in the save dialog):

    Content-disposition: attachment; filename="file.zip"
    

    One should also send the Content-length header, but this is hard with this technique as you don’t know the zip’s exact size in advance. Is there a header that can be set to indicate that the content is “streaming” or is of unknown length? Does anybody know?
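
On the open question about unknown length: as far as HTTP/1.1 is concerned, a response may simply omit Content-Length. The server then typically delimits the body itself, usually with chunked transfer encoding, so no extra "streaming" header should be needed. The sketch below collects the headers discussed in point 4; downloadHeaders() is a hypothetical helper, and the conservative file name filter is an assumption made for this example, not a requirement of the HTTP specification.

```php
<?php
/**
 * Build the header lines for a streamed download of unknown size.
 * (Hypothetical helper: the name and the file name filter are
 * illustrative assumptions.)
 */
function downloadHeaders(string $filename, string $contentType = 'application/zip'): array
{
    // Keep the suggested name to a conservative character set so it
    // cannot break out of the quoted string or add extra header lines.
    $safe = preg_replace('/[^A-Za-z0-9._-]/', '_', basename($filename));

    return [
        'Content-Type: ' . $contentType,
        'Content-Disposition: attachment; filename="' . $safe . '"',
        // Deliberately no Content-Length: the archive size is not
        // known while it is still being generated.
    ];
}
```

Usage would be along the lines of `foreach (downloadHeaders('file.zip') as $line) { header($line); }` before the first byte of output.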


    Finally, here’s a revised example that uses all of @Benji’s suggestions (and that creates a zip file instead of a tar.gz file):

    <?php
    // Start the pipeline first, so that a failure can still be reported
    // with an error status before any download headers are sent.
    // popen() executes a unix command pipeline and returns its stdout
    // as a php stream (use proc_open() instead if you also need to
    // control the input of the pipeline).
    $fp = popen('zip -r - file1 file2 file3', 'r');
    if ($fp === false) {
        http_response_code(500);
        exit;
    }
    
    // make sure to send all headers before any output
    // Content-Type is the most important one (probably)
    header('Content-Type: application/octet-stream');
    header('Content-Disposition: attachment; filename="file.zip"');
    
    // pick a bufsize that makes you happy (8192 has been suggested).
    // fread() treats it as a maximum and may return less.
    $bufsize = 8192;
    while (!feof($fp)) {
        echo fread($fp, $bufsize);
    }
    pclose($fp);
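
The example hard-codes file1, file2 and file3. If the names come from user input or a database, they should be quoted before they reach the shell. A minimal sketch, assuming a Unix shell: buildZipCommand() is a hypothetical helper, and rejecting names that start with "-" is a cautious choice made for this example.

```php
<?php
/**
 * Build a "zip -r - ..." command line from untrusted file names.
 * (Hypothetical helper, not part of PHP or zip.)
 */
function buildZipCommand(array $files): string
{
    if ($files === []) {
        throw new InvalidArgumentException('no files given');
    }

    $args = [];
    foreach ($files as $file) {
        // A leading "-" would be parsed by zip as an option, so refuse it.
        if (!is_string($file) || $file === '' || $file[0] === '-') {
            throw new InvalidArgumentException('unsafe file name');
        }
        // escapeshellarg() quotes the name so the shell sees one literal word.
        $args[] = escapeshellarg($file);
    }

    return 'zip -r - ' . implode(' ', $args);
}
```

The result can be passed straight to popen(), for example `$fp = popen(buildZipCommand($files), 'r');`.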
    

    Update: (2012-11-23) I have discovered that calling flush() within the read/echo loop can cause problems when working with very large files and/or very slow networks. At least, this is true when running PHP as CGI/FastCGI behind Apache, and it seems likely that the same problem would occur when running in other configurations too. The problem appears to result when PHP flushes output to Apache faster than Apache can actually send it over the socket. For very large files (or slow connections), this eventually causes an overrun of Apache’s internal output buffer. This causes Apache to kill the PHP process, which of course causes the download to hang, or to complete prematurely with only a partial transfer having taken place.

    The solution is not to call flush() at all. I have updated the code examples above to reflect this, and I placed a note in the text at the top of the answer.
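
The point from the first update, that fread() treats the buffer size as a maximum, is easy to check for yourself. The following is a small diagnostic sketch: chunkSizes() is a hypothetical helper, and the `head -c 100000 /dev/zero` command is only an assumed stand-in for a real archive pipeline. The individual chunk sizes depend on the PHP version and the OS; only the total is predictable.

```php
<?php
/**
 * Read $fp to the end and return the size of every chunk fread() delivered.
 * (Hypothetical diagnostic helper for experimenting.)
 */
function chunkSizes($fp, int $bufsize): array
{
    $sizes = [];
    while (!feof($fp)) {
        $chunk = fread($fp, $bufsize);
        if ($chunk === false) {
            break;
        }
        if ($chunk !== '') {
            $sizes[] = strlen($chunk);
        }
    }
    return $sizes;
}

// 100000 zero bytes from a pipe, requested in pieces of up to 65535 bytes.
$fp = popen('head -c 100000 /dev/zero', 'r');
$sizes = chunkSizes($fp, 65535);
pclose($fp);

printf("%d reads, %d bytes in total\n", count($sizes), array_sum($sizes));
```

If the number of reads is larger than two, fread() has been returning less than the 65535 bytes requested, which is exactly why the loop must not rely on a fixed chunk size.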



© 2021 The Archive Base. All Rights Reserved