I know (from the answer to this question: .rar, .zip files MIME Type) that most people check zip files in PHP as application/zip or application/octet-stream, but I have a couple of questions about this:
- Is it safe just to check for application/octet-stream (given that application/octet-stream can be used to describe many more file types than just zip)? I know I could check the file in other ways too, but thought I should try and keep everything as simple as possible.
- I’ve tried to check for as many different actual zip types as possible, but there are some which give unexpected results. I’ve found one for which the mime type is application/x-external-editor, but PHP has problems dealing with it (although the only error I get is Warning: ZipArchive::close() [ziparchive.close]: Invalid or unitialized Zip object). Is this documented anywhere? Is there a list of actual x- mime types which PHP can cope with?
Edit
In answer to the questions below:
- I’m checking the mime type by using $_FILES['fileatt']['type'], but using mime_content_type() gives the same result. Different zip files seem to be any one of the following: 'application/zip', 'application/x-compressed', 'application/x-zip-compressed', 'multipart/x-zip'. I didn’t understand why I got an error when the mime type was detected as being application/x-external-editor.
- I have got the zip extension installed, and I am extracting all the files from the zip files when they are uploaded. I hadn’t thought about checking the error.
I have also found another thing I don’t quite understand: when I use the following code with a file which PHP reads as application/x-external-editor:
if ($zip->open($_FILES['fileatt']['tmp_name']) === TRUE)
{
    echo "success";
} else {
    echo "error";
}
prints "error", but checking the result of open() as
$res = $zip->open($_FILES['fileatt']['tmp_name']);
if ($res)
{
    echo "success";
} else {
    echo "error";
}
prints "success"; in this code, I assume that the boolean is effectively using ==, not ===, but why should this make a difference?
The error:
$res = $zip->open($_FILES['fileatt']['tmp_name']);
if ($res === TRUE)
{
    echo "success";
} else {
    echo $res;
}
prints 19 – which error (https://www.php.net/manual/en/ziparchive.open.php) does 19 refer to?!
Never trust the mime type; it can be easily spoofed by the client. They could submit an exe and give it a mime type of text/plain if they wanted to.
Zip files normally begin with a standard local file header signature (0x04034b50, stored on disk as the bytes "PK\x03\x04"), so you could check that the first 4 bytes of the file match the zip signature bytes. See the PKZIP Appnote for more details.
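As a rough illustration of the signature check, here is a minimal sketch. The helper name looksLikeZip is hypothetical, and the check only compares the first four bytes against the local file header signature, so an empty archive (which has no local file header) would be rejected, and a file that merely starts with these bytes would be accepted.

```php
<?php
// Hypothetical helper: returns true when the file starts with the
// zip local file header signature, the bytes "PK\x03\x04".
function looksLikeZip(string $path): bool
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false; // unreadable file: treat as "not a zip"
    }

    // Read only the first four bytes; fread() may return fewer
    // (or false) for very short or unreadable files.
    $magic = fread($handle, 4);
    fclose($handle);

    return $magic === "PK\x03\x04";
}
```

For an upload, this could be called as looksLikeZip($_FILES['fileatt']['tmp_name']) before doing any further work on the file.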
If you have the zip extension enabled, you can go even further and attempt to open and read the zip to make sure it is a fully valid zip file.
Something like this works well:
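A minimal sketch of that idea, assuming the zip extension is loaded. It uses ZipArchive::open() rather than zip_open(), since the procedural zip functions are deprecated as of PHP 8.0; the helper name isValidZip is hypothetical.

```php
<?php
// Hypothetical helper; assumes the zip extension (ZipArchive) is available.
function isValidZip(string $path): bool
{
    $zip = new ZipArchive();

    // open() returns true on success and an integer error code on failure,
    // so the result has to be compared strictly against true.
    $res = $zip->open($path);
    if ($res !== true) {
        return false;
    }

    // Always release the archive handle once it has been opened.
    $zip->close();
    return true;
}
```

This only confirms that the archive can be opened; individual entries can still turn out to be damaged when they are extracted.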
zip_open returns a resource if opened successfully, otherwise an integer representing the error that occurred reading the file.
EDIT: To elaborate on some of your questions:
- About application/octet-stream: This is, as you said, a very generic type. It just means any file that contains 8-bit data, which is basically anything and everything. application/zip is the de facto standard mime type, but some clients will use other values, as you have discovered. Also, given that a client can easily spoof any file type as application/zip, I wouldn’t rely on $_FILES['fileatt']['type'], since it can be anything.
- AFAIK, mime_content_type() simply looks at the file extension and maps it to a mime type from a mime.types file on the system or built into PHP; in that case, if someone put a .zip extension on an exe file, it would still register as application/zip. I believe certain extensions may examine the file header instead (the PHP manual describes mime_content_type() as using the magic.mime database, which inspects file contents).
- ZipArchive::open() returns TRUE if the file was opened successfully, or an integer error code. Therefore, a loose check such as if ($res) or $res == true will give you a false positive on an error, because any non-zero integer is cast to TRUE. If you are going to check the return value of ZipArchive::open(), you should always use $res === true to check for success. You can find the meanings of the error codes here in the comment at the bottom of the page.
Bottom Line: Since you said you are already extracting the zip, there is little point in validating based on the mime type; it would be easier to just attempt to open the file and go by the return value of open. If it returns true, you can assume the file is a valid zip (there could of course be errors later in the file, but they at least uploaded something resembling a zip file).
Hope that helps you out.
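To make the == versus === point concrete, here is a small sketch that uses the value 19 reported in the question. The helper zipOpenErrorName is hypothetical and covers only a few codes; the numbers mirror the ZipArchive::ER_* constants, where 19 is ZipArchive::ER_NOZIP (not a zip archive).

```php
<?php
// 19 is the value the question reports from ZipArchive::open();
// it corresponds to ZipArchive::ER_NOZIP.
$res = 19;

var_dump($res == true);  // loose comparison: the non-zero integer is cast to true
var_dump($res === true); // strict comparison: an integer is never identical to true

// Hypothetical helper mapping a few ZipArchive::ER_* values to readable names.
function zipOpenErrorName(int $code): string
{
    $names = [
        9  => "ER_NOENT (no such file)",
        11 => "ER_OPEN (can't open file)",
        19 => "ER_NOZIP (not a zip archive)",
        21 => "ER_INCONS (zip archive inconsistent)",
    ];

    return $names[$code] ?? "unknown error code $code";
}

echo zipOpenErrorName($res), PHP_EOL;
```

In real code the lookup could compare against the ZipArchive::ER_* constants directly instead of literal numbers; literals are used here only so the sketch runs without the zip extension.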