Streaming ZIP files in PHP

I ran into an interesting bug in OSX the other day. Well, I regard it as a bug, anyway - Apple may feel differently, as it's been there for a while.

I was using a PHP library called ZipStream. It basically lets you create a zip archive on the fly and stream it to the client as each file is compressed, as opposed to generating the entire archive on the server and then sending it to the client. This is nice because the user doesn't have to sit through a (potentially) long delay before the download starts - you can start sending data right away.

Anyway, the library was working wonderfully...until I tested the archive on my MacBook. Turns out that OSX doesn't like the archives that ZipStream generates. Or, rather, the OSX Archive Utility doesn't like them. When you double-click the archive in Finder, rather than properly decompressing it, OSX extracts it into a .cpgz file. And if you double-click that file, it extracts into another archive, and so on ad infinitum.

By way of contrast, everything else seems to be able to extract the archive normally. The built-in Windows zip handler manages it fine, as do WinRAR and 7-Zip; on OSX, Safari's built-in zip handling transparently decompresses it without problems; even the OSX command-line "unzip" program handles it without problems. It's just the Archive Utility - which is, unfortunately, the default handler in Finder.

Luckily, the solution is pretty simple. It turns out that the OSX archive tool doesn't like the "version needed to extract" value set by ZipStream. ZipStream writes 0x0603 into that field. If you change that to 0x000A (decimal 10, which the ZIP spec reads as version 1.0), then the OSX Archive Utility will open the file normally, just like every other program. Of course, you have to modify ZipStream itself to get this to work, but that's not really a big deal - it's just a one-line change.
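
For the curious, here's roughly what that field looks like on disk. The ZIP format stores "version needed to extract" as a two-byte little-endian value in each local file header and again in each central directory entry. The sketch below is Python rather than PHP, simply because it's handy for poking at a file that has already been generated. The function name patch_version_needed is my own invention, and the code assumes a plain single-disk archive with no ZIP64 records and nothing prepended to the file. Treat it as a way to inspect or patch an archive after the fact, not as what ZipStream does internally.

```python
import struct

EOCD_SIG = b"PK\x05\x06"     # end of central directory record
CENTRAL_SIG = b"PK\x01\x02"  # central directory file header
LOCAL_SIG = b"PK\x03\x04"    # local file header


def patch_version_needed(data, new_version=0x000A):
    """Rewrite every 'version needed to extract' field in a ZIP archive.

    Returns (patched_bytes, old_versions), where old_versions holds the
    value previously stored in each central directory entry.
    Assumption: simple archive, no ZIP64, no data before the first entry.
    """
    buf = bytearray(data)

    # The end-of-central-directory record sits at the tail of the file.
    eocd = buf.rfind(EOCD_SIG)
    if eocd == -1:
        raise ValueError("no end-of-central-directory record found")

    # Offset +10: total entries (2 bytes), directory size (4), directory offset (4).
    total_entries, _cd_size, cd_offset = struct.unpack_from("<HII", buf, eocd + 10)

    old_versions = []
    pos = cd_offset
    for _ in range(total_entries):
        if buf[pos:pos + 4] != CENTRAL_SIG:
            raise ValueError("central directory entry not where expected")

        # In a central directory entry the field lives at offset +6.
        old_versions.append(struct.unpack_from("<H", buf, pos + 6)[0])
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", buf, pos + 28)
        (local_offset,) = struct.unpack_from("<I", buf, pos + 42)

        if buf[local_offset:local_offset + 4] != LOCAL_SIG:
            raise ValueError("local file header not where expected")

        struct.pack_into("<H", buf, pos + 6, new_version)
        # In a local file header the same field lives at offset +4.
        struct.pack_into("<H", buf, local_offset + 4, new_version)

        # Fixed part of a central entry is 46 bytes, then three variable fields.
        pos += 46 + name_len + extra_len + comment_len

    return bytes(buf), old_versions
```

Calling patch_version_needed(open("download.zip", "rb").read()) (with "download.zip" standing in for whatever file you saved) gives back the patched bytes along with the list of values that were there before, so you can confirm whether the archive really carried 0x0603.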

I'm not entirely sure why the OSX archiver doesn't like that version number. Perhaps that flag implies some other features that the Archive Utility doesn't support. Or maybe it requires additional metadata which wasn't set in the archive, and so it was technically out of spec. But to me, it really doesn't matter - either way, it's a bug in the OSX archiver. If the zip file was out of spec, they should just detect that and handle it, because everybody else does. And if the program doesn't support other features implied by that version number, then they should have either implemented those features (I mean, it's not like the ZIP format is new or anything) or they should have done the check based on the archive content rather than version number - if the file doesn't use any of the unsupported features, then there's no reason that the archiver shouldn't handle it correctly.
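
To make that last point a bit more concrete, here's a rough sketch of what a content-based check could look like. It works out the minimum version an entry needs from its compression method and a few flags, using the spec's "version times ten" encoding (10 is version 1.0, 20 is version 2.0, and so on). The thresholds follow my reading of the version table in PKWARE's APPNOTE and are deliberately simplified. The function name and parameters are made up for illustration, and none of this reflects how the Archive Utility actually works.

```python
def minimum_version_needed(method, flags=0, is_directory=False, zip64=False):
    """Estimate the lowest 'version needed to extract' for one ZIP entry.

    method: compression method number from the entry header
            (0 = stored, 8 = Deflate, 9 = Deflate64, 12 = BZIP2, 14 = LZMA).
    flags:  general purpose bit flag; bit 0 set means the entry is encrypted.
    Returns the version times ten, e.g. 20 for version 2.0.
    Simplified on purpose: strong encryption and other rarer features
    are not covered here.
    """
    version = 10  # 1.0: the baseline, enough for stored (uncompressed) files

    if method == 8 or is_directory or (flags & 0x0001):
        version = max(version, 20)  # 2.0: Deflate, folders, traditional encryption
    if method == 9:
        version = max(version, 21)  # 2.1: Deflate64
    if zip64:
        version = max(version, 45)  # 4.5: ZIP64 extensions
    if method == 12:
        version = max(version, 46)  # 4.6: BZIP2
    if method == 14:
        version = max(version, 63)  # 6.3: LZMA

    return version
```

An extractor built this way would only refuse an entry when the result is higher than what it supports, regardless of what the header claims. For a plain Deflate-compressed file that works out to 20, which any reasonably modern unzip tool should be able to handle.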