Here is what I am looking for:
I need to open a zip file of images and iterate through its contents. The zip container has subdirectories, and one of them, “IDX”, houses the images I need. I have no problem extracting the zip file contents to a directory. However, my zip files can be huge, as in several GBs, so I am hoping to open the file and pull out the images one at a time as I iterate through them, processing each one without extracting the whole archive first.
After I am done I just close the zip file. These images are actually being housed in a database.
Does anyone have any idea how to do this with, hopefully, free tools or built-in APIs? This process will be done on a Windows machine.
Thanks!
SharpZipLib is a great tool for your requirements.
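To make the "one image at a time" idea concrete, here is a minimal sketch. It uses Python's standard `zipfile` module as a stand-in rather than SharpZipLib, so treat it as an illustration of the pattern, not of that library's API. The function name `iter_images`, the default folder name and the extension list are assumptions made for the example.

```python
import zipfile

def iter_images(zip_path, folder="IDX",
                extensions=(".jpg", ".jpeg", ".png", ".gif", ".bmp")):
    """Yield (entry_name, data) for each image stored under `folder`.

    Only the central directory and the current entry are read; the
    archive is never extracted to disk.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            parent_dirs = info.filename.split("/")[:-1]
            if folder not in parent_dirs:
                continue  # not inside the folder we care about
            if not info.filename.lower().endswith(extensions):
                continue  # not an image, by the assumed extension list
            with archive.open(info) as stream:
                yield info.filename, stream.read()
```

Each image is held in memory only while it is being processed, and the archive is closed when the loop finishes, for example `for name, data in iter_images("images.zip"): ...` (the file name here is hypothetical).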
I have used it to process giant files within directories within giant nested zip files (meaning ZIP files within ZIP files), using streams. I was able to open a zip stream on top of a zip stream, so I could inspect the contents of the inner zip without extracting the entire parent. You can then use a stream to peek at each content file, which may help you decide whether you want to extract it or not. It’s open-source.
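The "zip stream on top of a zip stream" approach can be sketched as follows. Again this is Python's standard `zipfile` (3.7 or later) standing in for SharpZipLib, and the helper names `list_inner_entries` and `read_inner_head` are made up for the example. One caveat for this stand-in: Python needs a seekable stream to find the inner archive's directory, so it decompresses through the inner zip while seeking, but nothing is written to disk.

```python
import zipfile

def list_inner_entries(outer_path, inner_name):
    """Return the entry names of a zip that is itself stored inside a zip."""
    with zipfile.ZipFile(outer_path) as outer:
        # Open the inner archive as a stream from the outer one.
        with outer.open(inner_name) as inner_stream:
            with zipfile.ZipFile(inner_stream) as inner:
                return inner.namelist()

def read_inner_head(outer_path, inner_name, member, size=16):
    """Peek at the first `size` bytes of one member of the inner archive."""
    with zipfile.ZipFile(outer_path) as outer:
        with outer.open(inner_name) as inner_stream:
            with zipfile.ZipFile(inner_stream) as inner:
                with inner.open(member) as member_stream:
                    return member_stream.read(size)
```

Peeking at the first few bytes is often enough to check a file signature before deciding whether the entry is worth reading in full.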
EDIT: Directory handling in the library is not ideal. As I recall, it contains separate entries for some directories, while others are implied by the paths of the file entries.
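To show why the mix of explicit and implied directories matters, here is a rough sketch of listing one level of an archive. It is not the wrapper code mentioned in this answer; it uses Python's standard `zipfile` for illustration, and `list_level` and `start_path` are hypothetical names. The trick is to derive folder names from the entry paths, so it does not matter whether a directory has its own entry.

```python
import zipfile

def list_level(archive, start_path=""):
    """Return (folders, files) found directly under `start_path`.

    Folders are taken from the entry paths themselves, so a directory is
    found whether it has its own "dir/" entry or is only implied by the
    path of a file stored beneath it.
    """
    prefix = start_path.strip("/")
    prefix = prefix + "/" if prefix else ""
    folders, files = set(), set()
    for name in archive.namelist():
        if not name.startswith(prefix) or name == prefix:
            continue  # outside start_path, or the start directory itself
        rest = name[len(prefix):]
        head, sep, _tail = rest.partition("/")
        if sep:
            folders.add(head)  # explicit "dir/" entry or a deeper path
        else:
            files.add(head)    # a file sitting directly at this level
    return sorted(folders), sorted(files)
```

Using sets means a folder that appears both as its own entry and as part of several file paths is reported only once.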
Here’s an extract of the code I used to collect the actual file and folder names at a certain level (_startPath). Let me know if you’re interested in the whole wrapper class.