I have insane big directory. I need to get filelist via python.
In code i need to get iterator, not list. So this not work:
os.listdir
glob.glob (uses listdir!)
os.walk
I cant find any good lib. help! Maybe c++ lib?
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
If you have a directory that is too big for libc readdir() to read it quickly, you probably want to look at the kernel call getdents() (http://www.kernel.org/doc/man-pages/online/pages/man2/getdents.2.html ). I ran into a similar problem and wrote a long blog post about it.
http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/
Basically, readdir() only reads 32K of directory entries at a time, and so if you have a lot of files in a directory, readdir() will take a very long time to complete.