The Archive Base

Editorial Team
Asked: May 15, 2026

I am trying to develop a recursive extractor. The problem is that it recurses too much (every time it finds an archive type), which hurts performance.

So how can I improve the code below?

My Idea 1:

Build a dict of the directories first, together with the file types, using the file types as keys. Extract the file types. When an archive is found, extract only that one, then regenerate the archive dict.

My Idea 2:

os.walk returns a generator. Is there something I can do with generators? I am new to them.

Here is the current code:

import os, magic
m = magic.open( magic.MAGIC_NONE )
m.load()

archive_type = [ 'gzip compressed data',
        '7-zip archive data',
        'Zip archive data',
        'bzip2 compressed data',
        'tar archive',
        'POSIX tar archive',
        'POSIX tar archive (GNU)',
        'RAR archive data',
        'Microsoft Outlook email folder (>=2003)',
        'Microsoft Outlook email folder']

def extractRecursive(path, archives):
    i = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            i += 1
            print(i)
            # Keep only the part of the libmagic description before the first comma
            file_type = m.file(fp).split(",")[0]
            if file_type in archives:
                # Extract into the top-level path, then rescan the whole tree
                arcExtract(fp, file_type, path, True)
                extractRecursive(path, archives)
    return "Done"



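# pst_types was not defined in the original snippet; assumed here to be
# the two Outlook entries of archive_type, which go to readpst instead of 7z.
pst_types = ['Microsoft Outlook email folder (>=2003)',
             'Microsoft Outlook email folder']

def arcExtract(file_path, file_type, extracted_path="/home/v3ss/Downloads/extracted", unlink=False):
    import subprocess

    # Build the argument list directly so paths with spaces or quotes survive
    if file_type in pst_types:
        args = ["readpst", "-o", extracted_path, "-S", file_path]
    else:
        args = ["7z", "-y", "-r", "-o" + extracted_path, "x", file_path]
    print(args)

    try:
        sp = subprocess.Popen(args, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = sp.communicate()
        print(out, err)
        ret = sp.returncode
    except OSError as e:
        # Raised, for example, when the extractor binary cannot be started
        print("Error no %s  Message %s" % (e.errno, e.strerror))
        return "Failed"

    if ret == 0:
        # Delete the archive only after a successful extraction
        if unlink:
            os.unlink(file_path)
        return "OK!"
    else:
        return "Failed"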
if __name__ == '__main__':
    extractRecursive( 'Path/To/Archives' ,archive_type)
1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 10:02 pm

    If, as it appears, you want to extract the archive files to paths “above” the one they’re in, os.walk by itself (in its normal top-down operation) can’t help you: by the time you extract an archive into a certain directory x, os.walk may well have already visited x (though not necessarily), so the only way to get all the contents is to have os.walk look at the whole path over and over again. That said, I’m surprised your code ever terminates, since the archive-type files should keep getting found and extracted; I don’t see what can ever end the recursion. (To solve that, it would suffice to keep a set of the paths of all archive-type files you’ve already extracted, so you can skip them when you meet them again.)
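The "set of already-extracted paths" idea could look roughly like the sketch below. It is not the asker's code: `extract_all`, `is_archive`, and `extract` are hypothetical names, and the two callables stand in for the libmagic check and for `arcExtract`. The tree is re-walked until a full pass finds no archive that is missing from the set, which replaces the recursion with a plain loop.

```python
import os

def extract_all(root, is_archive, extract):
    """Re-walk `root` until a whole pass finds no not-yet-extracted archive.

    `is_archive(path)` and `extract(path)` are hypothetical callables
    supplied by the caller (e.g. a libmagic check and arcExtract).
    """
    seen = set()        # paths of archives that were already extracted
    found_new = True
    while found_new:
        found_new = False
        for dirpath, dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if path in seen or not is_archive(path):
                    continue
                seen.add(path)      # never extract the same path twice
                extract(path)
                found_new = True    # new files may exist now: walk again
    return seen
```

This still pays for repeated walks over the whole tree, so it only addresses termination and duplicate work, not the cost of asking the OS about the same directories again.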

    By far the best architecture, anyway, would be for arcExtract to return a list of all the files it has extracted (specifically, their destination paths). You could then simply keep extending a list with these extracted files during the os.walk loop (no recursion), and afterwards keep looping over just that list, producing a new, similar list each time. That avoids repeatedly asking the OS about files and directories, which saves a lot of time too. No recursion, no redundant work. I imagine that readpst and 7z are able to supply such lists (maybe on their standard output or error, which you currently just display but don’t process) in some textual form that you could parse into a list…?
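A minimal sketch of that work-list architecture follows, under the assumption that the extraction step can report what it produced. `extract_worklist`, `is_archive`, and `extract` are hypothetical names; `extract(path)` is assumed to return the destination paths of the files it wrote. How to obtain that list from the output of 7z or readpst is not shown here.

```python
import os
from collections import deque

def extract_worklist(root, is_archive, extract):
    """Walk `root` once, then process only paths reported by `extract`.

    `extract(path)` is a hypothetical callable that must return the list
    of destination paths it created; `is_archive(path)` is the type check.
    """
    # Single os.walk pass: the deque consumes the generator up front,
    # before any extraction changes the tree.
    pending = deque(
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
    )
    seen = set()        # guards against a path being reported twice
    extracted = []      # archives handled, in processing order
    while pending:
        path = pending.popleft()
        if path in seen or not is_archive(path):
            continue
        seen.add(path)
        extracted.append(path)
        # Newly extracted files join the queue; no second directory scan.
        pending.extend(extract(path))
    return extracted
```

Each file is inspected once, and nested archives are reached because every extraction feeds its results back into the queue.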
