I am using the following function to get all file sizes in a system


Asked: June 3, 2026 (2026-06-03T14:39:36+00:00)


I am using the following function to get all file sizes in a system from the target directory down.

import datetime
import os

def get_files(target):
    # Get file size and modified time for all files from the target directory and down.
    # Initialize files list
    filelist = []
    # Walk the directory structure
    for root, dirs, files in os.walk(target):
        # Do not walk into directories that are mount points
        dirs[:] = filter(lambda dir: not os.path.ismount(os.path.join(root, dir)), dirs)
        for name in files:
            # Construct absolute path for files
            filename = os.path.join(root, name)
            # Test the path to account for broken symlinks
            if os.path.exists(filename):
                # File size information in bytes
                size = float(os.path.getsize(filename))
                # Get the modified time of the file
                mtime = os.path.getmtime(filename)
                # Create a tuple of filename, size, and modified time
                construct = filename, size, str(datetime.datetime.fromtimestamp(mtime))
                # Add the tuple to the master filelist
                filelist.append(construct)
    return filelist

How can I modify this to include a second list containing directories and the total size of the directories? I am trying to include this operation in one function to hopefully be more efficient than having to perform a second walk in a separate function to get the directory information and size.

The idea is to be able to report back with a sorted list of the top twenty largest files, and a second sorted list of the top ten largest directories.

Thanks for any suggestions you guys have.


1 Answer

Editorial Team · Answer 1 · 2026-06-03T14:39:37+00:00

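I return the directories as a dictionary instead of a list, mapping each directory to the combined size of the files directly inside it (files in subdirectories are counted under the subdirectory only). See if you like it: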

import datetime
import os

def get_files(target):
    # Get file size and modified time for all files from the target directory and down.
    # Initialize files list
    filelist = []
    dirdict = {}
    # Walk the directory structure
    for root, dirs, files in os.walk(target):
        # Do not walk into directories that are mount points
        dirs[:] = filter(lambda dir: not os.path.ismount(os.path.join(root, dir)), dirs)
        for name in files:
            # Construct absolute path for files
            filename = os.path.join(root, name)
            # Test the path to account for broken symlinks
            if os.path.exists(filename):
                # File size information in bytes
                size = float(os.path.getsize(filename))
                # Get the modified time of the file
                mtime = os.path.getmtime(filename)
                # Create a tuple of filename, size, and modified time
                construct = filename, size, str(datetime.datetime.fromtimestamp(mtime))
                # Add the tuple to the master filelist
                filelist.append(construct)
                # Add the file's size to its directory's running total
                dirdict[root] = dirdict.get(root, 0) + size
    return filelist, dirdict
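
Note that dirdict only adds up the files that sit directly in each directory. If "total size" is meant to include everything below a directory as well, one option is to roll the per-directory sizes up to every ancestor after the walk, which still avoids a second pass over the filesystem. This is only a sketch: rollup_dir_sizes is a hypothetical helper name, and it assumes dirdict has the shape returned above and that every key lies under target. The keys of the result are absolute paths.

```python
import os


def rollup_dir_sizes(dirdict, target):
    """Return {directory: size including all subdirectories}.

    dirdict maps a directory to the size of the files directly in it;
    target is the directory the walk started from.
    """
    # Absolute paths make the parent/child comparison reliable
    target = os.path.abspath(target)
    totals = {}
    for path, size in dirdict.items():
        current = os.path.abspath(path)
        while True:
            # Credit this directory and then each ancestor up to target
            totals[current] = totals.get(current, 0) + size
            if current == target:
                break
            parent = os.path.dirname(current)
            if parent == current:
                # Reached the filesystem root without meeting target
                break
            current = parent
    return totals
```

Called as rollup_dir_sizes(dirdict, target) right after get_files(target), it only touches the dictionary already in memory, so the cost grows with the number of directories and their depth rather than with the number of files.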

If you want dirdict as a list of tuples, wrap the items in list() (in Python 3, items() on its own returns a view rather than a list):

list(dirdict.items())
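
For the report itself (top twenty files, top ten directories), a full sort is not strictly needed. Below is a minimal sketch using heapq.nlargest from the standard library. The helper name largest and the sample data are made up for illustration; the only assumption is that the size sits at index 1 of each tuple, as it does in both filelist and dirdict.items().

```python
import heapq
from operator import itemgetter


def largest(entries, count):
    """Return up to `count` entries with the largest size, biggest first.

    Each entry is a tuple whose element at index 1 is the size in bytes.
    """
    return heapq.nlargest(count, entries, key=itemgetter(1))


# Hypothetical sample data in the shapes get_files() returns
filelist = [
    ("/data/a.log", 300.0, "2020-01-01 00:00:00"),
    ("/data/b.bin", 4096.0, "2020-01-02 00:00:00"),
    ("/data/sub/c.txt", 12.0, "2020-01-03 00:00:00"),
]
dirdict = {"/data": 4396.0, "/data/sub": 12.0}

top_files = largest(filelist, 20)       # twenty largest files
top_dirs = largest(dirdict.items(), 10)  # ten largest directories
```

When there are fewer entries than requested, nlargest simply returns all of them in descending order of size.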
