I am writing a script to list the 20 largest files in a target directory. Once I have the files, I do some math on each size to attach the correct human-readable unit, i.e., Kb, Mb, Gb.
This, however, puts the output out of sorted order. How can I do this and keep the sort order intact?

#! /usr/bin/env python
import operator, os, sys

args = sys.argv
if len(args) != 2:
    print "You must one enter one directory as an argument."
    sys.exit(1)
else:
    target = args[1]

data = {}
for root, dirs, files in os.walk(target):
    for name in files:
        filename = os.path.join(root, name)
        if os.path.exists(filename):
            size = float(os.path.getsize(filename))
            data[filename] = size

sorted_data = sorted(data.iteritems(), key=operator.itemgetter(1), reverse=True)
total = str(len(sorted_data))
while len(sorted_data) > 20:
    sorted_data.pop()

final_data = {}
for name in sorted_data:
    size = str(name[1])
    if size >= 1024:
        size = round(float(size) / 1024, 2)
        if size >= 1024:
            size = round(size / 1024, 2)
            if size >= 1024:
                size = round(size / 1024, 2)
                size = str(size) + "Gb"
            else:
                size = str(size) + "Mb"
        else:
            size = str(size) + "Kb"
    final_data[name] = size

print "The 20 largest files are:\n"
for name in final_data:
    print str(final_data[name]) + " " + str(name)
print "\nThere are a total of " + total + " files located in " + target

Your problem is that you create a brand-new dictionary to store the modified file size data. Because that dictionary doesn’t contain any information about the file sizes, and because dictionaries don’t store their items in any fixed order, you lose your sort order. But it’s simple to recover: iterate over sorted_data instead of over final_data, using final_data only to look up the human-readable file sizes.

But an even better solution would be to put your human-readable string generating code into a function!
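
Here is a minimal sketch of both ideas, not the exact code from the original answer. It assumes Python 3 (the question's script is Python 2). The function name human_readable and the sample sorted_data values are hypothetical, and final_data is keyed by filename here rather than by the whole (filename, size) tuple. The thresholds follow the question's Kb/Mb/Gb logic, including the fact that sizes under 1024 get no unit.

```python
def human_readable(size):
    """Format a byte count using the question's Kb/Mb/Gb thresholds."""
    size = float(size)
    if size < 1024:
        return str(size)  # the question's code adds no unit in this case
    for unit in ("Kb", "Mb"):
        size = round(size / 1024, 2)
        if size < 1024:
            return str(size) + unit
    return str(round(size / 1024, 2)) + "Gb"


# Hypothetical stand-in for the question's sorted_data:
# (filename, size) pairs, largest first.
sorted_data = [("c.iso", 3 * 1024**3), ("b.bin", 5 * 1024**2), ("a.txt", 2048)]

# The dictionary only maps filename -> display string...
final_data = {name: human_readable(size) for name, size in sorted_data}

# ...while the sorted list drives the iteration, so the order survives.
lines = [final_data[name] + " " + name for name, _ in sorted_data]
```

Because the loop walks sorted_data, the output order does not depend on how the dictionary happens to order its keys.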
Now you don’t even have to create a dictionary:
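
As a rough sketch of the dictionary-free version (again assuming Python 3, with human_readable, largest_files and report_lines as hypothetical names rather than the original answer's code), the sizes can be collected as a list of pairs, sorted once, sliced to the top 20, and formatted directly:

```python
import os


def human_readable(size):
    """Format a byte count using the question's Kb/Mb/Gb thresholds."""
    size = float(size)
    if size < 1024:
        return str(size)  # the question's code adds no unit in this case
    for unit in ("Kb", "Mb"):
        size = round(size / 1024, 2)
        if size < 1024:
            return str(size) + unit
    return str(round(size / 1024, 2)) + "Gb"


def largest_files(target, count=20):
    """Return (total number of files, the `count` largest as (path, size))."""
    sizes = []
    for root, dirs, files in os.walk(target):
        for name in files:
            filename = os.path.join(root, name)
            if os.path.exists(filename):  # skip broken symlinks
                sizes.append((filename, os.path.getsize(filename)))
    # Sort by size, largest first, then keep only the top `count` entries.
    sizes.sort(key=lambda pair: pair[1], reverse=True)
    return len(sizes), sizes[:count]


def report_lines(largest):
    """Format the sorted pairs directly; no intermediate dictionary."""
    return [human_readable(size) + " " + filename for filename, size in largest]
```

To print the report, something like `total, largest = largest_files(sys.argv[1])` followed by `print("\n".join(report_lines(largest)))` should do. The sizes stay numeric until the very last step, so the sort is never affected by the formatting.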