The Background:
I have been creating a script that based on input csv’s that exist within an input directory, creates a 3 dimensional array to store the aggregated information. Each table within the array represents one of the pollution sources (eg one of the input csvs was Incinerators.csv, the created table will be the aggregated information about various pollutants released by Incinerators on a watershed scale), each row represents the aggregated information by watershed – row 0 = headers, and each column is the amount of and toxic equivalent of each substance – col 0 = watershed ID.
For each substance in each watershed, the total released by all sources is calculated and stored in another array with the exact same layout addressable using totals[wsid][substance] by index or name based dictionary lookups.
The Question:
With this table of totals, I need to calculate each watershed’s relative rank for the amount of each substance released compared to what is released in other watersheds.
I could use a couple of nested loops to go through each substance column and convert this into a list, sort the list, and then relate this back to the watershed ID… but this would not be a very clean solution. Zero values also need to be omitted from ranking and duplicate values should be given the same rank while decreasing total number being ranked.
Is there a smarter way to do this? Or a module where this is already implemented? (didn’t see anything evident in pyTables)
One of the requirements is that the solution also remain simple enough so that those with even less python experience than I will at least be able to understand the process. I can use up to 2.7.1
The End Goal:
Generate HTML pages to be iframed from a Google Earth description bubble with the results. I have put a couple entirely unfinished sample outputs here.
For this I have created 2 functions
And