I am trying to create several new lists from one master list whereby the new lists contain similar items from the master list. Specifically, I have a list of bus routes. Here is a sample data set:
[u'Bus04_00_00_IB_pts_Line', u'Bus04_00_00_OB_pts_Line', u'Bus15_00_00_IB_pts_Line', u'Bus15_00_00_OB_pts_Line']
Most bus routes have an inbound (IB) and an outbound (OB) item. Some have multiple IBs and OBs, and some have only one item because they are loop routes. Eventually, I want to merge the IB and OB routes in mapping software (which I already know how to do)…
I originally created the filenames so that the first 5 characters represent the bus route, regardless of whether it’s IB or OB. Therefore, I am able to group similar items based on the first 5 characters. For example, when I write:
for route in routes:
    print route[0:5]
I get:
>>>
Bus04
Bus04
Bus15
Bus15
How can I “group” the files that pertain to Bus04 into one new list and those that pertain to Bus15 into another, such that I get:
[u'Bus04_00_00_IB_pts_Line', u'Bus04_00_00_OB_pts_Line'] and [u'Bus15_00_00_IB_pts_Line', u'Bus15_00_00_OB_pts_Line'] as separate lists?
I am thinking something along the lines of looping through each item and looking at its first five characters, then either creating a new list for each new five-character prefix that comes up (and adding the item to it) or, if a list for that prefix already exists, appending the item to it.
I’m having a hard time writing this out in code, so any help is greatly appreciated!
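I would use collections.defaultdict for this: key each route by its first five characters and append it to that key's list. This produces one list per route prefix.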
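A minimal sketch of that approach, assuming the route is always identified by the first five characters of the name. The variable names groups and grouped are only illustrative, and the sample list uses plain strings rather than the u'' literals from the question:

```python
from collections import defaultdict

routes = ['Bus04_00_00_IB_pts_Line', 'Bus04_00_00_OB_pts_Line',
          'Bus15_00_00_IB_pts_Line', 'Bus15_00_00_OB_pts_Line']

# defaultdict(list) creates an empty list the first time a prefix is seen,
# so there is no need to check whether the key already exists.
groups = defaultdict(list)
for route in routes:
    groups[route[:5]].append(route)

# One list per route prefix, ordered by prefix.
grouped = [groups[prefix] for prefix in sorted(groups)]
print(grouped)
```

Here groups['Bus04'] holds both Bus04 names and groups['Bus15'] holds both Bus15 names. A loop route with a single file would simply end up in a one-item list.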
Unlike some of the other solutions proposed thus far, this works irrespective of the order in which entries appear in the input list.
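To illustrate the point about ordering, here is a small sketch with the same sample names deliberately interleaved (the names shuffled and by_prefix are hypothetical). The dictionary lookup does not depend on matching routes being adjacent in the input:

```python
from collections import defaultdict

# Same sample names, interleaved so that matching routes are not adjacent.
shuffled = ['Bus15_00_00_OB_pts_Line', 'Bus04_00_00_IB_pts_Line',
            'Bus15_00_00_IB_pts_Line', 'Bus04_00_00_OB_pts_Line']

by_prefix = defaultdict(list)
for name in shuffled:
    by_prefix[name[:5]].append(name)

# Each prefix still collects both of its routes; within a group the
# items keep the order in which they appeared in the input.
print(by_prefix['Bus04'])
print(by_prefix['Bus15'])
```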