Using Python 3.2 I was hoping to solve the below issue. My data consist of hundreds of rows (signifying a project) and 21 columns. The first of which is a unique project ID and the other 20 columns is the group of people, or person, that led the project. person_1 is always filled and if there is a name in person_3 that means 3 people are working together. If there is a name in person_18 that means 18 people are working together.
I have an excel spreadsheet that is setup the following way:
unique ID person_1 person _2 person_3 person_4 ... person_20
12 Tom Sally Mike
16 Joe Mike
5 Joe Sally
1 Sally Mike Tom
6 Sally Tom Mike
2 Jared Joe Mike John ... Carl
I want to do a few things:
1) Make a column that will give me a unique ‘Group Name’ which will be, using unique ID 1 as my example, Sally/Mike/Tom. So it will be the names separated by ‘/’.
2) How can I treat, from my example, Sally/Mike/Tom the same as Sally/Tom/Mike. Meaning, I would like another column that makes the group name in alphabetical order (no matter the actual permutation), still separated by ‘/’.
3) This question is similar to (2). However, I want the person listed in person_1 to matter. Meaning Joe/Tom/Mike is different from Tom/Joe/Mike but not different than Joe/Mike/Tom. So there will be another column that keeps person_1 at the start of the group name but alphabetizes person_2 through person_20 if applicable (i.e., if the project has more than 1 person on it).
Thanks for the help and suggestions
The previous answer gave a clear statement of method, but perhaps you are stuck on either the string processing or the csv processing. Both are demonstrated in the following code. The relevant string methods are
sortedandjoin.'/'.jointellsjointo use/as separator between joined items. The+operator between lists intnameandwriterowstatements concatenates the lists. Acsv.readeris an iterator that delivers one list per row, and acsv.writerconverts a list to a row and writes it out. You will want to add error testing to the file opens, etc. The data file used to test this code is shown after the code.File
xgroup.csvis shown next.Upon reading data as above, the program prints
Opened reader and writerandClosed reader and writer after 7 linesand produces output in filexgroup3.csvas shown next.Note, given a data line like
instead of
the program as above produces
instead of
If that’s a problem, filter out empty entries. For example, if
row=['5', 'Joe', 'Sally', '', '', '', '', ''], then'/'.join(row[1:])produces'Joe/Sally/////', while'/'.join(filter(lambda x: x, row[1:]))and'/'.join(x for x in row[1:] if x)and'/'.join(filter(len, row[1:]))produce'Joe/Sally'.