I’m trying to split a 2D array into a specific format and can’t figure


Asked: May 24, 2026 (2026-05-24T05:11:13+00:00)


I’m trying to split a 2D array into a specific format and can’t figure out the last step. A sample of my data is structured as follows:

# Original Data
fileListCode = [['Seq3.xls', 'B08524_057'], 
                ['Seq3.xls', 'B08524_053'], 
                ['Seq3.xls', 'B08524_054'],
                ['Seq98.xls', 'B25034_001'], 
                ['Seq98.xls', 'D25034_002'], 
                ['Seq98.xls', 'B25034_003']]

I am trying to split it up so that it looks like this:

# split into [['Seq3.xls', {'B08524_057': 1, 'B08524_053': 2, 'B08524_054': 3}],
#             ['Seq98.xls', {'B25034_001': 1, 'D25034_002': 2, 'B25034_003': 3}]]

The dictionary values 1, 2, 3 give each entry’s position in the original list, counted from the first time its filename appears. To do this, I’ve first built a list of all the unique filenames (anything ending in .xls is a filename):

tmpFileList = []
tmpCodeList = []
arrayListDict = []

# store the unique filenames in a temporary list
for filename, _code in fileListCode:
    if filename not in tmpFileList:
        tmpFileList.append(filename)

However, I’m struggling with the next step. I can’t figure out a good way of pulling out the code names (B08524_057, for example) and converting them into a dictionary that maps each one to an index based on its position.

# make a list that pairs each filename with a dictionary for its codes
for filename in tmpFileList:
    arrayListDict.append([filename, {}])

This code just produces [['Seq3.xls', {}], ['Seq98.xls', {}]]. I’m not sure whether I should first build this structure and then add the codes and their dictionary values, or whether there is a better way.

EDIT: I just made the sample a little clearer by changing the values in fileListCode.


1 Answer

Editorial Team · Answer 1 · 2026-05-24T05:11:13+00:00

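With itertools.groupby this becomes much simpler. groupby only groups adjacent items, so the list is sorted by filename first; sorted is stable, so the codes keep their original order within each file: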

>>> import itertools, operator
>>> key = operator.itemgetter(0)
>>> grouped = itertools.groupby(sorted(fileListCode, key=key), key=key)
>>> [(i, {k[1]: n for n, k in enumerate(j, 1)}) for i, j in grouped]
[('Seq3.xls', {'B08524_057': 1, 'B08524_053': 2, 'B08524_054': 3}),
 ('Seq98.xls', {'B25034_001': 1, 'D25034_002': 2, 'B25034_003': 3})]

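For Python versions before 2.7, which lack dict comprehensions, use dict() with a generator expression. groupby returns a one-shot iterator, so grouped has to be rebuilt before it is used again:

>>> grouped = itertools.groupby(sorted(fileListCode, key=key), key=key)
>>> [(i, dict((k[1], n) for n, k in enumerate(j, 1))) for i, j in grouped]
[('Seq3.xls', {'B08524_057': 1, 'B08524_053': 2, 'B08524_054': 3}),
 ('Seq98.xls', {'B25034_001': 1, 'D25034_002': 2, 'B25034_003': 3})]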

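But I think using a dict for the outer structure as well would be better (again rebuilding grouped first, since the iterator is exhausted after one pass):

>>> grouped = itertools.groupby(sorted(fileListCode, key=key), key=key)
>>> {i: {k[1]: n for n, k in enumerate(j, 1)} for i, j in grouped}
{'Seq3.xls': {'B08524_057': 1, 'B08524_053': 2, 'B08524_054': 3},
 'Seq98.xls': {'B25034_001': 1, 'D25034_002': 2, 'B25034_003': 3}}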
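
If the filenames should stay in order of first appearance rather than sorted order, a single pass with plain dicts may also work. This is only a sketch under a few assumptions: group_codes, by_file and as_list are names made up here for illustration, and it relies on Python 3.7+, where dicts preserve insertion order.

```python
# Sample data copied from the question.
fileListCode = [['Seq3.xls', 'B08524_057'],
                ['Seq3.xls', 'B08524_053'],
                ['Seq3.xls', 'B08524_054'],
                ['Seq98.xls', 'B25034_001'],
                ['Seq98.xls', 'D25034_002'],
                ['Seq98.xls', 'B25034_003']]


def group_codes(pairs):
    """Map each filename to {code: 1-based position within that file}.

    Hypothetical helper, not part of any library.
    """
    result = {}
    for filename, code in pairs:
        # Fetch the inner dict for this file, creating it on first sight.
        codes = result.setdefault(filename, {})
        # Number the code by how many distinct codes the file has so far;
        # a code repeated within one file keeps its first number.
        codes.setdefault(code, len(codes) + 1)
    return result


by_file = group_codes(fileListCode)

# Convert to the list-of-lists layout asked for in the question.
as_list = [[filename, codes] for filename, codes in by_file.items()]
print(as_list)
```

No sorting is needed here, so this also handles input where rows for the same file are not adjacent.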
