I have two lists in files: File 1 has 200000 rows and looks like

Asked: June 17, 2026 (2026-06-17T14:05:25+00:00)

I have two lists stored in files.
File 1 has 200000 rows and looks like:

MAP2K4  FLNC
MYPN    ACTN2
ACVR1   FNTA
UGT2A1  HPGDS
RPA2    STAT3
ARF1    GGA3
ARF3    ARFIP2
ARF3    ARFIP1
AKR1A1  EXOSC4
RPA2    GAS7
APP     APPBP2
APLP1   DAB1
CITED2  TFAP2A
EP300   TFAP2A
APOB    MTTP
ARRB2   RALGDS
ARRB2   ZNF807

File 2 has 700000 rows and looks like:

MAP2K4  FLNC
MAP2K4  rs10036867
MAP2K4  ACTN2
MAP2K4  TEP1
ACTN2   MYPN
UGT2A1  NDUFAF6
RPA2    rs10109257
RPA2    rs10151961
GAS7    RPA2
APOB    PDZRN4
APOB    BICD1
ARRB2   ZNF807
ARRB2   FAM107B

I need to get the rows that match between these two lists, regardless of the order of the two elements within a row. For instance, for the example above the result should look like:

MAP2K4  FLNC
ACTN2   MYPN
RPA2    GAS7
ARRB2   ZNF807

I wrote the following, but this seems to take forever!

col0_file1 = []
col1_file1 = []
col0_file2 = []
col1_file2 = []
with open('File1') as f1, open('File2') as f2:
    for line in f1:
        col0,col1 = line.split()
        col0_file1.append(col0)
        col1_file1.append(col1)
    for line in f2:
        col0,col1 = line.split()
        col0_file2.append(col0)
        col1_file2.append(col1)

result = []
for x in range(len(col0_file1)):
    for i, j in zip(col0_file2, col1_file2):
        if i == col0_file1[x] and j == col1_file1[x]:
            result.append([i,j])
        elif j == col0_file1[x] and i == col1_file1[x]:
            result.append([i,j])

with open('matching', 'w') as out:
    for elem in result:
        out.write('{a} \n'.format(a = '\t'.join(elem)))

Is there any way I could reduce the complexity, or a better means of doing this?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T14:05:26+00:00

I'd say make two sets, with each pair sorted so that the order of its elements doesn't matter, and take their intersection:

with open('File1') as f1, open('File2') as f2:
    # Sort each pair so that (A, B) and (B, A) produce the same tuple
    columns_a = set(tuple(sorted(l.split())) for l in f1)
    columns_b = set(tuple(sorted(l.split())) for l in f2)

with open('matching', 'w') as out:
    # Set intersection keeps only the pairs present in both files
    for elem in columns_a & columns_b:
        out.write('{a}\n'.format(a='\t'.join(elem)))
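One side effect of sorting each pair is that the output columns come out in alphabetical order, and a set does not keep row order. If that matters, a variant along the following lines might help. It is only a sketch: `matching_rows` is a hypothetical helper name, the sample lists are a few rows copied from the question, and it assumes every non-blank line splits into whitespace-separated names. It builds the set from File 1 only, then walks File 2 once, so each row costs a hash lookup instead of the up to 200000 × 700000 comparisons of the nested loops.

```python
def matching_rows(lines_a, lines_b):
    """Return rows of lines_b whose pair also occurs in lines_a, in either order."""
    # Normalise each File 1 pair by sorting it, so (A, B) and (B, A) share one key.
    # Blank lines are skipped so they cannot produce a spurious empty match.
    seen = {tuple(sorted(line.split())) for line in lines_a if line.strip()}
    result = []
    for line in lines_b:
        pair = line.split()
        # Each membership test is a hash lookup rather than a scan of File 1.
        if pair and tuple(sorted(pair)) in seen:
            result.append(pair)  # keep the columns as File 2 wrote them
    return result


# A few rows taken from the excerpts in the question.
file1 = ["MAP2K4  FLNC", "MYPN    ACTN2", "RPA2    GAS7", "ARRB2   ZNF807", "APOB    MTTP"]
file2 = ["MAP2K4  FLNC", "MAP2K4  TEP1", "ACTN2   MYPN", "GAS7    RPA2", "ARRB2   ZNF807"]

for pair in matching_rows(file1, file2):
    print("\t".join(pair))
```

With real data you would pass the two open file objects instead of the lists. Note that a pair repeated in File 2 is reported once per occurrence here, whereas the set intersection reports it once.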
