Editorial Team
Asked: May 30, 2026

I have two CSV files (three columns) which I need to compare and extract

I have two CSV files. The first has three columns; I need to compare it against the second, which has additional columns, and extract the rows of the second file that match. Example files:

File1:

ATGCGCGACAGT, ch3, 123546
ATGCATACAGGATAT, ch2, 5141561615

... and so on, approximately 100 entries

File2:

ATGCGGCGACAGT,ch3, 123456,mi141515, AUCAGCUAUAUAU, UACGCAGAUAUAUA
ATCAGACGATTATGA, ch4, 4564764, mi653453, AUCAGCAAUUUUCG, AUACAGACAAAAA

... and so on, approximately 50,000 entries

I need to match columns 1, 2 and 3 between the two files: all three columns of a row in file1 must match the same columns of a row in file2. When they do, I want to extract columns 4, 5 and 6 from file2 for further processing.

I was thinking of:

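import csv

# Assumes file1 and file2 are lists of rows already read with csv.reader.
with open('parsed_out', 'w', newline='') as out:
    fhout = csv.writer(out, delimiter=',')
    for row1 in file1:
        for row2 in file2:
            # Compare the three key columns; on a match, write columns 4-6.
            if row1[:3] == row2[:3]:
                fhout.writerow(row2[3:6])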

But somebody told me that I could use hashing, or some better approach, rather than writing such big nested loops when file1 has 10,000 or more entries.

Please suggest a better way to achieve this. Both file1 and file2 are parsed from more complex files.

1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 3:51 pm

    Try something like:

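    import csv

    # Collect every row of file 1 as a tuple so it can be compared directly.
    file_1_tuples = []

    with open("file_1.csv", newline="") as fh:
        csv_reader = csv.reader(fh)
        for row in csv_reader:
            file_1_tuples.append(tuple(row))

    # Print columns 4-6 of each file 2 row whose first three columns match.
    with open("file_2.csv", newline="") as fh:
        csv_reader = csv.reader(fh)
        for row in csv_reader:
            if tuple(row[0:3]) in file_1_tuples:
                print(row[3:6])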
    

    When run with the following data:

    file_1.csv

    person, john, smith
    person, anne, frank
    person, bob, macdonald
    fruit, orange, banana
    fruit, strawberry, fields
    fruit, ringring, banana
    

    file_2.csv

    person, john, smith, 1, 2, 3
    person, anne, frank, 4, 5, 6
    person, bob, macdonald, 7, 8, 9
    

    it produces the output

    [' 1', ' 2', ' 3']
    [' 4', ' 5', ' 6']
    [' 7', ' 8', ' 9']
    

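    EDIT: A slightly nicer implementation using sets and list comprehensions. A set uses hashing, so each membership test takes roughly constant time on average instead of scanning the whole list: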

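    import csv
    import pprint

    # A set is hash-based, so each membership test is fast on average.
    with open("file_1.csv", newline="") as fh:
        csv_reader = csv.reader(fh)
        file_1_tuples = {tuple(row) for row in csv_reader}

    # Keep every file 2 row whose first three columns appear in file 1.
    with open("file_2.csv", newline="") as fh:
        csv_reader = csv.reader(fh)
        matched_rows = [row for row in csv_reader if tuple(row[:3]) in file_1_tuples]

    pprint.pprint(matched_rows)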
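
Since the question asks for columns 4 to 6 written to an output file rather than printed, the set-based lookup could be adapted roughly as follows. This is only a sketch: the function name `extract_matches` and its parameters are hypothetical, and it assumes both inputs are iterables of rows already parsed by `csv.reader`. In-memory strings stand in for the real files.

```python
import csv
import io


def extract_matches(file_1_rows, file_2_rows, out_handle):
    """Write columns 4-6 of every file 2 row whose first three columns
    appear as a row in file 1. Returns the number of rows written."""
    # A set gives average constant-time membership tests (hash lookup).
    keys = {tuple(row[:3]) for row in file_1_rows}
    writer = csv.writer(out_handle)
    written = 0
    for row in file_2_rows:
        if tuple(row[:3]) in keys:
            writer.writerow(row[3:6])
            written += 1
    return written


# Usage with in-memory data standing in for the two CSV files.
file_1 = io.StringIO("person,john,smith\nfruit,orange,banana\n")
file_2 = io.StringIO("person,john,smith,1,2,3\nperson,anne,frank,4,5,6\n")
out = io.StringIO()
count = extract_matches(csv.reader(file_1), csv.reader(file_2), out)
print(count, repr(out.getvalue()))
```

With real files, the same function could be passed readers over handles opened with `newline=""`, which is what the `csv` module documentation recommends.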
    

    EDIT 2: Note that this implementation is sensitive to whitespace within the CSV file, because ' john' and 'john' are different strings. If the spacing in your CSV files is inconsistent, use something like row = [element.strip(' ') for element in row] to strip leading and trailing spaces from each field before comparing.
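
The whitespace handling might look like the following sketch. The helper name `normalise` is hypothetical, and the in-memory strings are made-up stand-ins for the two files. `skipinitialspace=True` is a standard `csv.reader` option that drops spaces directly after a delimiter; the explicit `strip()` also catches trailing spaces (note that a bare `strip()` removes all surrounding whitespace, not only spaces).

```python
import csv
import io


def normalise(row):
    """Return the row as a tuple with surrounding whitespace removed."""
    return tuple(element.strip() for element in row)


# In-memory stand-ins for the two files, with inconsistent spacing.
file_1 = io.StringIO("person, john, smith\n")
file_2 = io.StringIO("person,john ,smith, 1, 2, 3\nperson, anne, frank, 4, 5, 6\n")

file_1_keys = {normalise(row) for row in csv.reader(file_1, skipinitialspace=True)}

# Keep columns 4-6 of each file 2 row whose cleaned key columns match file 1.
matched = [
    normalise(row)[3:6]
    for row in csv.reader(file_2, skipinitialspace=True)
    if normalise(row)[:3] in file_1_keys
]
print(matched)
```

Normalising both files the same way before building the set means the comparison no longer depends on how each file happened to be spaced.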
