The Archive Base

The Archive Base Latest Questions

Editorial Team
Asked: June 6, 2026 (2026-06-06T19:26:07+00:00)

How would I go about this, I have files which I have sorted the

How would I go about this? I have files in which the information is sorted, and I want to compare a certain index in one file with an index in another. One problem is that the files are enormously large: millions of lines. I want to compare the files line by line, and if the values match, I want to output both of them along with other values picked out by index.

=======================

Let me clarify. I want to take, say, line[x] (x stays the same because the lines are formatted uniformly) and compare it against line[y] in another file. I want to do this for the whole file and write every matching pair to another file. In that output file I also want to include other pieces from the first file, which just means adding more indexes, such as line[a], line[b], line[c], line[d], and finally line[y] as the match for that information.

Try 3:

I have a file with information in this format:

#x is a line

 x= data,data,data,data,data,data

There are millions of lines like that.

I have another file in the same format:

    #x is a line
    x= data,data,data,data

I want to take x[#] from the first file and x[#] from the second file and check whether the two values match. If they do, I want to output them along with several other x[#] values from the second file that are on the same line.

Does that help?
As I said, the files are formatted like this (but with millions of lines, and I want to find the pairs across the two files because they should all match up):

  line 1  data,data,data,data
  line 2  data,data,data,data

data from file 1:

 (N'068D556A1A665123A6DD2073A36C1CAF', N'A76EEAF6D310D4FD2F0BD610FAC02C04DFE6EB67',    
N'D7C970DFE09687F1732C568AE1CFF9235B2CBB3673EA98DAA8E4507CC8B9A881');

data from file 2:

00000040f2213a27ff74019b8bf3cfd1|index.docbook|Redhat 7.3 (32bit)|Linux
00000040f69413a27ff7401b8bf3cfd1|index.docbook|Redhat 8.0 (32bit)|Linux
00000965b3f00c92a18b2b31e75d702c|Localizable.strings|Mac OS X 10.4|OSX
0000162d57845b6512e87db4473c58ea|SYSTEM|Windows 7 Home Premium (32bit)|Windows
000011b20f3cefd491dbc4eff949cf45|totem.devhelp|Linux Ubuntu Desktop 9.10 (32bit)|Linux

Both files are sorted alphanumerically, and I want to use a slider method. By that I mean: if file1[x] < file2[x], advance the slider in file1, and if file1[x] > file2[x], advance the slider in file2, until a match is found. When there is a match, print it along with the other values that identify that hash.

What I want as a result would be:

file1[x] and its corresponding match file2[x] written to a file, along with other file1[x] values, where x can be any index in the line.


1 Answer

  1. Editorial Team
    Added an answer on June 6, 2026 at 7:26 pm

    What I got from the clarification:

    • file1 and file2 are in the same format, where each line looks like

      {32 char hex key}|{text1}|{text2}|{text3}
      
    • the files are sorted in ascending order by key

    • for each key that appears in both file1 and file2, you want merged output, so each line looks like

      {32 char hex key}|{text11}|{text12}|{text13}|{text21}|{text22}|{text23}
      

    You basically want the collisions from a merge sort:

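    import csv
    
    def getnext(reader, key=lambda row: int(row[0], 16)):
        # Read the next row and return (numeric key, row).
        # Raises StopIteration when the file is exhausted.
        row = next(reader)
        return key(row), row
    
    # newline='' is what the csv module expects for files it reads or writes
    with open('file1.dat', newline='') as inf1, \
         open('file2.dat', newline='') as inf2, \
         open('merged.dat', 'w', newline='') as outf:
        a = csv.reader(inf1, delimiter='|')
        b = csv.reader(inf2, delimiter='|')
        res = csv.writer(outf, delimiter='|')
    
        try:
            a_key, a_row = getnext(a)
            b_key, b_row = getnext(b)
            while True:
                if a_key < b_key:
                    # file1 is behind: advance it
                    a_key, a_row = getnext(a)
                elif b_key < a_key:
                    # file2 is behind: advance it
                    b_key, b_row = getnext(b)
                else:
                    # same key in both files: write the merged line,
                    # then move on so the match is not written again
                    res.writerow(a_row + b_row[1:])
                    a_key, a_row = getnext(a)
        except StopIteration:
            # reached the end of an input file
            pass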
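The code above reads both files with `delimiter='|'`, but the file 1 sample in the question looks like `(N'...', N'...', N'...');` rather than a pipe-delimited line. A minimal sketch of one way to bring such a record into the same shape follows. It assumes that each record sits on a single line in the real file (the sample in the post may simply have wrapped) and that the first quoted field is the key to compare. The names `FIELD_RE`, `parse_file1_line` and `to_pipe_line` are made up for this illustration.

```python
import re

# Matches each N'...' literal in a record such as
# (N'068D...', N'A76E...', N'D7C9...');
FIELD_RE = re.compile(r"N'([^']*)'")

def parse_file1_line(line):
    """Return the quoted fields of one file1 record, lowercased."""
    return [field.lower() for field in FIELD_RE.findall(line)]

def to_pipe_line(line):
    """Convert one file1 record to the key|text|text form that file2 uses."""
    return '|'.join(parse_file1_line(line))
```

Lowercasing is only there so the output looks like file 2. The comparison itself goes through `int(..., 16)`, which accepts either case. If records really are split across lines in the file, they would need to be joined up to the closing `);` before parsing.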
    

    I still have no idea what you are trying to communicate by ‘as well as other file1[x] where x can be any index from the line’.
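If it means picking particular columns out of each matched line, one possible reading can be sketched as a generator that takes the column indexes as parameters. This is only a guess at the intent. The function name `matching_rows` and the parameters `key1`, `key2`, `keep1` and `keep2` are hypothetical, and it assumes both inputs are sorted ascending by a hex key.

```python
def matching_rows(rows1, rows2, key1=0, key2=0, keep1=(0,), keep2=(0,)):
    """Yield selected fields for every key present in both sorted inputs.

    rows1 and rows2 are iterables of lists of strings, each sorted
    ascending by the hex key found at index key1 / key2.
    keep1 and keep2 list the column indexes to copy from each row.
    """
    it1, it2 = iter(rows1), iter(rows2)
    try:
        r1, r2 = next(it1), next(it2)
        while True:
            k1, k2 = int(r1[key1], 16), int(r2[key2], 16)
            if k1 < k2:
                r1 = next(it1)          # first input is behind
            elif k2 < k1:
                r2 = next(it2)          # second input is behind
            else:
                # keys match: emit the requested columns from both rows
                yield [r1[i] for i in keep1] + [r2[i] for i in keep2]
                r1 = next(it1)
    except StopIteration:
        # one of the inputs is exhausted, so no further matches are possible
        return
```

Because it only ever holds one row from each input, it can be fed two `csv.reader` objects directly, for example `matching_rows(a, b, keep1=(0, 1, 2), keep2=(1,))`, and each yielded list can be passed to `res.writerow`.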
