The Archive Base

Editorial Team
Asked: May 18, 2026 (2026-05-18T11:38:02+00:00)

I have two files:
1- one with 1,400,000 lines (records), 14 MB
2- one with 16,000,000 lines, 170 MB

I want to find out whether each record (line) in file 1 also appears in file 2.

I developed a Java app that does the following: it reads file 1 line by line and passes each line to a method that loops through file 2.

Here is my code:

public boolean hasIDin(String bioid) throws Exception {
    BufferedReader br = new BufferedReader(new FileReader("C://AllIDs.txt"));
    long bid = Long.parseLong(bioid);
    String thisLine;
    while ((thisLine = br.readLine()) != null) {
        if (Long.parseLong(thisLine) == bid)
            return true;
    }
    return false;
}

public void getMBD() throws Exception {
    BufferedReader br = new BufferedReader(new FileReader("C://DIDs.txt"));
    OutputStream os = new FileOutputStream("C://MBD.txt");
    PrintWriter pr = new PrintWriter(os);
    String thisLine;
    int count = 1;
    while ((thisLine = br.readLine()) != null) {
        String bioid = thisLine;
        System.out.println(count);
        if (!hasIDin(bioid))
            pr.println(bioid);
        count++;
    }
    pr.close();
}

When I run it, it looks like it will take about 1,944 hours to complete, because processing each line of file 1 takes about 5 seconds (1,400,000 lines × 5 s = 7,000,000 s). That is nearly three months!

Is there any way to get this done in much less time?

Thanks in advance.

1 Answer

  1. Editorial Team
    Added an answer on May 18, 2026 at 11:38 am

    Why don’t you:

    • read all the lines in file 1 (the smaller file) into a set. A plain Set is fine, but Trove's TLongHashSet would be more efficient because it stores primitive longs.
    • for each line in file 2, check whether it is in the set.
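The two steps above can be sketched with only the standard library. This is a minimal sketch, not the answerer's code: the class name `MissingIds` and the method `findMissing` are hypothetical, and it assumes each non-blank line holds one decimal id. Because the question writes out the ids from file 1 that are absent from file 2, the sketch removes every id seen in file 2 from the set, so whatever remains is missing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, named for illustration only.
class MissingIds {

    /** Returns the ids that appear in smallFile but never in largeFile. */
    static Set<Long> findMissing(Path smallFile, Path largeFile) throws IOException {
        // Step 1: load the smaller file into an in-memory set.
        Set<Long> candidates = new HashSet<>();
        try (BufferedReader br = Files.newBufferedReader(smallFile, StandardCharsets.UTF_8)) {
            for (String line; (line = br.readLine()) != null; ) {
                String id = line.trim();
                if (!id.isEmpty()) {
                    candidates.add(Long.parseLong(id));
                }
            }
        }
        // Step 2: stream the larger file once, dropping every id it contains.
        try (BufferedReader br = Files.newBufferedReader(largeFile, StandardCharsets.UTF_8)) {
            for (String line; !candidates.isEmpty() && (line = br.readLine()) != null; ) {
                String id = line.trim();
                if (!id.isEmpty()) {
                    candidates.remove(Long.parseLong(id));
                }
            }
        }
        return candidates; // ids from smallFile that were not found in largeFile
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MissingIds <smallFile> <largeFile>
        for (long id : findMissing(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(id);
        }
    }
}
```

Each file is read exactly once, so the work grows with the combined size of the two files rather than with their product. A `HashSet<Long>` boxes every id, so it will likely need noticeably more heap than a primitive set.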

    Here is a tuned implementation that uses less than 64 MB of memory and prints the following:

    Generating 1400000 ids to /tmp/DID.txt
    Generating 16000000 ids to /tmp/AllIDs.txt
    Reading ids in /tmp/DID.txt
    Reading ids in /tmp/AllIDs.txt
    Took 8794 ms to find 294330 valid ids
    

    Code

    import gnu.trove.set.hash.TLongHashSet; // Trove 3.x package; Trove 2.x uses gnu.trove.TLongHashSet

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;
    import java.util.Random;

    // The class name is chosen for illustration; the original snippet showed only the methods.
    public class FindCommonIds {
        public static void main(String... args) throws IOException {
            // Generate random test data of the same size as the real files.
            generateFile("/tmp/DID.txt", 1400000);
            generateFile("/tmp/AllIDs.txt", 16000000);

            long start = System.currentTimeMillis();
            TLongHashSet did = readLongs("/tmp/DID.txt");
            TLongHashSet validIDS = readLongsUnion("/tmp/AllIDs.txt", did);

            long time = System.currentTimeMillis() - start;
            System.out.println("Took " + time + " ms to find " + validIDS.size() + " valid ids");
        }

        // Reads every id in the file into a primitive long set.
        private static TLongHashSet readLongs(String filename) throws IOException {
            System.out.println("Reading ids in " + filename);
            TLongHashSet ids = new TLongHashSet();
            try (BufferedReader br = new BufferedReader(new FileReader(filename), 128 * 1024)) {
                for (String line; (line = br.readLine()) != null; )
                    ids.add(Long.parseLong(line));
            }
            return ids;
        }

        // Despite the name, this returns the intersection: the ids in the file that are also in validSet.
        private static TLongHashSet readLongsUnion(String filename, TLongHashSet validSet) throws IOException {
            System.out.println("Reading ids in " + filename);
            TLongHashSet ids = new TLongHashSet();
            try (BufferedReader br = new BufferedReader(new FileReader(filename), 128 * 1024)) {
                for (String line; (line = br.readLine()) != null; ) {
                    long val = Long.parseLong(line);
                    if (validSet.contains(val))
                        ids.add(val);
                }
            }
            return ids;
        }

        // Writes `number` random ids in the range [0, 2^26), one per line.
        private static void generateFile(String filename, int number) throws IOException {
            System.out.println("Generating " + number + " ids to " + filename);
            try (PrintWriter pw = new PrintWriter(new BufferedWriter(new FileWriter(filename), 128 * 1024))) {
                Random rand = new Random();
                for (int i = 0; i < number; i++)
                    pw.println(rand.nextInt(1 << 26));
            }
        }
    }
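If adding the Trove dependency is not an option, a sorted primitive array is one standard-library way to avoid boxing each id. The sketch below is an alternative under that assumption, not part of the implementation above; the class name `SortedIdLookup` is hypothetical. For 1,400,000 ids the array takes roughly 1,400,000 × 8 bytes ≈ 11 MB, and each lookup is a binary search instead of a hash probe.

```java
import java.util.Arrays;

// Hypothetical stdlib-only stand-in for a primitive long set.
class SortedIdLookup {
    private final long[] sortedIds;

    SortedIdLookup(long[] ids) {
        // Copy so the caller's array is left untouched, then sort once up front.
        sortedIds = ids.clone();
        Arrays.sort(sortedIds);
    }

    /** Returns true if id was among the ids passed to the constructor. */
    boolean contains(long id) {
        // Arrays.binarySearch returns a non-negative index only when the key is present.
        return Arrays.binarySearch(sortedIds, id) >= 0;
    }
}
```

For example, `new SortedIdLookup(new long[] {42L, 7L, 19L}).contains(19L)` is true. Lookups take O(log n) rather than the expected O(1) of a hash set, so this trades some speed for having no external dependency.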
    