The Archive Base

Editorial Team
Asked: June 4, 2026

I have 50,000,000 (integer, string) pairs in a text file. The integers are times


I have 50,000,000 (integer, string) pairs in a text file. The integers are times in milliseconds, so they are 13 digits long (e.g. 1337698339089).

The entries in the text file are like this:

1337698339089|blaasdasd
1337698339089|asdasdas
1337698338089|kasda

There can be identical entries.

I want to sort the entries on the integers (in ascending order) preserving any duplicate integers and preserving the (integer, string) pairs. The approach I have taken is leading to memory errors, and so I’m looking for alternative approaches.

My approach is something like this (using some pseudo-code):

// declare TreeMap to do the sorting
TreeMap<Double, String> sorted = new TreeMap<Double, String>();

// loop through entries in text file, and put each in the treemap:
for each entry (integer, string) in the text file:

   Random rand = new Random();
   double inc = 0.0;

   while (sorted.get(integer + inc) != null) {
       inc = rand.nextDouble();
   }

   sorted.put(integer + inc, string);

I am using random numbers here to ensure that duplicate integers can be entered in the treemap (by incrementing them by a double between 0 and 1).

// to print the sorted entries:
for (Double d : sorted.keySet()) {
    // truncate rather than round, so an increment >= 0.5 cannot bump the time by 1
    System.out.println(d.longValue() + "|" + sorted.get(d));
}

This approach works but breaks down for 50,000,000 entries (I think because the treemap is becoming too large; or possibly because the while loop is running for too long).

I would like to know what approach more experienced programmers would take.

Many thanks!

1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 6:25 am

    You should be able to do this with a list, if you have enough memory. I would create a separate class for the entry:

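    class Foo implements Comparable<Foo> {
        private final long time;
        private final String text;

        // Constructor etc

        @Override
        public int compareTo(Foo other) {
            return Long.compare(time, other.time); // order by time only
        }
    }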
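
A minimal sketch of the list approach, assuming every line has the form time|text and that the whole file fits in the heap. The names TimedEntry, ListSortSketch, and sortLines are made up for illustration. Collections.sort is a stable sort, so entries with equal times keep their input order and no entry is overwritten, which is what a map keyed on the time would do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical entry type: the same shape as Foo above, plus a parse helper.
final class TimedEntry implements Comparable<TimedEntry> {
    final long time;
    final String text;

    TimedEntry(long time, String text) {
        this.time = time;
        this.text = text;
    }

    // Splits "1337698339089|blaasdasd" at the first '|' only.
    static TimedEntry parse(String line) {
        int bar = line.indexOf('|');
        return new TimedEntry(Long.parseLong(line.substring(0, bar)), line.substring(bar + 1));
    }

    @Override
    public int compareTo(TimedEntry other) {
        return Long.compare(time, other.time); // compare on the time only
    }

    @Override
    public String toString() {
        return time + "|" + text;
    }
}

final class ListSortSketch {
    // Parses every line, sorts by time (stable), and returns the lines in sorted order.
    static List<String> sortLines(List<String> lines) {
        List<TimedEntry> entries = new ArrayList<>(lines.size());
        for (String line : lines) {
            entries.add(TimedEntry.parse(line));
        }
        Collections.sort(entries);
        List<String> out = new ArrayList<>(entries.size());
        for (TimedEntry e : entries) {
            out.add(e.toString());
        }
        return out;
    }
}
```

For the real file you would fill the list from a BufferedReader and write the result out line by line rather than building a second list; the in-memory list of strings here only keeps the sketch short.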
    

    In terms of memory, you need to be able to store 50 million instances, and references to them. On a 32-bit JVM, that would be:

    • 8 bytes of overhead per object (IIRC)
    • 8 bytes for the time
    • 4 bytes for the text field
    • ~54 bytes for the string (8 byte overhead + three int fields IIRC + char[] array reference + ~32 bytes for a 10 character array)
    • 4 bytes for the reference in the array or ArrayList

    So that’s about 80 bytes per instance – say 100 to round up. To store 50,000,000 of those would take 5,000,000,000 bytes, aka 5GB, which is more than I believe a 32-bit JVM will cope with.

    So to do all this in memory, you’ll need a 64-bit machine and 64-bit JVM, and then the overhead potentially increases somewhat due to larger references etc. Feasible, but not terribly pleasant.

    A large part of this is due to the strings, however. If you really wanted to be efficient, you could create a giant char array, then store offsets into it within Foo. Read into the array as you read the text data, and then use it to write out the data after sorting. More complex, and ugly, but considerably more memory-efficient.
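
The shared-buffer idea could look roughly like the sketch below. It is only an illustration under assumptions: PackedEntries, Slice, add, and sortedLines are hypothetical names, and a StringBuilder stands in for the giant char array. Each entry keeps a time, an offset, and a length instead of its own String object.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class PackedEntries {
    // One shared buffer holds the text of every entry, back to back.
    private final StringBuilder chars = new StringBuilder();
    private final List<Slice> slices = new ArrayList<>();

    // Hypothetical replacement for Foo: no String field, only a position in the buffer.
    private static final class Slice {
        final long time;
        final int offset;
        final int length;

        Slice(long time, int offset, int length) {
            this.time = time;
            this.offset = offset;
            this.length = length;
        }
    }

    // Parses "time|text", appends the text to the buffer, and records where it landed.
    void add(String line) {
        int bar = line.indexOf('|');
        long time = Long.parseLong(line.substring(0, bar));
        int offset = chars.length();
        chars.append(line, bar + 1, line.length());
        slices.add(new Slice(time, offset, line.length() - bar - 1));
    }

    // Sorts the small Slice objects by time (List.sort is stable) and rebuilds each line.
    List<String> sortedLines() {
        slices.sort(Comparator.comparingLong((Slice s) -> s.time));
        List<String> out = new ArrayList<>(slices.size());
        for (Slice s : slices) {
            out.add(s.time + "|" + chars.substring(s.offset, s.offset + s.length));
        }
        return out;
    }
}
```

The buffer itself never moves during the sort; only the small Slice objects are reordered. In a real run you would write each rebuilt line straight to the output file instead of collecting them in a list.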

    Alternatively, you could do this not all in memory – I’m sure if you search around you’ll find lots of information about sorting via the file system.
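
One common way to sort via the file system is an external merge sort: sort chunks that fit in memory, write each one to a temporary file, then merge the sorted files. The sketch below is one possible shape of that, not a tuned implementation. ExternalSortSketch, Cursor, and the chunkLines parameter are hypothetical names, and error handling is kept minimal.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class ExternalSortSketch {
    // The time is everything before the first '|'.
    static long timeOf(String line) {
        return Long.parseLong(line.substring(0, line.indexOf('|')));
    }

    // chunkLines (an assumed tuning knob) is how many lines are held in memory at once.
    static void sort(Path input, Path output, int chunkLines) throws IOException {
        List<Path> runs = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            List<String> chunk = new ArrayList<>(chunkLines);
            String line;
            while ((line = in.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkLines) {
                    runs.add(writeRun(chunk));
                    chunk.clear();
                }
            }
            if (!chunk.isEmpty()) {
                runs.add(writeRun(chunk));
            }
        }
        merge(runs, output);
        for (Path run : runs) {
            Files.deleteIfExists(run);
        }
    }

    // Phase 1: sort one chunk in memory (stable) and write it to a temporary file.
    private static Path writeRun(List<String> chunk) throws IOException {
        chunk.sort(Comparator.comparingLong(ExternalSortSketch::timeOf));
        Path run = Files.createTempFile("run", ".txt");
        Files.write(run, chunk, StandardCharsets.UTF_8);
        return run;
    }

    // One open run together with its current line.
    private static final class Cursor {
        final BufferedReader reader;
        final int runIndex;
        String line;
        long time;

        Cursor(BufferedReader reader, int runIndex) {
            this.reader = reader;
            this.runIndex = runIndex;
        }

        // Moves to the next line; closes the reader and returns false at end of file.
        boolean advance() throws IOException {
            line = reader.readLine();
            if (line == null) {
                reader.close();
                return false;
            }
            time = timeOf(line);
            return true;
        }
    }

    // Phase 2: k-way merge. Ties go to the earlier run, so input order is kept for equal times.
    private static void merge(List<Path> runs, Path output) throws IOException {
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                Comparator.comparingLong((Cursor c) -> c.time).thenComparingInt(c -> c.runIndex));
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (int i = 0; i < runs.size(); i++) {
                Cursor c = new Cursor(Files.newBufferedReader(runs.get(i), StandardCharsets.UTF_8), i);
                if (c.advance()) {
                    heap.add(c);
                }
            }
            while (!heap.isEmpty()) {
                Cursor c = heap.poll();
                out.write(c.line);
                out.newLine();
                if (c.advance()) {
                    heap.add(c);
                }
            }
        }
    }
}
```

Memory use is bounded by chunkLines during the first phase and by one line per run during the merge. Re-parsing the time on every comparison in writeRun is simple but slow; parsing once into an entry class first would be the obvious refinement.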
