The Archive Base


Editorial Team
Asked: May 23, 2026

I am in the process of writing an application that processes a huge number

I am writing an application that processes a huge number of integers from a binary file (up to 50 MB). I need it to be as fast as possible, and the main performance issue is disk access time. Since I make a large number of reads from the disk, optimizing read time would improve the performance of the app in general.

Up until now I thought that the fewer blocks I split my file into (i.e. the fewer reads I make and the larger each read is), the faster my app should work. This is because an HDD is very slow at seeking, i.e. locating the beginning of a block, due to its mechanical nature. However, once it has located the beginning of the block you asked for, it should perform the actual read fairly quickly.
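The seek-versus-transfer reasoning above can be written down as a rough cost model: total time is the number of seeks times the seek latency, plus the bytes transferred divided by the throughput. The sketch below is only an illustration; `ReadCostModel`, the 10 ms seek latency and the 100 MB/s transfer rate are assumed figures, not measurements from this question.

```java
/** Back-of-the-envelope cost model for reads from a spinning disk (illustrative only). */
final class ReadCostModel {

    /**
     * Estimated time in milliseconds to perform `seeks` seeks and transfer
     * `totalBytes` bytes in total. seekMillis and mbPerSecond are assumptions
     * supplied by the caller, not values measured from a real drive.
     */
    static double estimateMillis(int seeks, long totalBytes, double seekMillis, double mbPerSecond) {
        double transferMillis = totalBytes / (mbPerSecond * 1024 * 1024) * 1000.0;
        return seeks * seekMillis + transferMillis;
    }

    public static void main(String[] args) {
        double seekMillis = 10.0;    // assumed average seek plus rotational latency
        double mbPerSecond = 100.0;  // assumed sustained sequential transfer rate
        long oneMegabyte = 1024 * 1024;

        // One seek, then 1 MB read sequentially. In this model it makes no
        // difference whether that 1 MB arrives in one call or in ten.
        System.out.printf("1 seek, 1 MB:   %.1f ms%n",
                estimateMillis(1, oneMegabyte, seekMillis, mbPerSecond));

        // Ten separate seeks, with the same 1 MB split across them.
        System.out.printf("10 seeks, 1 MB: %.1f ms%n",
                estimateMillis(10, oneMegabyte, seekMillis, mbPerSecond));
    }
}
```

With these assumed figures the model gives about 20 ms for the single-seek case and about 110 ms for the ten-seek case. It also suggests that ten back-to-back reads after a single seek should cost roughly the same as one large read, since only the number of seeks differs between strategies.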

Well, that was up until I ran this test:

(Old test removed; it had issues due to HDD caching.)

NEW TEST (the HDD cache doesn't help here, since the file is too big (1 GB) and I access random locations within it):

    // Requires: import java.io.FileInputStream; import java.io.RandomAccessFile;
    // The enclosing method must declare "throws IOException".
    int mega = 1024 * 1024;
    int giga = 1024 * 1024 * 1024;
    byte[] bigBlock = new byte[mega];
    int hundredKilo = mega / 10; // 104,857 bytes, so the ten small blocks total 1,048,570 bytes
    byte[][] smallBlocks = new byte[10][hundredKilo];
    String location = "C:\\Users\\Vladimir\\Downloads\\boom.avi";
    RandomAccessFile raf;
    FileInputStream f;
    long start;
    long end;
    int position;
    java.util.Random rand = new java.util.Random();
    int bigBufferTotalReadTime = 0;
    int smallBufferTotalReadTime = 0;

    // Test 1: seek to a random position, then read 1 MB with a single read() call.
    // Note: read() may return fewer bytes than the array length; the return value is not checked.
    for (int j = 0; j < 100; j++)
    {
        position = rand.nextInt(giga);
        raf = new RandomAccessFile(location, "r");
        raf.seek((long) position);
        f = new FileInputStream(raf.getFD());
        start = System.currentTimeMillis();
        f.read(bigBlock);
        end = System.currentTimeMillis();
        bigBufferTotalReadTime += end - start;
        f.close(); // also closes the file descriptor shared with raf
    }

    // Test 2: seek to a random position, then read ten ~100 KB blocks back to back.
    for (int j = 0; j < 100; j++)
    {
        position = rand.nextInt(giga);
        raf = new RandomAccessFile(location, "r");
        raf.seek((long) position);
        f = new FileInputStream(raf.getFD());
        start = System.currentTimeMillis();
        for (int i = 0; i < 10; i++)
        {
            f.read(smallBlocks[i]);
        }
        end = System.currentTimeMillis();
        smallBufferTotalReadTime += end - start;
        f.close();
    }

    // Average time per iteration, in whole milliseconds.
    System.out.println("Average performance of small buffer: " + (smallBufferTotalReadTime / 100));
    System.out.println("Average performance of big buffer: " + (bigBufferTotalReadTime / 100));

RESULTS:
Average for small buffer – 35ms
Average for large buffer – 40ms ?!
(Tried on Linux and Windows; in both cases the larger block size results in a longer read time. Why?)

After running this test many times, I have realised that, for some reason, reading one big block takes longer on average than reading 10 smaller blocks sequentially. I thought it might be a result of Windows being too smart and trying to optimize something in its file system, so I ran the same code on Linux and, to my surprise, got the same result.

I have no clue as to why this is happening. Could anyone please give me a hint? Also, what would be the best block size in this case?
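One thing worth ruling out when comparing the two loops: a single `InputStream.read(byte[])` call is allowed to return fewer bytes than the array length, so the big-block and small-block loops may not transfer the same amount of data. The helper below is a minimal sketch of a read that keeps going until the buffer is full or the stream ends; `FullReads` and `readFully` are hypothetical names introduced here, not part of the test above.

```java
import java.io.IOException;
import java.io.InputStream;

final class FullReads {

    /**
     * Reads from `in` until `buf` is full or the stream ends.
     * Returns the number of bytes actually stored in `buf`, which lets a
     * benchmark confirm that each strategy read the same amount of data.
     */
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream reached before the buffer was full
            }
            total += n;
        }
        return total;
    }
}
```

Summing the returned counts in each loop and printing them next to the timings would show whether the two measurements are comparable.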

Kind Regards


1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 5:07 pm

    After you read the data the first time, it will be in the disk cache, so the second read should be much faster. To see the effect, run the test you think is faster first. 😉

    If you have 50 MB of memory, you should be able to read the entire file at once.
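Reading the whole file in one go could look like the sketch below. It assumes the file holds 4-byte integers in a known byte order, which the question does not state; `WholeFileInts` and `readAllInts` are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

final class WholeFileInts {

    /**
     * Loads the entire file into memory and decodes it as 4-byte integers.
     * The byte order is an assumption supplied by the caller; up to three
     * trailing bytes that do not form a complete int are ignored.
     */
    static int[] readAllInts(Path file, ByteOrder order) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        IntBuffer view = ByteBuffer.wrap(raw).order(order).asIntBuffer();
        int[] values = new int[view.remaining()];
        view.get(values);
        return values;
    }
}
```

For a 50 MB file this needs roughly 50 MB for the raw bytes plus another 50 MB for the `int[]`, so the heap has to be large enough for both while the method runs.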


    package com.google.code.java.core.files;
    
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    
    public class FileReadingMain {
        public static void main(String... args) throws IOException {
            // Create a 50 MB file of zeros; after writing, it sits in the OS disk cache.
            File temp = File.createTempFile("deleteme", "zeros");
            temp.deleteOnExit();
            try (FileOutputStream fos = new FileOutputStream(temp)) {
                fos.write(new byte[50 * 1024 * 1024]);
            }
    
            // Three passes over block sizes from 1 MB down to 512 bytes, halving each time.
            for (int i = 0; i < 3; i++)
                for (int blockSize = 1024 * 1024; blockSize >= 512; blockSize /= 2) {
                    readFileNIO(temp, blockSize);
                    readFile(temp, blockSize);
                }
        }
    
        // Plain IO: read the file through a FileInputStream in blockSize chunks.
        private static void readFile(File temp, int blockSize) throws IOException {
            long start = System.nanoTime();
            byte[] bytes = new byte[blockSize];
            int r;
            // Re-read the whole file for about 2 seconds; r counts the completed passes.
            for (r = 0; System.nanoTime() - start < 2e9; r++) {
                try (FileInputStream fis = new FileInputStream(temp)) {
                    while (fis.read(bytes) > 0) ;
                }
            }
            long time = System.nanoTime() - start;
            // Average time per full pass over the file, in milliseconds.
            System.out.printf("IO: Reading took %.3f ms using %,d byte blocks%n", time / r / 1e6, blockSize);
        }
    
        // NIO: read the file through a FileChannel into a direct ByteBuffer of blockSize bytes.
        private static void readFileNIO(File temp, int blockSize) throws IOException {
            long start = System.nanoTime();
            ByteBuffer bytes = ByteBuffer.allocateDirect(blockSize);
            int r;
            for (r = 0; System.nanoTime() - start < 2e9; r++) {
                try (FileChannel fc = new FileInputStream(temp).getChannel()) {
                    while (fc.read(bytes) > 0) {
                        bytes.clear(); // discard the data so the buffer can be refilled
                    }
                }
            }
            long time = System.nanoTime() - start;
            System.out.printf("NIO: Reading took %.3f ms using %,d byte blocks%n", time / r / 1e6, blockSize);
        }
    }
    

    On my laptop this prints

    NIO: Reading took 57.255 ms using 1,048,576 byte blocks
    IO: Reading took 112.943 ms using 1,048,576 byte blocks
    NIO: Reading took 48.860 ms using 524,288 byte blocks
    IO: Reading took 78.002 ms using 524,288 byte blocks
    NIO: Reading took 41.474 ms using 262,144 byte blocks
    IO: Reading took 61.744 ms using 262,144 byte blocks
    NIO: Reading took 41.336 ms using 131,072 byte blocks
    IO: Reading took 56.264 ms using 131,072 byte blocks
    NIO: Reading took 42.184 ms using 65,536 byte blocks
    IO: Reading took 64.700 ms using 65,536 byte blocks
    NIO: Reading took 41.595 ms using 32,768 byte blocks <= close to fastest for NIO (131,072 took 41.336 ms)
    IO: Reading took 49.385 ms using 32,768 byte blocks <= fastest for IO
    NIO: Reading took 49.676 ms using 16,384 byte blocks
    IO: Reading took 59.731 ms using 16,384 byte blocks
    NIO: Reading took 55.596 ms using 8,192 byte blocks
    IO: Reading took 74.191 ms using 8,192 byte blocks
    NIO: Reading took 77.148 ms using 4,096 byte blocks
    IO: Reading took 84.943 ms using 4,096 byte blocks
    NIO: Reading took 104.242 ms using 2,048 byte blocks
    IO: Reading took 112.768 ms using 2,048 byte blocks
    NIO: Reading took 177.214 ms using 1,024 byte blocks
    IO: Reading took 185.006 ms using 1,024 byte blocks
    NIO: Reading took 303.164 ms using 512 byte blocks
    IO: Reading took 316.487 ms using 512 byte blocks
    

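    It appears that the optimal read size may be around 32 KB: it was the fastest for plain IO, while NIO was about equally fast from 32 KB to 256 KB. Note: as the file is entirely in the disk cache, this may not be the optimal size for a file which is read from disk.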
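If the file should not be held in memory all at once, the same integers can be processed block by block. The sketch below sums 4-byte big-endian integers using a `FileChannel` and a direct buffer whose size is passed in, so the 32 KB figure from the measurements above can be tried alongside other sizes. The class name `ChunkedIntSum`, the big-endian layout and the choice of a sum as the processing step are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChunkedIntSum {

    /**
     * Sums all complete 4-byte big-endian integers in the file, reading it in
     * blocks of at most blockSize bytes. Trailing bytes that do not form a
     * complete int are ignored.
     */
    static long sumInts(Path file, int blockSize) throws IOException {
        if (blockSize < Integer.BYTES) {
            throw new IllegalArgumentException("blockSize must be at least 4 bytes");
        }
        ByteBuffer buf = ByteBuffer.allocateDirect(blockSize); // big-endian by default
        long sum = 0;
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            while (fc.read(buf) >= 0) {
                buf.flip(); // switch from filling the buffer to draining it
                while (buf.remaining() >= Integer.BYTES) {
                    sum += buf.getInt();
                }
                buf.compact(); // keep any partial int at the front for the next read
            }
        }
        return sum;
    }
}
```

A call such as `ChunkedIntSum.sumInts(path, 32 * 1024)` uses the 32 KB block size; timing it with a few other sizes on the real file, read from disk rather than from the cache, would show whether that size still holds up.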

© 2021 The Archive Base. All Rights Reserved