The Archive Base

Editorial Team
Asked: June 13, 2026

When I seek to some position in a file and write a small amount

When I seek to some position in a file and write a small amount of data (20 bytes), what goes on behind the scenes?

My understanding

To my knowledge, the smallest unit of data that can be read from or written to a disk is one sector (traditionally 512 bytes, but that standard is now changing). That means that to write 20 bytes I need to read a whole sector, modify part of it in memory and write it back to disk.
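To make the sector arithmetic concrete, here is a minimal sketch of how one could work out which sectors a small write touches. The `SectorSpan` type and `sector_span` function are illustrative names, not part of any API, and the sector size is passed in rather than assumed.

```c
#include <assert.h>

/* Which sectors a write touches. Illustrative type, not a real API. */
typedef struct {
    long first_sector;    /* index of the first sector touched */
    long last_sector;     /* index of the last sector touched */
    long offset_in_first; /* byte position of the write inside the first sector */
} SectorSpan;

/* Compute the sectors covered by writing `length` bytes at byte `offset`. */
static SectorSpan sector_span(long offset, long length, long sector_size)
{
    assert(offset >= 0 && length > 0 && sector_size > 0);
    SectorSpan span;
    span.first_sector = offset / sector_size;
    span.last_sector = (offset + length - 1) / sector_size;
    span.offset_in_first = offset % sector_size;
    return span;
}
```

With 512-byte sectors, a 20-byte write at offset 10000 falls entirely inside sector 19, starting at byte 272 of that sector, so one sector would need the read-modify-write treatment. A 20-byte write at offset 500 straddles sectors 0 and 1, so two would.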

This is what I expect to be happening in unbuffered I/O. I also expect buffered I/O to do roughly the same thing, but be clever about its cache. So I would have thought that if I blow locality out the window by doing random seeks and writes, both buffered and unbuffered I/O ought to have similar performance… maybe with unbuffered coming out slightly better.

Then again, I know it’s crazy for buffered I/O to only buffer one sector, so I might also expect it to perform terribly.

My application

I am storing values gathered by a SCADA device driver that receives remote telemetry for upwards of a hundred thousand points. There is extra data in the file such that each record is 40 bytes, but only 20 of those bytes need to be written during an update.

Pre-implementation benchmark

To check that I don’t need to dream up some brilliantly over-engineered solution, I have run a test using a few million random records written to a file that could contain a total of 200,000 records. Each test seeds the random number generator with the same value to be fair. First I erase the file and pad it to the total length (about 7.6 meg), then loop a few million times, passing a random file offset and some data to one of two test functions:

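#include <stdint.h>   /* intptr_t */
#include <stdio.h>    /* FILE, fseek, fwrite, fflush */
#include <unistd.h>   /* lseek, write */

/* Unbuffered path: seek and write directly on a raw file descriptor. */
void WriteOldSchool( void *context, long offset, Data *data )
{
    int fd = (int)(intptr_t)context;   /* descriptor passed through the void* */
    lseek( fd, offset, SEEK_SET );
    write( fd, data, sizeof(Data) );
}

/* Buffered path: the same update through stdio, flushed after every record. */
void WriteStandard( void *context, long offset, Data *data )
{
    FILE *fp = (FILE*)context;
    fseek( fp, offset, SEEK_SET );
    fwrite( data, sizeof(Data), 1, fp );
    fflush(fp);
}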
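The driver loop itself is not shown in the question. A minimal sketch of what it might look like follows, assuming a POSIX system. The `Data` layout, `RECORD_SIZE`, `WriteFn`, `write_with_fd` and `run_benchmark` are all hypothetical names, the file is padded with `ftruncate` (which may create a sparse file, unlike padding with real zeros), and timing code is left out.

```c
#define _POSIX_C_SOURCE 200809L   /* expose ftruncate() in <unistd.h> */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 40                     /* full record size from the question */

typedef struct { char bytes[20]; } Data;   /* hypothetical 20-byte payload */

typedef void (*WriteFn)(void *context, long offset, Data *data);

/* Same shape as WriteOldSchool: seek, then write the 20-byte payload. */
static void write_with_fd(void *context, long offset, Data *data)
{
    int fd = (int)(intptr_t)context;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return;
    ssize_t n = write(fd, data, sizeof *data);
    (void)n;   /* a real benchmark should check for short writes */
}

/* Pad `path` to `record_count` records, then issue `iterations` writes at
 * pseudo-random record offsets. Returns the number of writes issued, or -1. */
static long run_benchmark(const char *path, long record_count,
                          long iterations, unsigned seed, WriteFn fn)
{
    assert(record_count > 0 && iterations >= 0);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)record_count * RECORD_SIZE) != 0) {
        close(fd);
        return -1;
    }
    srand(seed);   /* same seed for every run, so each method sees the same offsets */
    Data data = {{0}};
    long done = 0;
    for (; done < iterations; done++) {
        long record = rand() % record_count;   /* assumes RAND_MAX >= record_count */
        fn((void *)(intptr_t)fd, record * RECORD_SIZE, &data);
    }
    close(fd);
    return done;
}
```

A stdio variant would open the file with `fopen` after padding and pass the `FILE *` as the context instead of the descriptor.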

Maybe no surprises?

The OldSchool method came out on top, by a lot. It was over 6 times faster (1.48 million versus 232,000 records per second). To make sure I hadn’t run into hardware caching, I expanded my database size to 20 million records (file size of 763 meg) and got the same results.

Before you point out the obvious call to fflush, let me say that removing it had no effect. I imagine this is because the cache must be committed when I seek sufficiently far away, which is what I’m doing most of the time.

So, what’s going on?

It seems to me that the buffered I/O must be reading (and possibly writing all of) a large chunk of the file whenever I try to write. Because I am hardly ever taking advantage of its cache, this is extremely wasteful.
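One way to probe that hypothesis would be to turn off stdio's own buffer with `setvbuf` and rerun the benchmark. This is only a sketch of the experiment, not something measured in the question. `open_unbuffered` and `write_at` are hypothetical helper names, and whether an unbuffered stream actually avoids the extra read on a given libc is something to confirm with a syscall trace.

```c
#include <assert.h>
#include <stdio.h>

/* Open an existing file for update with stdio buffering disabled. */
static FILE *open_unbuffered(const char *path)
{
    FILE *fp = fopen(path, "r+b");
    if (fp == NULL)
        return NULL;
    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}

/* Seek and write one block; returns 0 on success, -1 on failure. */
static int write_at(FILE *fp, long offset, const void *data, size_t size)
{
    assert(fp != NULL && data != NULL);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fwrite(data, size, 1, fp) == 1 ? 0 : -1;
}
```

If the unbuffered stream performs close to the raw descriptor version, that would point at the stdio buffer refill as the cost rather than anything at the disk level.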

In addition (and I don’t know the details of hardware caching on disk), if the buffered I/O is trying to write a bunch of sectors when I change only one, that would reduce the effectiveness of the hardware cache.

Are there any disk experts out there who can comment and explain this better than my experimental findings? =)

1 Answer

    Editorial Team
    Added an answer on June 13, 2026 at 6:12 pm

    Indeed, at least on my system with GNU libc, it looks like stdio works in 4 kB blocks: it seeks back to the start of the enclosing block and reads up to the target offset before writing the changed portion. Seems bogus to me, but I imagine somebody thought it was a good idea at the time.

    I checked by writing a trivial C program to open a file, write a small amount of data once, and exit; then ran it under strace to see which syscalls it actually triggered. Writing at an offset of 10000, I saw these syscalls:

    lseek(3, 8192, SEEK_SET)                = 8192
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1808) = 1808
    write(3, "hello", 5)                    = 5
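The test program itself is not included in the answer. A minimal reconstruction of what it might have looked like is sketched below, with `write_once` as a hypothetical name. In the trace above, 8192 is the largest multiple of 4096 that does not exceed 10000, and 10000 - 8192 = 1808 is the number of bytes read to fill the buffer up to the write position.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Open an existing file, seek to `offset`, write `text` once, and close.
 * Returns 0 on success, -1 on any failure. */
static int write_once(const char *path, long offset, const char *text)
{
    assert(path != NULL && text != NULL);
    FILE *fp = fopen(path, "r+b");
    if (fp == NULL)
        return -1;
    size_t len = strlen(text);
    int ok = fseek(fp, offset, SEEK_SET) == 0
          && fwrite(text, 1, len, fp) == len;
    if (fclose(fp) != 0)   /* buffered data is flushed here */
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling `write_once(path, 10000, "hello")` from `main` on a file larger than 10000 bytes and running the binary under strace should show a comparable sequence, though the exact calls and sizes will depend on the libc version and its buffer size.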
    

    Seems that you’ll want to stick with the low-level Unix-style I/O for this project, eh?
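If the low-level route is taken, POSIX `pwrite` combines the seek and the write into one system call and leaves the file offset untouched. The sketch below assumes the 20 updated bytes sit at the start of each 40-byte record, which the question does not state. `Data`, `RECORD_SIZE`, `open_store` and `write_record` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L   /* expose pwrite() in <unistd.h> */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 40                     /* full record size from the question */

typedef struct { char bytes[20]; } Data;   /* hypothetical 20-byte payload */

/* Open (or create) the record file once; the descriptor is reused for every update. */
static int open_store(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0644);
}

/* Write the payload of record `index` with a single system call.
 * Returns 0 on success, -1 on error or short write. */
static int write_record(int fd, long index, const Data *data)
{
    assert(index >= 0 && data != NULL);
    off_t offset = (off_t)index * RECORD_SIZE;   /* assumes payload starts the record */
    ssize_t n = pwrite(fd, data, sizeof *data, offset);
    return n == (ssize_t)sizeof *data ? 0 : -1;
}
```

Because `pwrite` takes the offset as an argument, several threads can update different records through the same descriptor without coordinating a shared file position.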
