The Archive Base

Editorial Team
Asked: June 2, 2026 (2026-06-02T11:36:45+00:00)

My requirement is to write a never ending stream of incoming variable sized binary

My requirement is to write a never-ending stream of incoming variable-sized binary messages to the file system. Messages with an average size of 2 KB arrive at 1000 messages/sec, so the total volume in an hour is 3600 * 1000 * 2 KB = 7,200,000 KB, or roughly 6.9 GB (with 1 GB = 1024 * 1024 KB).
The main purposes of the messages are the following:
1. Archive them for auditing purposes
2. Provide a search interface

My questions are:

  1. Is there open source software that solves this problem?
  2. What kind of errors can occur if the process writes in multiples of the block size and crashes in the middle of writing a block?
  3. What kind of errors can occur when the application has written a block, but the file system has not yet flushed the data to the disk?
  4. Can inodes get corrupted in any scenario?
  5. Is there a file size limitation in Linux?
  6. Is there an ideal file size? What are the pros and cons of large files (in GB) vs. medium files (in MB)?
  7. Are there any other things to watch for?
  8. My preference is to use C++, but if needed, I can switch to C.
1 Answer

  1. Editorial Team
    Added an answer on June 2, 2026 at 11:36 am (2026-06-02T11:36:49+00:00)

    Once write or writev returns (i.e. the OS has accepted the data), the operating system is responsible for writing that data to disk. It is no longer your problem, and it happens regardless of whether your process crashes afterwards. Note that you have no control over the exact amount of data accepted or actually written at a time, nor over whether it happens in multiples of filesystem blocks or in any particular size at all. You send a request to write, the call tells you how much it actually accepted, and the OS writes that much to disk at its own discretion.
    This will probably happen in multiples of the block size because it makes sense for the OS to do that, but it is not guaranteed in any way (on many systems, Linux included, reading and writing is implemented via, or tightly coupled with, file mapping).

    The same “don’t have to care” guarantee holds for file mapping (with the theoretical exception that a crashing application could in principle still write into an area that is still mapped; once you have unmapped an area, that cannot happen even theoretically). Unless you pull the plug (or the kernel crashes), data will be written, and consistently.
    With file mapping, data will only ever be written in multiples of filesystem blocks, because memory pages are multiples of device blocks and file mapping knows no other granularity.

    You can get some control over what is on the disk with fdatasync (neglecting any volatile on-disk write cache). When that function returns, what was in the buffers before the call has been sent to the disk.
    However, that still doesn’t prevent your process from crashing in another thread in the meantime, and it doesn’t prevent someone from pulling the plug. fdatasync is usually preferable to fsync because it skips metadata that is not needed to read the data back (timestamps, for example), which makes it cheaper. Note that on Linux a change in file size counts as needed metadata, so appended data is still covered by the sync. What you can lose in a subsequent crash is whatever was written after the last fdatasync returned, but you should never destroy or corrupt the whole file.

    C library functions (fwrite) do their own buffering and give you control over the amount of data you write, but having “written” data only means it is stored in a buffer owned by the C library (in your process). If the process dies, the data is gone. You have no control over how, or whether, the data ever hits the disk. (N.b.: you do have some control insofar as you can call fflush, which immediately passes the contents of the buffers to the underlying write function, most likely write or writev, before returning. With that, you’re back at the first paragraph.)

    Asynchronous IO (kernel aio, which is normally used together with O_DIRECT) will bypass kernel buffers and usually pull the data directly from your process. If your process dies before the request completes, your data is gone. Glibc aio uses threads that block on write, so the same as in the first paragraph applies.

    What happens if you pull the plug or hit the “off” switch at any time? Nobody knows.
    Usually some data will be lost. An operating system can give many guarantees, but it can’t do magic. In theory you might have a system that backs its RAM with a battery, or one with a huge dedicated disk cache that is also battery powered, but nobody can tell in general. In any case, plan for losing data.
    That said, what has once been written should not normally get corrupted if you keep appending to a file (though really anything can happen, and “should not” does not mean a lot).
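
One way to "plan for losing data" with variable-sized messages is to frame each message so that a torn tail can be detected after a crash. The sketch below assumes a framing that is not part of the question or this answer: every record is a 4-byte little-endian length followed by that many payload bytes. The function name valid_prefix_length is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the number of leading bytes that form complete records.
// Anything beyond that offset is an incomplete record left behind by a
// crash or power loss and can be truncated away on startup.
std::size_t valid_prefix_length(const std::vector<std::uint8_t>& bytes) {
    std::size_t pos = 0;
    while (bytes.size() - pos >= 4) {
        // Decode the assumed 4-byte little-endian length prefix.
        std::uint32_t len = static_cast<std::uint32_t>(bytes[pos])
                          | (static_cast<std::uint32_t>(bytes[pos + 1]) << 8)
                          | (static_cast<std::uint32_t>(bytes[pos + 2]) << 16)
                          | (static_cast<std::uint32_t>(bytes[pos + 3]) << 24);
        if (bytes.size() - pos - 4 < len) break;  // payload is cut short
        pos += 4 + static_cast<std::size_t>(len);
    }
    return pos;
}
```

A length prefix alone only catches truncation; if you also want to catch garbage inside a record, a per-record checksum would be the usual addition.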

    All in all, using either write in append mode or file mapping should be good enough; they’re as good as you can get anyway. Other than sudden power loss, they’re reliable and efficient.
    If power failure is an issue, a UPS will give better guarantees than any software solution can provide.

    As for file sizes, I don’t see any reason to artificially limit them (assuming a reasonably new filesystem). The usual file size limits for “standard” Linux filesystems (if there is any such thing) are in the terabyte range.
    Either way, if you feel uneasy with the idea that corrupting one file for whatever reason could destroy 30 days’ worth of data, start a new file once every day. It doesn’t cost extra.
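
Starting a new file every day can be as simple as deriving the file name from the current date and reopening whenever the name changes. A minimal sketch; the function name, the UTC choice, and the "prefix-YYYY-MM-DD.bin" naming scheme are assumptions made for illustration:

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: builds "<prefix>-YYYY-MM-DD.bin" for the UTC day
// that contains `when`.
std::string daily_file_name(const std::string& prefix, std::time_t when) {
    std::tm tm_utc{};
    ::gmtime_r(&when, &tm_utc);  // POSIX, thread-safe variant of gmtime
    char date[16];
    std::strftime(date, sizeof date, "%Y-%m-%d", &tm_utc);
    return prefix + "-" + date + ".bin";
}
```

The writer would call daily_file_name(prefix, std::time(nullptr)) before each batch and switch to a new descriptor when the result differs from the name of the file that is currently open.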
