The Archive Base


Editorial Team
Asked: May 29, 2026 (2026-05-29T04:46:56+00:00)


Background

A lot of work has gone into optimizing database design, especially into finding the most efficient ways to read and write data on disk (both spinning disks and SSDs).

The knowledge that has come out of that work suggests that reading and writing on block boundaries, matching the block size of the filesystem you are running on, is the most efficient approach.

Question

Say I am operating in a relatively low-memory environment and want to use a small 32MB memory-mapped window to read and write the contents of a huge 500GB file.

If I were using Java’s NIO mechanisms, specifically MappedByteBuffer (Java’s memory-mapped file mechanism), would I need to take care to execute READ and WRITE operations on block boundaries (e.g. 4KB), pulling whole blocks into memory before parsing out the data I need? Or can I just issue R/W ops at any location I want and let the operating system, VM paging logic, filesystem and storage firmware optimize the operations and cull the additional block data I didn’t need?

Additional Detail

The reason for the question is that in database design I see an obsessive focus on block optimization, to the point that there doesn’t seem to exist a world where you would ever just read and write data without the concept of a block.

What confuses me is that the filesystem is the one enforcing the block units of operation, so why would my higher-level app need to worry about this? If I want the 17,631 bytes at offset 71, can’t I just grab them and read them in? Or is it really faster for me to figure out that the read operation starts in block 0 and, with 4KB blocks, spans blocks 0 through 4… read all 5 of those blocks into an internal byte[], then cull out the 17,631 bytes I wanted in the first place?
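The block arithmetic for that example can be sketched as below. This is only an illustration: the class and method names are hypothetical, and the 4KB and 8KB block sizes are assumptions, since real filesystems vary. For the 17,631 bytes at offset 71, a 4KB block size gives blocks 0 through 4, while an 8KB block size gives blocks 0 through 2.

```java
// Hypothetical helper: which fixed-size blocks does a byte range touch?
// Block sizes used here (4096, 8192) are assumptions for illustration.
final class BlockSpan {

    /** Index of the block containing the first byte of the range. */
    static long firstBlock(long offset, int blockSize) {
        return offset / blockSize;
    }

    /** Index of the block containing the last byte of the range (length > 0). */
    static long lastBlock(long offset, long length, int blockSize) {
        return (offset + length - 1) / blockSize;
    }

    /** Number of whole blocks that must be read to cover the range. */
    static long blocksTouched(long offset, long length, int blockSize) {
        return lastBlock(offset, length, blockSize) - firstBlock(offset, blockSize) + 1;
    }

    public static void main(String[] args) {
        long offset = 71;
        long length = 17_631;
        for (int blockSize : new int[] {4096, 8192}) {
            System.out.printf("blockSize=%d: blocks %d..%d (%d blocks)%n",
                    blockSize,
                    firstBlock(offset, blockSize),
                    lastBlock(offset, length, blockSize),
                    blocksTouched(offset, length, blockSize));
        }
    }
}
```

Either way, the extra bytes at the start of the first block and the end of the last block are the "slop" that some layer has to read and discard.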

If the literature on DB design weren’t so religious about this block idea, the question would never have come up in my mind, but because it is, I am wondering if I am missing a critical detail here with respect to filesystems and optimal block device I/O.

Thank you for reading.


1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 4:46 am (2026-05-29T04:46:59+00:00)

    I think part of the reason databases are aware of a block size (which may not be exactly the same as the filesystem block size, but should of course align with it) is not just to perform block-aligned I/O, but also to manage how disk data is cached in memory rather than relying solely on OS caching. Some databases, in fact, bypass the OS filesystem cache completely. Having the database manage the cache sometimes allows it to use that cache with more intelligence than the OS might be able to provide.

    An RDBMS will typically take into account the number of blocks that would be read or written during a query in order to compare different execution plans, and the possibility that all the required data can be fetched from the same block is a useful optimisation to take note of.

    Most databases I’m familiar with have the concept of a block cache or buffer in which some portion of the database’s working set lives. A cache made up entirely of arbitrary extents could be quite a bit harder to manage. Also, many databases actually arrange their stored data as a sequence of blocks, so the I/O pattern grows out of that. Of course, this might simply be a legacy of databases originally written for platforms that didn’t have rich OS caching facilities…
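To illustrate why fixed-size blocks make a cache simple to manage, here is a minimal sketch of a block cache with least-recently-used eviction. It is not taken from any particular database: the `BlockCache` class, its method names and the capacity are all hypothetical, and it leans on the access-order mode of `java.util.LinkedHashMap`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a cache keyed by block number, evicting the
// least recently used block once maxBlocks is exceeded.
final class BlockCache {
    private final int maxBlocks;
    private final LinkedHashMap<Long, byte[]> blocks;

    BlockCache(int maxBlocks) {
        this.maxBlocks = maxBlocks;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.blocks = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > BlockCache.this.maxBlocks;
            }
        };
    }

    /** Returns the cached block, or null on a miss; a hit marks it recently used. */
    byte[] get(long blockNo) {
        return blocks.get(blockNo);
    }

    /** Stores a block, possibly evicting the least recently used one. */
    void put(long blockNo, byte[] data) {
        blocks.put(blockNo, data);
    }

    /** Membership check that does not change the eviction order. */
    boolean contains(long blockNo) {
        return blocks.containsKey(blockNo);
    }

    int size() {
        return blocks.size();
    }
}
```

Because every entry has the same size and a simple integer key, capacity accounting is just a count. With arbitrary extents the cache would also have to deal with overlapping ranges and variable-size entries.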

    To conclude this ramble with some sort of answer to your question: my feeling is that reading from arbitrary extents within the mapped file and letting the OS deal with the extra slop should be fine. Performance-wise, it’s probably more important to let the OS do read-ahead, e.g. by using the “advise” calls (such as madvise or posix_fadvise) so the OS can start reading the next extent from disk while you process the current one. You would also want a way to advise the OS to uncache extents you’ve finished with.
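A minimal sketch of reading an arbitrary extent through a mapping in Java follows. The `MappedRangeReader` class and the demo file are hypothetical. `FileChannel.map` accepts a position that is not block-aligned, so no manual alignment is done here. As far as I know the standard Java API has no direct equivalent of the "advise" calls; `MappedByteBuffer.load()` is used below as a best-effort hint to bring the pages into memory, which is an assumption about what fits the advice above rather than an exact match.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: map only the requested extent and copy it out.
final class MappedRangeReader {

    /** Reads length bytes starting at offset; the range must lie within the file. */
    static byte[] read(Path file, long offset, int length) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The position does not need to be a multiple of the block size.
            MappedByteBuffer window = ch.map(FileChannel.MapMode.READ_ONLY, offset, length);
            window.load(); // best-effort request to load the mapped pages
            byte[] out = new byte[length];
            window.get(out);
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        // Small demo file standing in for the huge one in the question.
        Path tmp = Files.createTempFile("mapped-demo", ".bin");
        tmp.toFile().deleteOnExit();
        byte[] data = new byte[20_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 251);
        }
        Files.write(tmp, data);

        byte[] slice = read(tmp, 71, 17_631);
        System.out.println("read " + slice.length + " bytes starting at offset 71");
    }
}
```

For a 500GB file and a 32MB budget, the same idea would apply with a longer-lived window that is remapped as the position of interest moves, rather than mapping each request separately.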

