
The Archive Base

Editorial Team
Asked: June 13, 2026

I’m writing an algorithm in C++ that scans a file with a sliding window,


I’m writing an algorithm in C++ that scans a file with a “sliding window,” meaning it will scan bytes 0 to n, do something, then scan bytes 1 to n+1, do something, and so forth, until the end is reached.

My first algorithm was to read the first n bytes, do something, dump one byte, read a new byte, and repeat. This was very slow (about 100kB/s) because calling ReadFile on the HDD for one byte at a time is inefficient.

My second algorithm involves reading a chunk of the file (perhaps n*1000 bytes, meaning the whole file if it’s not too large) into a buffer and reading individual bytes off the buffer. Now I get about 10MB/s (decent SSD + Core i5, 1.6GHz laptop).

My question: Do you have suggestions for even faster models?

edit: My big buffer (relative to the window size) is implemented as follows:
– for a rolling window of 5kB, the buffer is initialized to 5MB
– read the first 5MB of the file into the buffer
– the window pointer starts at the beginning of the buffer
– upon shifting, the window pointer is incremented
– when the window pointer nears the end of the 5MB buffer (say, at 4.99MB), copy the remaining 0.01MB to the beginning of the buffer, reset the window pointer to the beginning, and read an additional 4.99MB into the buffer.
– repeat
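
The buffering scheme in the list above could be sketched roughly as follows. This is a minimal, portable illustration built on `std::istream` rather than the asker's removed implementation; the names `scan_windows` and `collect_windows` and the visitor callback are hypothetical, introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: calls visit(ptr, window) for every window position in
// the stream, refilling a large buffer in bulk instead of reading one byte at
// a time. Returns the number of windows visited.
template <typename Visitor>
std::size_t scan_windows(std::istream& in, std::size_t window,
                         std::size_t buffer_size, Visitor visit)
{
    if (window == 0 || buffer_size < window)
        return 0;

    std::vector<char> buf(buffer_size);
    std::size_t filled = 0; // number of valid bytes in buf
    std::size_t pos = 0;    // start of the current window within buf
    std::size_t count = 0;

    for (;;)
    {
        // Top up the buffer with one bulk read.
        in.read(buf.data() + filled,
                static_cast<std::streamsize>(buf.size() - filled));
        filled += static_cast<std::size_t>(in.gcount());

        // Visit every window that fits entirely inside the valid bytes.
        while (pos + window <= filled)
        {
            visit(buf.data() + pos, window);
            ++pos;
            ++count;
        }

        if (!in)
            break; // EOF or read error: nothing more to append

        // Carry the unconsumed tail to the front, then refill behind it.
        std::size_t tail = filled - pos;
        assert(tail < window); // at most window - 1 bytes are carried over
        std::memmove(buf.data(), buf.data() + pos, tail);
        filled = tail;
        pos = 0;
    }
    return count;
}

// Hypothetical usage helper: collects every window of an in-memory string.
std::vector<std::string> collect_windows(const std::string& data,
                                         std::size_t window,
                                         std::size_t buffer_size)
{
    std::istringstream in(data);
    std::vector<std::string> out;
    scan_windows(in, window, buffer_size,
                 [&out](const char* p, std::size_t n) { out.emplace_back(p, n); });
    return out;
}
```

With a real file, the stream would be a `std::ifstream` opened in binary mode, and the buffer size would be chosen much larger than the window (5MB for a 5kB window in the description above) so that the tail copy happens rarely.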

edit 2 – the actual implementation (removed)

Thank you all for the many insightful responses. It was hard to select a “best answer”; they were all excellent and helped with my coding.

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 9:57 am

    I use a sliding window in one of my apps (actually, several layers of sliding windows working on top of each other, but that is outside the scope of this discussion). The window uses a memory-mapped file view via CreateFileMapping() and MapViewOfFile(), and I have an abstraction layer on top of that. I ask the abstraction layer for any range of bytes I need, and it ensures that the file mapping and file view are adjusted accordingly so those bytes are in memory. Every time a new range of bytes is requested, the file view is adjusted only if needed.

    The file view is positioned on boundaries that are multiples of the system allocation granularity and sized in multiples of the page size, both as reported by GetSystemInfo(). Just because a scan reaches the end of a given byte range does not necessarily mean it has reached a page boundary yet, so the next scan may not need to alter the file view at all: the next bytes are already in memory. If the first requested byte of a range exceeds the right-hand boundary of a mapped page, the left edge of the file view is adjusted to the left-hand boundary of the requested page and any pages to the left are unmapped. If the last requested byte in the range exceeds the right-hand boundary of the right-most mapped page, a new page is mapped and added to the file view.
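
To make the boundary arithmetic concrete, here is a small portable sketch of how a requested byte range might be turned into an aligned view range. The function name `align_view`, the `ViewRange` struct, and the granularity values used in the note below (65536 and 4096) are assumptions for illustration; on a real system the values come from GetSystemInfo().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: where the view starts in the file and how long it is.
struct ViewRange
{
    std::uint64_t offset;
    std::uint64_t size;
};

// Rounds the start of a requested range down to the allocation granularity,
// rounds the length up to whole pages, and truncates the result at EOF.
// Assumes page_size is a power of two and offset lies inside the file.
ViewRange align_view(std::uint64_t offset, std::uint64_t size,
                     std::uint64_t file_size,
                     std::uint64_t alloc_gran, std::uint64_t page_size)
{
    assert(alloc_gran != 0);
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(offset < file_size);

    // Views must start on a multiple of the allocation granularity.
    std::uint64_t start = (offset / alloc_gran) * alloc_gran;

    // Cover the slack before the requested offset plus the requested bytes,
    // rounded up to the next page boundary.
    std::uint64_t length = (offset - start) + size;
    length = (length + page_size - 1) & ~(page_size - 1);

    // Never extend the view past the end of the file.
    if (start + length > file_size)
        length = file_size - start;

    return { start, length };
}
```

For example, with a 64kB granularity and 4kB pages, a request for 5000 bytes at offset 70000 in a 1000000-byte file yields a view starting at 65536 and spanning 12288 bytes, so a following request that begins inside that span needs no remapping.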

    It sounds more complex than it really is to implement once you get into the coding of it:

    Creating a View Within a File

    It sounds like you are scanning bytes in fixed-sized blocks, so this approach is very fast and very efficient for that. Based on this technique, I can sequentially scan multi-gigabyte files from start to end fairly quickly, usually a minute or less on my slowest machine. If your files are smaller than the system granularity, or even just a few megabytes, you will hardly notice any time elapsed at all (unless your scans themselves are slow).

    Update: here is a simplified variation of what I use:

    #include <windows.h>
    
    class FileView
    {
    private:
        DWORD m_AllocGran;
        DWORD m_PageSize;
    
        HANDLE m_File;
        unsigned __int64 m_FileSize;
    
        HANDLE m_Map;
        unsigned __int64 m_MapSize;
    
        LPBYTE m_View;
        unsigned __int64 m_ViewOffset;
        DWORD m_ViewSize;
    
        void CloseMap()
        {
            CloseView();
    
            if (m_Map != NULL)
            {
                CloseHandle(m_Map);
                m_Map = NULL;
            }
            m_MapSize = 0;
        }
    
        void CloseView()
        {
            if (m_View != NULL)
            {
                UnmapViewOfFile(m_View);
                m_View = NULL;
            }
            m_ViewOffset = 0;
            m_ViewSize = 0;
        }
    
        bool EnsureMap(unsigned __int64 Size)
        {
            // do not exceed EOF or else the file on disk will grow!
            Size = min(Size, m_FileSize);
    
            if ((m_Map == NULL) ||
                (m_MapSize != Size))
            {
                // a new map is needed...
    
                CloseMap();
    
                ULARGE_INTEGER ul;
                ul.QuadPart = Size;
    
                m_Map = CreateFileMapping(m_File, NULL, PAGE_READONLY, ul.HighPart, ul.LowPart, NULL);
                if (m_Map == NULL)
                    return false;
    
                m_MapSize = Size;
            }
    
            return true;
        }
    
        bool EnsureView(unsigned __int64 Offset, DWORD Size)
        {
            if ((m_View == NULL) ||
                (Offset < m_ViewOffset) ||
                ((Offset + Size) > (m_ViewOffset + m_ViewSize)))
            {
                // the requested range is not already in view...
    
                // round down the offset to the nearest allocation boundary
                unsigned __int64 ulNewOffset = ((Offset / m_AllocGran) * m_AllocGran);
    
                // round up the size to the next page boundary
                DWORD dwNewSize = ((((Offset - ulNewOffset) + Size) + (m_PageSize-1)) & ~(m_PageSize-1));
    
                // if the new view will exceed EOF, truncate it
                unsigned __int64 ulOffsetInFile = (ulNewOffset + dwNewSize);
                if (ulOffsetInFile > m_FileSize)
                    dwNewSize -= (DWORD) (ulOffsetInFile - m_FileSize);
    
                if ((m_View == NULL) ||
                    (m_ViewOffset != ulNewOffset) ||
                    (m_ViewSize != dwNewSize))
                {
                    // a new view is needed...
    
                    CloseView();
    
                    // make sure the memory map is large enough to contain the entire view
                    if (!EnsureMap(ulNewOffset + dwNewSize))
                        return false;
    
                    ULARGE_INTEGER ul;
                    ul.QuadPart = ulNewOffset;
    
                    m_View = (LPBYTE) MapViewOfFile(m_Map, FILE_MAP_READ, ul.HighPart, ul.LowPart, dwNewSize);
                    if (m_View == NULL)
                        return false;
    
                    m_ViewOffset = ulNewOffset;
                    m_ViewSize = dwNewSize;
                }
            }
    
            return true;
        }
    
    public:
        FileView() :
            m_AllocGran(0),
            m_PageSize(0),
            m_File(INVALID_HANDLE_VALUE),
            m_FileSize(0),
            m_Map(NULL),
            m_MapSize(0),
            m_View(NULL),
            m_ViewOffset(0),
            m_ViewSize(0)
        {
            // map views need to be positioned on even multiples
            // of the system allocation granularity.  let's size
            // them on even multiples of the system page size...
    
            // GetSystemInfo() has no return value; it always fills in the structure
            SYSTEM_INFO si = {0};
            GetSystemInfo(&si);
            m_AllocGran = si.dwAllocationGranularity;
            m_PageSize = si.dwPageSize;
        }
    
        ~FileView()
        {
            CloseFile();
        }
    
        bool OpenFile(LPCTSTR FileName)
        {
            CloseFile();
    
            if ((m_AllocGran == 0) || (m_PageSize == 0))
                return false;
    
            HANDLE hFile = CreateFile(FileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
            if (hFile == INVALID_HANDLE_VALUE)
                return false;
    
            ULARGE_INTEGER ul;
            ul.LowPart = GetFileSize(hFile, &ul.HighPart);
            if ((ul.LowPart == INVALID_FILE_SIZE) && (GetLastError() != 0))
            {
                CloseHandle(hFile);
                return false;
            }
    
            m_File = hFile;
            m_FileSize = ul.QuadPart;
    
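            return true;
        }
    
        // size of the currently open file, in bytes (0 if no file is open)
        unsigned __int64 FileSize() const
        {
            return m_FileSize;
        }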
    
        void CloseFile()
        {
            CloseMap();
    
            if (m_File != INVALID_HANDLE_VALUE)
            {
                CloseHandle(m_File);
                m_File = INVALID_HANDLE_VALUE;
            }
            m_FileSize = 0;
        }
    
        bool AccessBytes(unsigned __int64 Offset, DWORD Size, LPBYTE *Bytes, DWORD *Available)
        {
            if (Bytes) *Bytes = NULL;
            if (Available) *Available = 0;
    
            if ((m_FileSize != 0) && (Offset < m_FileSize))
            {
                // make sure the requested range is in view
                if (!EnsureView(Offset, Size))
                    return false;
    
                // near EOF, the available bytes may be less than requested
    
                DWORD dwOffsetInView = (DWORD) (Offset - m_ViewOffset);
    
                if (Bytes) *Bytes = &m_View[dwOffsetInView];
                if (Available) *Available = min(m_ViewSize - dwOffsetInView, Size);
            }
    
            return true;
        }
    };
    

    Example usage:

    FileView fv;
    if (fv.OpenFile(TEXT("C:\\path\\file.ext")))
    {
        LPBYTE data;
        DWORD len;
    
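        unsigned __int64 offset = 0, filesize = fv.FileSize();
    
        // hypothetical request size; substitute whatever size your scan needs
        const DWORD chunkSize = 5 * 1024;
    
        while (offset < filesize)
        {
            if (!fv.AccessBytes(offset, chunkSize, &data, &len))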
                break; // error
    
            if (len == 0)
                break; // unexpected EOF
    
            // use data up to len bytes as needed...
    
            offset += len;
        }
    
        fv.CloseFile();
    }
    

    This code is designed to allow random jumping anywhere in the file at any data size. Since you are reading bytes sequentially, some of the logic can be simplified as needed.
