If you call ReadFile once with something like 32 MB as the size, it takes noticeably longer than if you read the equivalent number of bytes with a smaller chunk size, like 32 KB.
Why?
(No, my disk is not busy.)
Edit 1:
Forgot to mention — I’m doing this with FILE_FLAG_NO_BUFFERING!
Edit 2:
Weird…
I don’t have access to my old machine anymore (PATA), but when I tested it there, it took around 2 times as long, sometimes more. On my new machine (SATA), I’m only getting a ~25% difference.
Here’s a piece of code to test:
#include <memory.h>
#include <windows.h>
#include <tchar.h>
#include <stdio.h>
int main()
{
HANDLE hFile = CreateFile(_T("\\\\.\\C:"), GENERIC_READ,
FILE_SHARE_READ | FILE_SHARE_WRITE, NULL,
OPEN_EXISTING, FILE_FLAG_NO_BUFFERING /*(redundant)*/, NULL);
__try
{
const size_t chunkSize = 64 * 1024;
const size_t bufferSize = 32 * 1024 * 1024;
void *pBuffer = malloc(bufferSize);
DWORD start = GetTickCount();
ULONGLONG totalRead = 0;
OVERLAPPED overlapped = { 0 };
DWORD nr = 0;
ReadFile(hFile, pBuffer, bufferSize, &nr, &overlapped);
totalRead += nr;
_tprintf(_T("Large read: %d for %d bytes\n"),
GetTickCount() - start, totalRead);
totalRead = 0;
start = GetTickCount();
overlapped.Offset = 0;
for (size_t j = 0; j < bufferSize / chunkSize; j++)
{
DWORD nr = 0;
ReadFile(hFile, pBuffer, chunkSize, &nr, &overlapped);
totalRead += nr;
overlapped.Offset += chunkSize;
}
_tprintf(_T("Small reads: %d for %d bytes\n"),
GetTickCount() - start, totalRead);
fflush(stdout);
}
__finally { CloseHandle(hFile); }
return 0;
}
Result:
Large read: 1076 for 67108864 bytes
Small reads: 842 for 67108864 bytes
Any ideas?
Your test is including the time it take to read in file metadata, specifically, the mapping of file data to disk. If you close the file handle and re-open it, you should get similar timings for each. I tested this locally to make sure.
The effect is probably more severe with heavy fragmentation, as you have to read in more file to disk mappings.
EDIT: To be clear, I ran this change locally, and saw nearly identical times with large and small reads. Reusing the same file handle, I saw similar timings from the original question.