This question made me search for what else can I get from a file without traversing its contents (means without inputting the contents using ifstream or getc etc).
Other than file size and number of characters, what other information can I gather? I searched fseek, I found I can use SEEK_SET, SEEK_CUR and SEEK_END, which only allow me to find the end of the file, start of the file and current pointer.
In order to make it a question, I specifically want to ask:
- Can occurrences of some character or type of character (newline etc) be counted?
- Can its contents be matched with a certain template?
- Is using these methods faster than reading the file multiple times?
And I am asking about Microsoft Windows, not Linux.
1) No, becuase searching of something in unpredicteble conditions requires thorough examing of contents. Examing is reading. Of course, you may collect some statistics before, but you need to traverse you data not less then once. You can use other applications to do this implicitly, but they also will traverse your file from very begining to the end. You may orginize your file some way to obtain necessary info with minimal amount of read-operations, but its all up to your task, and there is no general approach (Because any generiosuty comes to examing the whole source structure).
2) Also No (see above)
3) Yes. Store as much as possible (or required by task) in memory (that’s called caching). For example, use mapping (See MapViewOfFile for Windows and mmap(2) on *nix systems), this uses some in-system caching mechanism.