I have a Visual Studio 2008 C++ application where I would like to insert a string at an arbitrary point in a file using std::fstream. The file may be as large as 100MB, so I don’t want to read it entirely into memory, modify it, and then rewrite it as a new file.
/// Insert some data into a file at a given offset
/// @param file stream to insert the data
/// @param data string to insert
/// @param offset location within the file to insert the data
void InsertString( std::fstream& file, const std::string& data, size_t offset );
The method I’m considering now is to read the file in reverse, moving each byte from the end out by the length of the data string, and then to insert the new string at the offset.
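A minimal sketch of that idea, moving blocks rather than single bytes. The name InsertStringInPlace, the bool return value, and the blockSize parameter are illustrative assumptions and not part of the signature above; the stream is assumed to be open with std::ios::in | std::ios::out | std::ios::binary.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: insert `data` at `offset` by shifting the tail of the
// file toward the end, one block at a time. Blocks are processed from the
// last one backward, so no byte is overwritten before it has been read.
// Returns false if `offset` lies past the end of the file or any I/O fails.
bool InsertStringInPlace(std::fstream& file, const std::string& data,
                         std::size_t offset, std::size_t blockSize = 64 * 1024)
{
    assert(blockSize > 0);

    file.clear();
    file.seekg(0, std::ios::end);
    const std::streamoff end = file.tellg();
    const std::streamoff start = static_cast<std::streamoff>(offset);
    if (end < 0 || start > end) return false;
    if (data.empty()) return true;

    const std::streamoff shift = static_cast<std::streamoff>(data.size());
    const std::streamoff block = static_cast<std::streamoff>(blockSize);
    std::vector<char> buffer(blockSize);

    // Invariant: every byte originally in [pos, end) has already been moved.
    std::streamoff pos = end;
    while (pos > start) {
        const std::streamoff n = std::min(pos - start, block);
        pos -= n;
        file.seekg(pos);               // a seek is required between writing and reading
        file.read(&buffer[0], static_cast<std::streamsize>(n));
        file.seekp(pos + shift);       // and again between reading and writing
        file.write(&buffer[0], static_cast<std::streamsize>(n));
        if (!file) return false;
    }

    // The gap [start, start + shift) is now free for the new data.
    file.seekp(start);
    file.write(data.data(), static_cast<std::streamsize>(data.size()));
    file.flush();
    return file.good();
}
```

Memory use is bounded by blockSize, but everything after the offset is still read and written once, and a failure partway through leaves the file partially shifted.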
What is the most efficient way of accomplishing this?
You’ve just stated one of the basic motivations for database formats, and a need they fulfill.
Based on that, the solution seems pretty obvious, at least to me: you need to use a database format of some sort, probably along with code that directly supports that format. Nearly any decent db format will support what you’ve said you need, so it’s mostly a matter of deciding which code base provides an interface you like.
Of course, if you need to produce (for example) a normal text file as the result, then this isn’t really a solution. For a case like this, you pretty much need to bite the bullet and live with copying a lot of data around. At least in my experience, OSes are sufficiently oriented toward reading files sequentially that, unless your modification is quite close to the end of the file, you may easily find it’s more efficient to read and write the whole file than to copy just enough to make space for the new data.
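As a rough sketch of the "read and write the whole file" option: stream the original into a second file through a fixed-size buffer, writing the new string when the offset is reached, so the 100MB never has to sit in memory at once. The names InsertStringByCopy and CopyBytes, the use of file paths instead of an open std::fstream, and the 64KB buffer size are all assumptions made for illustration; replacing the original with the new file afterward is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: copy up to `count` bytes from `in` to `out` through
// `buffer`. Returns the number of bytes actually copied, which is smaller
// than `count` if the input ends first.
static std::size_t CopyBytes(std::istream& in, std::ostream& out,
                             std::size_t count, std::vector<char>& buffer)
{
    assert(!buffer.empty());
    std::size_t copied = 0;
    while (copied < count) {
        const std::size_t want = std::min(buffer.size(), count - copied);
        in.read(&buffer[0], static_cast<std::streamsize>(want));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;                 // end of input or read error
        out.write(&buffer[0], got);
        copied += static_cast<std::size_t>(got);
    }
    return copied;
}

// Hypothetical helper: write a copy of `srcPath` to `dstPath` with `data`
// inserted at `offset`. Both files are accessed strictly sequentially.
// Returns false if a file cannot be opened, the source is shorter than
// `offset`, or writing fails.
bool InsertStringByCopy(const std::string& srcPath, const std::string& dstPath,
                        const std::string& data, std::size_t offset)
{
    std::ifstream in(srcPath.c_str(), std::ios::binary);
    std::ofstream out(dstPath.c_str(), std::ios::binary | std::ios::trunc);
    if (!in || !out) return false;

    std::vector<char> buffer(64 * 1024);

    // 1. Everything before the insertion point.
    if (CopyBytes(in, out, offset, buffer) != offset) return false;

    // 2. The new data.
    out.write(data.data(), static_cast<std::streamsize>(data.size()));

    // 3. The rest of the source; the huge count simply means "until EOF".
    CopyBytes(in, out, static_cast<std::size_t>(-1), buffer);

    out.flush();
    return out.good();
}
```

Because the original file is only read, a failure partway through leaves it intact, which the shift-in-place approach cannot guarantee. Whether this is actually faster than shifting only the tail depends on the offset and the platform, so it is worth measuring both.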