I’m trying to get a firm handle on how reliable long-term data storage is in Google Docs/Google Drive. Presumably, when a file is uploaded or automatically synced, the transfer is verified using an md5sum; I see that these are saved as meta-information according to the Google Docs API. And since the file is mirrored to multiple servers, presumably each of these transfers is also verified.
But then the file sits there for years. I don’t change it, so no syncing ever gets triggered. Does Google occasionally verify that the md5sum hasn’t changed, to protect against silent corruption of the file — and repair the file if an inconsistency is found? Or is the md5sum meta-information just a static value representing what the file looked like when first uploaded years ago?
I would not worry about this. We can’t share the specifics, but data hosted on Google is checked for corruption and is also replicated multiple times.
This doesn’t prevent you from uploading corrupted data, though. If data consistency is critical for you, you could compare the read-only MD5 checksum field after the upload against an MD5 computed from your local copy, to make sure that the file you just uploaded to Drive has the correct MD5.
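For what it's worth, the local half of that check could look something like the sketch below. It only computes the MD5 of the file on disk and compares it with a checksum string you pass in. How you obtain that string (for example, by reading the MD5 checksum field from the file's metadata through the API) is left out, and the function names `local_md5` and `matches_remote` are made up for illustration.

```python
import hashlib


def local_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_remote(path, remote_md5):
    """Compare the local digest with a checksum string obtained elsewhere.

    `remote_md5` is assumed to be the hex MD5 reported for the uploaded file;
    fetching it is outside the scope of this sketch.
    """
    return local_md5(path) == remote_md5.strip().lower()
```

If `matches_remote("report.pdf", checksum_from_metadata)` returns `False` right after an upload, the uploaded copy differs from the local one and is worth uploading again. Both names in that call are placeholders.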