In a MySQL MyISAM table, I have a column of type mediumblob in which I store captured images as blob data. Some of the images turn out to be problematic in an interesting way: they are gradually losing data.
Field    Type
--------------------------
image    mediumblob
In my.ini, the maximum packet size is set to max_allowed_packet = 8M.



[Image: example of the problem]
When the C# application fetches the data from the server, images like this lose a random amount of data each time. I found 10-12 bad images like this among 100,000+ stored images.
What could cause this behavior? Does anyone have an idea or a solution for fixing or avoiding this problem?
Update 1:
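Reading bytes from the PictureBox:
byte[] ret = null;
// "using" disposes the stream even if Save throws.
using (MemoryStream ms = new MemoryStream())
{
    // Encode the displayed image as JPEG into the in-memory stream.
    picturebox.Image.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg);
    // Save leaves the position at the end of the stream, so a Read from here
    // would return nothing; ToArray copies the whole buffer regardless of position.
    ret = ms.ToArray();
}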
The byte array is saved into the database as mediumblob data. When retrieving the data from the database, I cast the reader value:
byte[] Data = (byte[])reader["Image"];
The culprit is the MyISAM storage engine.
We used InnoDB to store one million images and ran a stress test, and the results were as expected: each file was either retrieved correctly or not retrieved at all (less than 0.01%), because InnoDB is ACID compliant.
When we switched to MyISAM, the failure rate rose to 20%, with partially lost data just as in your case. The reason is that MyISAM uses table-level locking: the entire table is locked while a write is in progress, and when a timeout occurs it overwrites something, which leads to data loss.
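Whichever engine you use, it helps to be able to tell a truncated blob from a good one. A minimal sketch, assuming you can store the byte length and a SHA-256 hash next to each image at save time: the BlobIntegrity class and the idea of extra length/hash columns are hypothetical, not something from our system or from your code. It also includes a cheap structural check based on the fact that a JPEG stream starts with the bytes FF D8 and normally ends with FF D9.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper for detecting truncated or altered image blobs.
public static class BlobIntegrity
{
    // Hex-encoded SHA-256 of the payload; meant to be stored beside the blob.
    public static string Sha256Hex(byte[] data)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return BitConverter.ToString(sha.ComputeHash(data)).Replace("-", "");
        }
    }

    // Cheap structural check: starts with the SOI marker (FF D8)
    // and ends with the EOI marker (FF D9).
    public static bool LooksLikeCompleteJpeg(byte[] data)
    {
        return data != null
            && data.Length >= 4
            && data[0] == 0xFF && data[1] == 0xD8
            && data[data.Length - 2] == 0xFF && data[data.Length - 1] == 0xD9;
    }

    // True when the bytes read back match the length and hash recorded at save time.
    public static bool Matches(byte[] readBack, long expectedLength, string expectedSha256Hex)
    {
        return readBack != null
            && readBack.LongLength == expectedLength
            && string.Equals(Sha256Hex(readBack), expectedSha256Hex,
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

At save time you would record the array length and BlobIntegrity.Sha256Hex of the array; after reading, pass (byte[])reader["Image"] to BlobIntegrity.Matches together with the recorded values and retry or flag the row when it returns false. This only detects the damage, it does not prevent it.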
We have now moved everything to MS SQL. InnoDB performs well, but it does not release the space of deleted rows back to the file system, so its data files keep growing. MS SQL Express has a limit of 10 GB per database, so we created pages of 4-8 GB each and store the blobs there. We also have our own custom replication that copies the files to three servers across the network with the same configuration.
Storing images as files on disk is bad for many reasons. Everyone keeps saying that file systems are designed for high performance and can store millions of files, but in our experience this is not true: drives slow down once you have more than 100 thousand files, and they perform better with one big file than with 1000 smaller files. We currently store 10 million files, and keeping them in a database makes more sense because the database optimizes queries and caches well. You can read more at http://akashkava.com/blog/127/huge-file-storage-in-database-instead-of-file-system/
This is the exact reason why MongoDB, Hadoop, Azure Blob Store, Haystack and Amazon S3 were invented.