I’m writing an online tax return filing application using MVC3 and EF 4.1. Part of the application requires that the taxpayer be able to upload documents associated with their return. The users will be able to come back days or weeks later and possibly upload additional documents. Prior to finally submitting their return, the user is able to view a list of files that have been uploaded. I’ve written the application to save the uploaded files to a directory defined in the web.config. When I display the review page to the user, I loop through the files in the directory and display them as a list.
I’m now thinking that I should be saving the files to the actual SQL Server as binary data in addition to saving them to the directory. I’m trying to avoid what-if scenarios.
What if:
- A staff member accidentally deletes a file from the directory.
- The file server crashes (other agencies use the same SAN as us).
- A staff member saves other files to the same directory. The taxpayer should not see those.
- Any other scenario occurs that causes us to have to request another copy of a file from a taxpayer (failure is not an option).
I’m concerned that saving to the SQL Server database will have dire consequences that I am not aware of since I’ve not done this before in a production environment.
There’s a really good paper by Microsoft Research called To Blob or Not To Blob.
Their conclusion after a large number of performance tests and analysis is this:
- If your pictures or documents are typically below 256K in size, storing them in a database VARBINARY column is more efficient.
- If your pictures or documents are typically over 1 MB in size, storing them in the filesystem is more efficient (and with SQL Server 2008’s FILESTREAM attribute, they’re still under transactional control and part of the database).
- In between those two, it’s a bit of a toss-up depending on your use.
If you decide to put your pictures into a SQL Server table, I would strongly recommend using a separate table for storing those pictures: do not store the employee photo in the employee table, keep it in a separate table. That way, the Employee table can stay lean and mean and very efficient, assuming you don’t always need to select the employee photo, too, as part of your queries.
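As a rough illustration of the separate-table idea, here is a minimal T-SQL sketch. The table and column names (dbo.Employee, dbo.EmployeePhoto, Photo, and so on) are hypothetical and only mirror the employee example above; adapt them to your own schema.

```sql
-- Narrow table that everyday queries touch (hypothetical columns).
CREATE TABLE dbo.Employee
(
    EmployeeId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    FirstName  NVARCHAR(50) NOT NULL,
    LastName   NVARCHAR(50) NOT NULL
);

-- Separate table that holds only the binary payload, keyed by employee.
CREATE TABLE dbo.EmployeePhoto
(
    EmployeeId INT NOT NULL PRIMARY KEY
        REFERENCES dbo.Employee (EmployeeId),
    Photo      VARBINARY(MAX) NOT NULL
);

-- Listing employees never reads the photo table.
SELECT EmployeeId, FirstName, LastName
FROM dbo.Employee;

-- The photo is fetched only when it is actually needed.
SELECT Photo
FROM dbo.EmployeePhoto
WHERE EmployeeId = 42;  -- example id, purely illustrative
```

For the upload scenario in the question, the same shape would apply: one table with the file name, upload date, and owning return, and a second table (or at least a column you do not select on the review page) for the file bytes.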
For filegroups, check out Files and Filegroup Architecture for an intro. Basically, you would either create your database with a separate filegroup for large data structures right from the beginning, or add an additional filegroup later. Let’s call it “LARGE_DATA”.
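Adding the filegroup to an existing database could look roughly like the sketch below. The database name (TaxReturns), the logical file name, the path, and the sizes are all assumptions made for illustration; only the filegroup name LARGE_DATA comes from the text above.

```sql
-- Assumed database name; replace with your own.
ALTER DATABASE TaxReturns
    ADD FILEGROUP LARGE_DATA;

-- A filegroup needs at least one data file before it can store anything.
-- The logical name, path, and sizes below are placeholders.
ALTER DATABASE TaxReturns
    ADD FILE
    (
        NAME = N'TaxReturns_LargeData',
        FILENAME = N'D:\SqlData\TaxReturns_LargeData.ndf',
        SIZE = 512MB,
        FILEGROWTH = 256MB
    )
    TO FILEGROUP LARGE_DATA;
```

Putting the file on a different drive from the main data file is a common reason for doing this, but whether it helps depends on your storage layout.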
Now, whenever you have a new table to create which needs to store VARCHAR(MAX) or VARBINARY(MAX) columns, you can specify this filegroup for the large data.
Check out the MSDN intro on filegroups, and play around with it!
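One way to direct the large columns to that filegroup is the TEXTIMAGE_ON clause of CREATE TABLE. The sketch below assumes the LARGE_DATA filegroup already exists and that regular row data stays on the PRIMARY filegroup; the table and column names are hypothetical, loosely modelled on the uploaded-documents scenario in the question.

```sql
-- Hypothetical table for uploaded return documents.
CREATE TABLE dbo.ReturnDocument
(
    ReturnDocumentId INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    TaxReturnId      INT            NOT NULL,  -- assumed link to the return
    FileName         NVARCHAR(260)  NOT NULL,
    ContentType      NVARCHAR(100)  NOT NULL,
    UploadedAtUtc    DATETIME2      NOT NULL DEFAULT SYSUTCDATETIME(),
    Content          VARBINARY(MAX) NOT NULL   -- the uploaded file bytes
)
ON [PRIMARY]               -- filegroup for the regular row data (assumed)
TEXTIMAGE_ON LARGE_DATA;   -- filegroup for the large-value columns
```

The review page can then list FileName and UploadedAtUtc for a given TaxReturnId without selecting Content at all.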