Ok so say I have 100 rows to insert and each row has about 150 columns (I know that sounds like a lot of columns, but I need to store this data in a single table). The inserts will occur at random, (ie whenever a set of users decide to upload a file containing the data), about a 20 times a month. However the database will be under continuous load processing other functions of a large enterprise application. The columns are varchars, ints, as well as a variety of other types.
Is the performance gain of wrapping these inserts in a transaction (as opposed to running them one at a time) going to be huge, minimal, or somewhere in between?
Why?
EDIT: This is for Sql Server 2005, but I’d be interested in 2000/2008 if there is something different to be said. Also I should mention that I understand the point about transactions being primarily for data-consistency, but I want to focus on performance effects.
It can be an impact actually. The point of transactions is not about how many you do, it’s about keeping the data update consistent. If you have rows that need to be inserted together and are dependent on each other, those are the records you wrap in a transaction.
Transactions are about keeping your data consistent. This should be the first thing you think about when using transactions. For example, if you have a debit (withdrawl) from your checking account, you want to make sure the credit (deposit) is also done. If either of those don’t succeed, the whole ‘transaction’ should be rolled back. Therefore, both actions MUST be wrapped in a transaction.
When doing batch inserts, break them up in to 3000 or 5000 records and cycle through the set. 3000-5000 has been a sweet number range for me for inserts; don’t go above that unless you’ve tested that the server can handle it. Also, I will put GOs in the batch at about every 3000 or 5000 records for inserts. Updates and deletes I’ll put a GO at about 1000, because they require more resources to commit.
If your doing this from C# code, then in my opinion, you should build a batch import routine instead of doing millions of inserts one at a time through coding.