What is the best way to perform bulk inserts into an MS Access database from .NET? Using ADO.NET, it is taking way over an hour to write out a large dataset.
Note that my original post, before I “refactored” it, had both the question and answer in the question part. I took Igor Turman’s suggestion and rewrote it in two parts: the question above, followed by my answer.
I found that using DAO in a specific manner is roughly 30 times faster than using ADO.NET. I am sharing the code and results in this answer. As background, the test used throughout is to write 100 000 records to a table with 20 columns.
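A summary of the techniques and times, from best to worst:
1. 2.8 seconds: Use DAO, with `DAO.Field` objects to refer to the table columns
2. 2.8 seconds: Write out to a text file, then use Automation to import the text into Access
3. 11 seconds: Use DAO, with the column index to refer to the table columns
4. 17 seconds: Use DAO, referring to the columns by name
5. 79 seconds: Use ADO.NET, generating an INSERT statement for each row
6. 86 seconds: Use ADO.NET, with a data table and a data adapter for “batch” inserts
As background, occasionally I need to perform analysis of reasonably large amounts of data, and I find that Access is the best platform. The analysis involves many queries, and often a lot of VBA code.
For various reasons, I wanted to use C# instead of VBA. The typical way is to use OleDB to connect to Access. I used an `OleDbDataReader` to grab millions of records, and it worked quite well. But when outputting results to a table, it took a long, long time. Over an hour.
First, let’s discuss the two typical ways to write records to Access from C#. Both ways involve OleDB and ADO.NET. The first is to generate INSERT statements one at a time and execute them, which took 79 seconds for the 100 000 records.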
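A minimal sketch of the row-by-row INSERT approach follows. It is an illustration under stated assumptions, not the exact code that was timed: the table name `TEMP`, the class and method names, and the `connectionString` parameter are all placeholders, while the column names `Field1` to `Field20` and the values `i + k` follow the test described in this answer.

```csharp
using System;
using System.Data.OleDb;
using System.Diagnostics;
using System.Linq;
using System.Text;

public static class AdoNetInsertSketch
{
    // Returns "Field1,Field2,...,FieldN" (column naming taken from the test table).
    public static string BuildColumnList(int columnCount)
    {
        return string.Join(",", Enumerable.Range(1, columnCount).Select(n => "Field" + n));
    }

    // Builds one INSERT statement whose values are i, i+1, ..., i+columnCount-1.
    // "TEMP" is an assumed table name.
    public static string BuildInsertSql(string columnList, int i, int columnCount)
    {
        var sql = new StringBuilder("INSERT INTO TEMP (")
            .Append(columnList)
            .Append(") VALUES (");
        for (int k = 0; k < columnCount; k++)
        {
            if (k > 0) sql.Append(",");
            sql.Append(i + k);
        }
        return sql.Append(")").ToString();
    }

    // Executes one INSERT per row and returns the elapsed time in seconds.
    // connectionString is a placeholder for your own OleDB connection string.
    public static double Run(string connectionString, int rowCount = 100000, int columnCount = 20)
    {
        string columnList = BuildColumnList(columnCount);
        var timer = Stopwatch.StartNew();
        using (var conn = new OleDbConnection(connectionString))
        using (var cmd = new OleDbCommand())
        {
            conn.Open();
            cmd.Connection = conn;

            // Start from an empty table so repeated runs are comparable.
            cmd.CommandText = "DELETE FROM TEMP";
            cmd.ExecuteNonQuery();

            for (int i = 0; i < rowCount; i++)
            {
                cmd.CommandText = BuildInsertSql(columnList, i, columnCount);
                cmd.ExecuteNonQuery(); // one round trip per record, which is the slow part
            }
        }
        return timer.Elapsed.TotalSeconds;
    }
}
```

Concatenating values into the SQL text is only reasonable here because every value is an integer generated by the loop; with real data, parameters would be the safer choice.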
Note that I found no method in Access that allows a bulk insert.
I then thought that using a data table with a data adapter might prove useful, especially since I thought that I could do batch inserts using the `UpdateBatchSize` property of a data adapter. However, apparently only SQL Server and Oracle support that, and Access does not. It took the longest time: 86 seconds.
Then I tried non-standard ways. First, I wrote out to a text file, and then used Automation to import it. This was fast (2.8 seconds) and tied for first place. But I consider this fragile for a number of reasons:
- Outputting date fields is tricky. I had to format them specially (`someDate.ToString("yyyy-MM-dd HH:mm")`), and then set up a special “import specification” that encodes this format. The import specification also had to have the “quote” delimiter set correctly. In my test, with only integer fields, there was no need for an import specification.
- Text files are also fragile for “internationalization”, where there may be commas used as decimal separators, different date formats, and possibly Unicode.
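The text file route can be sketched roughly as below. This is an illustration rather than the timed code: the table name `TEMP`, the class and method names, and the file paths are assumptions, and the `ACCESS` alias presumes a project reference to the Microsoft Access interop assembly. Values are written with the invariant culture to sidestep the decimal-separator problem mentioned above.

```csharp
using System.Globalization;
using System.IO;
using System.Linq;
using ACCESS = Microsoft.Office.Interop.Access; // requires a reference to the Access interop assembly

public static class TextFileImportSketch
{
    // Writes a comma-delimited file: a header row with the field names,
    // then one line per record holding i, i+1, ..., i+columnCount-1.
    public static void WriteCsv(string path, int rowCount, int columnCount)
    {
        using (var writer = new StreamWriter(path))
        {
            writer.WriteLine(string.Join(",",
                Enumerable.Range(1, columnCount).Select(n => "Field" + n)));

            for (int i = 0; i < rowCount; i++)
            {
                writer.WriteLine(string.Join(",",
                    Enumerable.Range(0, columnCount)
                              .Select(k => (i + k).ToString(CultureInfo.InvariantCulture))));
            }
        }
    }

    // Drives Access through Automation to import the delimited file.
    // "TEMP" is an assumed table name; no import specification is passed,
    // which is only adequate for simple fields such as integers.
    public static void ImportIntoAccess(string databasePath, string csvPath)
    {
        var app = new ACCESS.Application();
        try
        {
            app.OpenCurrentDatabase(databasePath);
            app.DoCmd.TransferText(
                TransferType: ACCESS.AcTextTransferType.acImportDelim,
                TableName: "TEMP",
                FileName: csvPath,
                HasFieldNames: true);
            app.CloseCurrentDatabase();
        }
        finally
        {
            app.Quit(); // make sure the Access process does not linger
        }
    }
}
```

A call such as `TextFileImportSketch.WriteCsv(csvPath, 100000, 20)` followed by `TextFileImportSketch.ImportIntoAccess(databasePath, csvPath)` would reproduce the shape of the test, with both paths supplied by the caller.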
Notice that the first record contains the field names so that the column order isn’t dependent on the table, and that we used Automation to do the actual import of the text file.
Finally, I tried DAO. Lots of sites out there give huge warnings about using DAO. However, it turns out that it is simply the best way to interact between Access and .NET, especially when you need to write out a large number of records. Also, it gives access to all the properties of a table. I read somewhere that it’s easier to program transactions using DAO than using ADO.NET.
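A sketch of the DAO approach, under the same assumptions as before (table `TEMP`, columns `Field1` to `Field20`, a caller-supplied database path, and a `DAO` alias for the Access DAO interop namespace). The names `rs` and `myFields` match the ones discussed in this answer; everything else is illustrative. The slower alternatives and the optional transaction are left in as commented lines.

```csharp
using System.Diagnostics;
using DAO = Microsoft.Office.Interop.Access.Dao; // requires a reference to the Access DAO interop assembly

public static class DaoInsertSketch
{
    // Appends rowCount records through a DAO recordset and returns elapsed seconds.
    public static double Run(string databasePath, int rowCount = 100000, int columnCount = 20)
    {
        var timer = Stopwatch.StartNew();

        var dbEngine = new DAO.DBEngine();
        DAO.Database db = dbEngine.OpenDatabase(databasePath);

        // Start from an empty table ("TEMP" is an assumed name).
        db.Execute("DELETE FROM TEMP");

        DAO.Recordset rs = db.OpenRecordset("TEMP");

        // Look up each column once and keep the Field reference,
        // so the inner loop does no name or index lookups.
        DAO.Field[] myFields = new DAO.Field[columnCount];
        for (int k = 0; k < columnCount; k++)
        {
            myFields[k] = rs.Fields["Field" + (k + 1).ToString()];
        }

        //dbEngine.BeginTrans();
        for (int i = 0; i < rowCount; i++)
        {
            rs.AddNew();
            for (int k = 0; k < columnCount; k++)
            {
                myFields[k].Value = i + k;
                //rs.Fields[k].Value = i + k;
                //rs.Fields["Field" + (k + 1).ToString()].Value = i + k;
            }
            rs.Update();
        }
        //dbEngine.CommitTrans();

        rs.Close();
        db.Close();

        return timer.Elapsed.TotalSeconds;
    }
}
```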
Notice that there are several lines of code that are commented. They will be explained soon.
In this code, we created `DAO.Field` variables for each column (`myFields[k]`) and then used them. It took 2.8 seconds. Alternatively, one could directly access those fields by name, as in the commented line `rs.Fields["Field" + (k + 1).ToString()].Value = i + k;`, which increased the time to 17 seconds. Wrapping the code in a transaction (see the commented lines) dropped that to 14 seconds. Using an integer index, `rs.Fields[k].Value = i + k;`, dropped that to 11 seconds. Using the `DAO.Field` (`myFields[k]`) and a transaction actually took longer, increasing the time to 3.1 seconds.
Lastly, for completeness, all of this code was in a simple static class.
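A plausible set of `using` directives for such a class is sketched below. The `ACCESS` and `DAO` aliases are assumptions chosen to match the `DAO.Field` usage discussed above, and both interop namespaces require references to the corresponding Office interop assemblies.

```csharp
using System;
using System.Text;
using System.Data;        // DataTable, for the data adapter method
using System.Data.OleDb;  // both ADO.NET methods
using System.IO;          // the text file method
using ACCESS = Microsoft.Office.Interop.Access;   // Automation, for the text file import
using DAO = Microsoft.Office.Interop.Access.Dao;  // the DAO method
```

Only the directives for the method you actually use are needed; the DAO method, for instance, depends on nothing beyond `System` and the `DAO` alias.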