I have an azure app on the cloud with a sql azure database. I have a worker role which needs to do parsing+processing on a file (up to ~30 million rows) so i can’t directly use BCP or SSIS.
I’m currently using SqlBulkCopy, however this seems too slow as I’ve seen load times of up to 4-5 minutes for 400k rows.
I want to run my bulk inserts in parallel; however reading through the articles on importing data in parallel/controlling lock behaviour, it says that SqlBulkCopy requires that the table does not have clustered indexes and a tablelock (BU lock) needs to be specified. However azure tables must have a clustered index…
Is it even possible to use SqlBulkCopy in parallel on the same table in SQL Azure? If not is there another API (that I can use in code) to do this?
I don’t see how you can run any faster than using SqlBulkCopy. On our project we can import 250K rows in about 3 mins, so your rate seems about right.
I don’t think that doing it in parallel would help, even if it was technically possible. We only run 1 import at a time otherwise SQL Azure starts timing out our requests.
In fact sometimes, running a large group-by query at the same time as the import isn’t possible. SQL Azure does a lot of work to ensure quality of service, this includes timing out requests that take too long, take too many resource, etc
So doing several large bulk inserts at the same time will probably cause one to time out.