I’m considering an optimisation in a particularly performance-sensitive part of my code. Its task is to insert statistical data into a table, and that table is read fairly heavily by other programs; otherwise I would consider using SQL bulk inserts and the like.
So my question is…
Is it OK to attempt an insert knowing that it might (though not often) throw a SqlException for a duplicate row?
Is the performance hit of an exception worse than checking each row prior to insertion?
First, my advice is to err on the side of correctness, not speed. Finish your project first; if profiling then shows that you lose significant time checking whether rows exist before inserting them, optimize it at that point.
Second, every major RDBMS has some syntax for inserting a row and skipping it if a duplicate already exists, so this shouldn’t be a problem in the first place. I try to avoid exceptions as part of normal application flow and reserve them for truly exceptional cases. That is, don’t rely on exceptions from the DB to stand in for logic in your code. Maintain as much consistency as you can on your end (in code), and let DB exceptions indicate only genuine bugs.
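To make the “insert and skip duplicates” idea concrete, here’s a minimal sketch using SQLite’s `INSERT OR IGNORE` (chosen because it’s easy to run anywhere). The exact syntax is dialect-specific: since you mention `SqlException` you’re presumably on SQL Server, where the equivalent would be something like a `MERGE` statement, `INSERT ... WHERE NOT EXISTS`, or the `IGNORE_DUP_KEY` index option. The table and data below are made up for illustration.

```python
import sqlite3

# Hypothetical stats table with a primary key that duplicates would violate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (id INTEGER PRIMARY KEY, value REAL)")

rows = [(1, 0.5), (2, 0.7), (1, 0.9)]  # note the duplicate key 1
for row in rows:
    # INSERT OR IGNORE silently skips rows that would violate the primary
    # key, so no exception is raised and no pre-insert existence check runs.
    conn.execute("INSERT OR IGNORE INTO stats (id, value) VALUES (?, ?)", row)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM stats").fetchone()[0]
print(count)  # 2 -- the duplicate row was skipped, not inserted
```

This keeps duplicate handling inside a single statement per row, so the database enforces uniqueness without your code ever seeing an exception.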