I have a database table with a field containing RTF (Rich Text Format). I need to convert the RTF to HTML so that it displays properly later in an HTML editor.
To achieve this, I developed a console application that reads the RTF field of each table entry and converts it to HTML. This step will be done just once (it is a one-time migration), and around 1500 records are affected.
Since the number of records involved is not that high, performance will not be deeply affected. But ignoring the data volume for the moment, I would like to know what the best pattern is for this kind of scenario:
1) Extract the data from the DB
2) Apply the modification to that data
3) Update the corresponding row with the modified values
Considering that I am using LINQ to SQL, is it still acceptable to call SubmitChanges() for each modified record, or would it be better to store the modified records in a data structure (like a Hashtable of ID, modifiedValue) and make a single SubmitChanges() call for all of them?
Performance on DBMS is usually affected by hardware in this order:
On the software side, the bottleneck usually is at least one of these, not necessarily in this order, and effects can range from the first to the last in the above list:
Your algorithm is simple, and – assuming you have a primary key and it is just one table – there is little to gain from indexes or query plan tuning.
You mention this is a one-off thing, so I’d start by putting everything into one transaction.
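As a rough illustration of the "one transaction" idea, here is a minimal sketch of the read, convert, update loop. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in, not LINQ to SQL, and the table name `documents`, the column `body`, and the `convert_rtf_to_html` function are all hypothetical placeholders for your own schema and converter.

```python
import sqlite3


def convert_rtf_to_html(rtf: str) -> str:
    # Hypothetical placeholder: a real migration would call an actual
    # RTF-to-HTML converter here.
    return "<p>" + rtf + "</p>"


def migrate(conn: sqlite3.Connection) -> int:
    # 1) Extract: read every row once.
    rows = conn.execute("SELECT id, body FROM documents").fetchall()

    # 2) Modify: convert in memory, keeping (new_value, id) pairs.
    updates = [(convert_rtf_to_html(body), row_id) for row_id, body in rows]

    # 3) Update: send all changes inside a single transaction.
    # The context manager commits on success and rolls back on an exception.
    with conn:
        conn.executemany("UPDATE documents SET body = ? WHERE id = ?", updates)
    return len(updates)


# Small in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO documents (body) VALUES (?)", [("one",), ("two",)])
conn.commit()

migrated = migrate(conn)
result = conn.execute("SELECT body FROM documents ORDER BY id").fetchall()
```

With around 1500 rows, holding all converted values in memory before the update should not be a problem. In LINQ to SQL the rough equivalent would be to modify all the entities and call SubmitChanges() once, since it runs the pending changes inside a transaction when none is already in scope.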
If your DBMS is Microsoft SQL Server 2005 or later, you could run the whole thing on the server itself using CLR Integration and eliminate hardware boundary number 1.