I have a SQL Server 2008 database containing data from which I need to generate code (actually, it is a SQL script that will populate another database with a different structure, but please don’t be misled by that: this question is basically about generating a big blob of text from the data).
I am concerned about performance. Therefore, generally speaking, would it be more performant to
a) Generate the code in a stored procedure on the SQL Server:
Pro: The data doesn’t have to move over the network, so there are fewer latency issues (although the completed blob of text, which may be larger, will still have to be sent over)
Con: Manipulating the data is cumbersome (cursors) and manipulating strings in T-SQL (I would imagine) is slower than on the web server (.NET)
b) Retrieve the data I need and generate the code on the web server:
Pro: Quicker, more flexible string handling
Con: Bringing all the data back from the SQL box
For the sake of this question, let’s assume a data set of around 100,000 rows.
UPDATE:
I didn’t mention that I am aiming to generate the script from a form submit and send the results straight back to the browser. Therefore, solutions using things like SSIS may be of limited use in this scenario.
Speaking purely from experience, SQL Server performs string manipulation MUCH more slowly than application code does.
I’ve refactored several programs that take data from one source, manipulate it, and put it in another, and the first and biggest performance gains came from moving all string manipulation into code, using DataSets and System.Text.StringBuilder.
I finally found some documentation to back this up: http://msdn.microsoft.com/en-us/library/ms131075.aspx
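As a rough illustration of the buffer-based approach described above, here is a minimal sketch in Python rather than .NET, so it runs without a SQL Server connection. The table name, column names, sample rows, and the `sql_literal` and `build_insert_script` helpers are all hypothetical; in .NET the `io.StringIO` buffer would correspond to a `System.Text.StringBuilder`, and the rows would come from a DataSet or data reader.

```python
import io


def sql_literal(value):
    """Render a value as a SQL literal (simplified; illustrative only)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        # Check bool before int/float, because bool is a subclass of int.
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    # Double any single quotes so the string literal stays well formed.
    return "'" + str(value).replace("'", "''") + "'"


def build_insert_script(table, columns, rows):
    """Append one INSERT statement per row to a single in-memory buffer."""
    buf = io.StringIO()
    col_list = ", ".join(columns)
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        buf.write(f"INSERT INTO {table} ({col_list}) VALUES ({values});\n")
    return buf.getvalue()


# Hypothetical sample data standing in for rows fetched from the source database.
rows = [(1, "O'Brien", None), (2, "Smith", 3.5)]
script = build_insert_script("Customers", ["Id", "Name", "Rating"], rows)
print(script)
```

Writing into one buffer avoids re-copying the whole script each time a statement is appended, which is the same reason a StringBuilder tends to beat repeated string concatenation for around 100,000 rows.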
That said, it might not hurt to try both, benchmark them, and then weigh your options. In addition to performance, consider factors like readability and ease of future maintenance. If the performance difference isn’t that great when benchmarking, those other factors may become more important.
Reading your notes on other answers, it may be that security, rather than performance, should be the deciding factor. In general, it’s a LOT easier to manipulate strings in code and sanitize any potentially untrusted user input to prevent SQL injection, XSS, etc. Escaping strings is possible in pure T-SQL, but in code you can create parameterized queries based on the input, which (according to OWASP) is preferred over escaping strings. That’s much harder to do in pure T-SQL.
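To make the difference between concatenation and parameterized queries concrete, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in for SQL Server. The `users` table and the hostile input are invented for illustration. In .NET, the equivalent would be a `SqlCommand` with entries in its `Parameters` collection.

```python
import sqlite3

# In-memory database with a hypothetical table, used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# Untrusted input crafted to change the meaning of a concatenated query.
hostile = "alice' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so the OR clause is executed.
unsafe_sql = "SELECT name FROM users WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a parameter and treated purely as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows))  # every row matches because of the injected OR clause
print(len(safe_rows))    # no user has that literal name
conn.close()
```

The parameterized version never needs to escape the input at all, because the value is sent separately from the SQL text.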
From OWASP: