There is a stored procedure:
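CREATE PROCEDURE [dbo].[TestProc]
AS
BEGIN
    SET NOCOUNT ON;

    -- Session-local temp table, created in tempdb on every call
    CREATE TABLE #thistable (rid char(32));

    -- Copy every A0RID into the temp table; NOLOCK reads without taking shared locks
    INSERT INTO #thistable (rid)
    SELECT A0RID FROM tblCdbA0 WITH (NOLOCK);
END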
When the procedure is executed alone it takes 400-500 ms, but when 10 threads execute the same procedure in parallel, the first thread finishes in 1300 ms, the last in 6000 ms, and the average is 4800 ms.
As you can see, the procedure takes no locks that would make one thread wait for another to finish. Moreover, the server CPU is loaded at less than 100%, i.e. there are enough resources to execute all of them at the same time. How could that be?
EDIT: Found a good article about concurrent inserts:
Resolving PAGELATCH Contention on Highly Concurrent INSERT Workloads
To start with, CPU is not the only resource in a database. The query you posted inserts into a #temp table in tempdb, which would need resources like:
So if one thread takes 400-500 ms and 10 threads finish in 6000 ms, that doesn't surprise me. You are requesting 10 times more work (i.e. 10 times more IO to write those #temp tables to disk), so 4000-5000 ms is expected. The extra 1000 ms could come from contention (threads competing for the same resource).
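One way to check whether contention really accounts for the extra time is to look at what the sessions are waiting on while the 10 threads are running. The query below is only a sketch: it assumes SQL Server 2005 or later, VIEW SERVER STATE permission, and that tempdb has its usual database id of 2. Run it from a separate connection during the test.

```sql
-- Run from a separate connection while the 10 threads are executing.
-- Lists tasks currently waiting on page latches for tempdb pages.
-- For page latches, resource_description has the form
-- database_id:file_id:page_id, and tempdb normally has database_id 2.
SELECT session_id,
       wait_type,
       wait_duration_ms,
       resource_description
FROM sys.dm_os_waiting_tasks
WHERE wait_type LIKE N'PAGELATCH%'
  AND resource_description LIKE N'2:%'
ORDER BY wait_duration_ms DESC;
```

If this returns rows repeatedly during the test, especially for low page numbers such as 2:1:1, 2:1:2 or 2:1:3 (the PFS, GAM and SGAM allocation pages of the first tempdb file), the threads are probably competing for tempdb allocation structures rather than for CPU.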
Ultimately, you need to measure where the time is spent; see SQL Server 2005 Waits and Queues for a good methodology for analyzing the issue.
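As a rough starting point for that kind of measurement, you can snapshot the cumulative wait statistics before the test and compare them afterwards. This is a minimal sketch, not the full methodology from the paper: it assumes VIEW SERVER STATE permission, the #waits_before table name is made up for illustration, and the numbers are server-wide, so other activity on the instance will show up as well.

```sql
-- Snapshot the cumulative, server-wide wait statistics before the test.
-- #waits_before is a hypothetical name used only for this sketch.
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
INTO #waits_before
FROM sys.dm_os_wait_stats;

-- ... run the 10 concurrent executions of dbo.TestProc from the client,
--     then continue in this same session ...

-- Report the wait types that grew during the test, largest first.
SELECT TOP (10)
       a.wait_type,
       a.waiting_tasks_count - b.waiting_tasks_count AS waits,
       a.wait_time_ms        - b.wait_time_ms        AS wait_ms,
       a.signal_wait_time_ms - b.signal_wait_time_ms AS signal_ms
FROM sys.dm_os_wait_stats AS a
JOIN #waits_before AS b
  ON b.wait_type = a.wait_type
WHERE a.wait_time_ms > b.wait_time_ms
ORDER BY wait_ms DESC;

DROP TABLE #waits_before;
```

The wait types at the top of the result (for example PAGELATCH, PAGEIOLATCH or WRITELOG waits) indicate which resource the threads spent their time waiting for.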