Our primary database server is an 8 core box with 8GB of RAM. The CPU is a Xeon E7330 @ 2.4GHz. It runs Windows Server 2003 R2 (x64 edition) and SQL Server 2005
I wanted to do some testing so I set up SQL Server 2005 on another brand-new server which is an 8 core box with 4 GB of RAM. It has a Xeon X5460 @ 3.16GHz and runs Windows Server 2003 R2 Standard. I Installed SQL Server 2005 out of the box and restored a backup of the primary database on to it, and did an UPDATE STATISTICS on all the tables.
The process I was testing executes the same stored proc many times. I was astounded to find from the profiler that this proc which executes with duration=0 or 1 on the primary server, was consistently executing with durations in excess of 130. This essentially makes the secondary server useless for testing, because it’s just too slow.
No other apps run on either of these two boxes, just SQL server. And unlike the primary database server, the test server only had me accessing it.
I can’t believe the difference in spec between these two machines explains this colossal difference in performance. Can anybody suggest any settings I may need to change?
Updates in answers to questions:
- Second server is 32 bit Windows
- I’m inquiring now about the disk arrays and how comparable they are
- On the primary server, the data and logs are on the same drive (!) and it works fine
-
Looking in task manager on the test server, the CPU is running at like 10%, only one core even showing activity
-
Task manager on the test server (4GB RAM) shows “PF Usage 2.01GB” with SQL Server running. On the primary server (8GB RAM) it shows “PF Usage 6.67GB”. How would I make SQL Server on the test box use more of the RAM? Maybe that would make a difference
Another update:
The primary server has a RAID-5 with 15,000 RPM drives. The test box has a RAID-5 with 10,000 RPM drives.
32 bit OS means 2 GB Virtual Address Space for your processes. Standard edition OS mean no AWE extensions either. So your test machine will be severely RAM deprived compared with the production one. Your buffer pool will suffer from premature eviction of the pages, your execution plans will not have the option to choose hash-joins for a lot of queries and so on and so forth. I doubt this explains the entire difference, I’m sure there must be something more at play. You say only 10 CPU usage during the query, is your MAXDOP setting 1 by any chance on the test server? Have you compared the output of sp_configure on the two machines? (make sure you enable ‘advanced options’ too).
Can you run the same problem query on the two machines, from a SSMS query window, with
SET STATISTICS IO ONandSET STATISTICS TIME ON? Run it 2-3 times on each and write down the results. Does it show the same number of logical reads but vastly different number of physical reads? This would point to the RAM being insufficient to cache the needed pages. IS the number of logical reads very different? It probably means you get a bad execution plan on test.Is the query write intensive by any chance? If so did you pre-grow the test database or is your execution blocked by log growth and database growth events?
There are plenty of places to look at to narrow down the issue, like SQL performance counters,
sys.dm_os_wait_stats, check thesys.dm_exec_requestswait_typeandwait_resource.