We work with some very large databases (300Gb – 1Tb). Tables can contain from


Asked: May 28, 2026


We work with some very large databases (300 GB to 1 TB). Tables can contain from 10M to 5B records. Our data transformations are not very complex, but they involve some WITH (common table expression) and UNPIVOT statements. The problem is that the transaction log file and tempdb grow huge, and eventually the server stops working.

I'm now leaning toward the idea that WITH and even UNPIVOT constructs are expensive in terms of resource usage, and that we should consider some simplifications:

splitting the work into several steps with temp tables instead of using WITH
using UNION instead of UNPIVOT

Does anybody have experience with this?


1 Answer

Editorial Team · Answer 1 · 2026-05-28T02:23:33+00:00

Sincere thanks to everyone. It's now pretty obvious to me that the cause was misuse of UNPIVOT. Indeed, CTEs behave much like inline views, so they don't hurt much unless you use them improperly.

So the root of our problem was that our server (32 GB RAM, 8 CPUs, 2 TB HDD) simply could not cope with the number of records that UNPIVOT produced.

Let's say we have HugeTable with fields (F1, F2, F3, F4, F5, F6) and 1,000,000,000 records.

We use it this way (pseudocode):

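with tmp as (
  -- FieldName is an illustrative column name, not from our real schema
  select FX
  from HugeTable
  unpivot (FX for FieldName in (F1, F2, F3, F4, F5, F6)) as u
)
select * from tmp
join ...
where FX is not null
and ...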

The query plan estimates that our UNPIVOT produces 6,000,000,000 records for the where clause to process. It gets even worse because in reality we join additional tables and apply extra filters, and all of this happens 6 billion times. The transaction log and tempdb stayed rather small, practically untouched. I have found no documentation saying that UNPIVOT and joins (hash joins, to be precise) work in RAM only, but from what we experienced I understand that our SQL Server 2008 R2 Enterprise was simply trying to fit that bulk recordset into RAM, and since we didn't have 1 TB of RAM, the operating system was swapping heavily.

The interesting thing is that the query may start very quickly and process about 1,800,000,000 records in the first 6 hours, but then it effectively hangs: it produces about 100K records per 24 hours, which is not acceptable at all.
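To put the 6 billion figure in perspective, here is a rough back-of-envelope sketch in Python. The record count, column count and RAM size come from the post; the average row width is a purely hypothetical value picked for illustration, so treat the result as an order-of-magnitude guess rather than a measurement.

```python
# Back-of-envelope sizing of the unpivoted intermediate result.
RECORD_COUNT = 1_000_000_000        # rows in HugeTable (from the post)
UNPIVOTED_COLUMNS = 6               # F1..F6 (from the post)
SERVER_RAM_BYTES = 32 * 1024**3     # 32 GB of RAM (from the post)

# ASSUMPTION: hypothetical average width of one unpivoted row, in bytes.
ASSUMED_ROW_BYTES = 16

# One output row per (record, column) pair, as the plan estimate suggests.
estimated_rows = RECORD_COUNT * UNPIVOTED_COLUMNS
estimated_bytes = estimated_rows * ASSUMED_ROW_BYTES
ram_multiple = estimated_bytes / SERVER_RAM_BYTES

print(f"estimated rows: {estimated_rows:,}")
print(f"estimated size: {estimated_bytes / 1024**3:.1f} GiB")
print(f"multiple of server RAM: {ram_multiple:.1f}x")
```

Even with this modest assumed row width, the intermediate set comes out several times larger than the available RAM, before any join columns are added.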

If we turn it into manual UNION ALL like this:

with tmp as (
  select F1 from HugeTable where F1 is not null
  union all select F2 from HugeTable where F2 is not null
  union all select F3 from HugeTable where F3 is not null
  union all select F4 from HugeTable where F4 is not null
  union all select F5 from HugeTable where F5 is not null
  union all select F6 from HugeTable where F6 is not null
)
select ... from tmp
join ...
where ...

the query plan shows that the CTE produces about 2 billion records, so all further joins run against a much smaller recordset than in the first case. This took less than 10 hours to do the job (against days in the first case).

BTW, we use an SSIS/VS2008 environment to run our data loads.
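The effect of filtering NULLs inside each UNION ALL branch can be reproduced on a toy table. The sketch below is only an illustration: it uses Python's sqlite3 module as a stand-in for SQL Server (SQLite has no UNPIVOT, and its optimizer differs), the four sample rows are invented, and "variant A" merely mimics the one-row-per-column behaviour the plan estimate suggested.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HugeTable (F1, F2, F3, F4, F5, F6)")
# Invented sample data: mostly NULLs, as in a sparse wide table.
sample_rows = [
    (1, None, None, None, None, None),
    (None, 2, None, None, None, None),
    (3, None, None, 4, None, None),
    (None, None, None, None, None, None),
]
conn.executemany("INSERT INTO HugeTable VALUES (?, ?, ?, ?, ?, ?)", sample_rows)

columns = [f"F{i}" for i in range(1, 7)]

# Variant A: one row per (record, column); NULLs are filtered afterwards.
unfiltered_sql = " UNION ALL ".join(
    f"SELECT {c} AS FX FROM HugeTable" for c in columns
)
# Variant B: each branch drops its NULLs before the rows are combined.
filtered_sql = " UNION ALL ".join(
    f"SELECT {c} AS FX FROM HugeTable WHERE {c} IS NOT NULL" for c in columns
)

def count_rows(sql):
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

intermediate_a = count_rows(unfiltered_sql)   # rows fed to the later filter
intermediate_b = count_rows(filtered_sql)     # rows fed to the later joins

result_a = sorted(
    r[0] for r in conn.execute(f"SELECT FX FROM ({unfiltered_sql}) WHERE FX IS NOT NULL")
)
result_b = sorted(r[0] for r in conn.execute(f"SELECT FX FROM ({filtered_sql})"))

print("intermediate rows, filter applied late: ", intermediate_a)
print("intermediate rows, filter in each branch:", intermediate_b)
print("same final values:", result_a == result_b)
```

Both variants return the same values, but the second hands far fewer rows to whatever comes next. Whether a given engine pushes the filter down on its own depends on its optimizer, so comparing the actual plans remains the real check.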
