I have a very large web forum application (about 20 million posts since 2001)

Question

Asked: June 10, 2026 (2026-06-10T07:09:01+00:00)

I have a very large web forum application (about 20 million posts since 2001) running from a SQL Server 2012 database. The data files are about 40GB in size.

I added indexes on the appropriate columns of the tables, but this query (which reports the date range of posts in each forum) still takes about 40 minutes to run:

SELECT
    T2.ForumId,
    Forums.Title,
    T2.ForumThreads,
    T2.ForumPosts,
    T2.ForumStart,
    T2.ForumStop
FROM
    Forums
    INNER JOIN (
        -- Per-forum totals, aggregated from the per-thread rows in P2
        SELECT
            Min(ThreadStart) As ForumStart,
            Max(ThreadStop) As ForumStop,
            Count(*) As ForumThreads,
            Sum(ThreadPosts) As ForumPosts,
            Threads.ForumId
        FROM
            Threads
            INNER JOIN (
                -- Per-thread date range and post count
                SELECT
                    Min(Posts.DateTime) As ThreadStart,
                    Max(Posts.DateTime) As ThreadStop,
                    Count(*) As ThreadPosts,
                    Posts.ThreadId
                FROM
                    Posts
                GROUP BY
                    Posts.ThreadId
            ) As P2 ON Threads.ThreadId = P2.ThreadId
        GROUP BY
            Threads.ForumId
    ) AS T2 ON T2.ForumId = Forums.ForumId

How could I speed it up?

UPDATE:

This is the Estimated Execution Plan, read from right to left:

[Path 1]

Clustered Index Scan (Clustered) [Posts].[PK_Posts], Cost: 98%
Hash Match (Partial Aggregate), Cost: 2%
Parallelism (Repartition Streams), Cost: 0%
Hash Match (Aggregate), Cost: 0%
Compute Scalar, Cost: 0%
Bitmap (Bitmap Create), Cost: 0%

[Path 2]

Index Scan (NonClustered) [Threads].[IX_ForumId], Cost: 0%
Parallelism (Repartition Streams), Cost: 0%

[Path 1 and 2 converge into Path 3]

Hash Match (Inner Join), Cost: 0%
Hash Match (Partial Aggregate), Cost: 0%
Parallelism (Repartition Streams), Cost: 0%
Sort, Cost: 0%
Stream Aggregate (Aggregate), Cost: 0%
Compute Scalar, Cost: 0%

[Path 4]

Clustered Index Seek (Clustered) [Forums].[PK_Forums], Cost: 0%

[Path 3 and 4 converge into Path 5]

Nested Loops (Inner Join), Cost: 0%
Parallelism (Gather Streams), Cost: 0%
SELECT, Cost: 0%

1 Answer

Editorial Team · Answer 1 · 2026-06-10T07:09:03+00:00

I added more indexes to the database, and that sped things up considerably: execution time is now about 20 seconds (!!). I’ll admit that many of the added indexes were guesswork (or just added at random).
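
For anyone who wants something less random: the estimated plan in the question puts 98% of the cost on the clustered index scan of Posts, so one plausible candidate is a narrow nonclustered index that covers just the two columns the innermost subquery reads. This is only a sketch under that assumption, not the set of indexes that was actually added here; the index name IX_Posts_ThreadId_DateTime is made up, and the column names are taken from the query in the question.

```sql
-- Hypothetical index (the name is an assumption). It covers the innermost
-- subquery, which only needs ThreadId and DateTime from Posts, so SQL Server
-- can scan this narrow index instead of the full clustered index.
CREATE NONCLUSTERED INDEX IX_Posts_ThreadId_DateTime
    ON Posts (ThreadId, [DateTime]);

-- Report logical reads and elapsed time, to compare before and after.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The per-thread aggregate from the original query, run on its own.
SELECT
    Posts.ThreadId,
    Min(Posts.[DateTime]) As ThreadStart,
    Max(Posts.[DateTime]) As ThreadStop,
    Count(*) As ThreadPosts
FROM
    Posts
GROUP BY
    Posts.ThreadId;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the aggregate once before and once after creating the index, and comparing the logical reads in the Messages tab, should show whether this particular index helps or whether the gain came from somewhere else.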
