Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7015647
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 27, 20262026-05-27T22:40:47+00:00 2026-05-27T22:40:47+00:00

I have a table that holds phone calls, with the following fields: ID STARTTIME

  • 0

I have a table that holds phone calls, with the following fields:

  • ID
  • STARTTIME
  • ENDTIME
  • STATUS
  • CALL_FROM
  • CALL_TO

There are 2,9 million records loaded into a local PostgreSQL database. I added indexes on ID (unique index), starttime and endtime.

Searching on stackoverflow, I found some useful SQL and modified it to what I think logically should work. The problem is that the query runs for many hours and never returns:

SELECT T1.sid, count(*) as CountSimultaneous
FROM calls_nov T1, calls_nov T2
WHERE
     T1.StartTime between T2.StartTime and T2.EndTime
     and T1.StartTime between '2011-11-02' and '2011-11-03'
GROUP BY
     T1.sid
ORDER BY CountSimultaneous DESC;

Can someone please either suggest a way to fix the query/index so that it actually works or suggest another way to calculate concurrent calls?

EDIT:

Explain plan:

Sort  (cost=11796758237.81..11796758679.47 rows=176663 width=35)
  Sort Key: (count(*))
  ->  GroupAggregate  (cost=0.00..11796738007.56 rows=176663 width=35)
        ->  Nested Loop  (cost=0.00..11511290152.45 rows=57089217697 width=35)

Table creation script:

CREATE TABLE calls_nov (
  sid varchar,
  starttime timestamp, 
  endtime timestamp, 
  call_to varchar, 
  call_from varchar, 
  status varchar);

Index creation:

CREATE UNIQUE INDEX sid_unique_index on calls_nov (sid);

CREATE INDEX starttime_index on calls_nov (starttime);

CREATE INDEX endtime_index on calls_nov (endtime);
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-27T22:40:47+00:00Added an answer on May 27, 2026 at 10:40 pm

    1.) Your query did not catch all overlaps – this was fixed by the other answers, already.

    2.) The data type of your columns starttime and endtime is timestamp. So your WHERE clause is slightly wrong, too:

    BETWEEN '2011-11-02' AND '2011-11-03'
    

    This would include ‘2011-11-03 00:00’. The upper border has to be excluded.

    3.) Removed the mixed case syntax without double-quotes. Unquoted identifiers are cast to lower case automatically. To put it simple: Best don’t use mixed case identifiers at all in PostgreSQL.

    4.) Transformed the query to use explicit JOIN which is always preferable. Actually, I made it a LEFT [OUTER] JOIN, because I want to count calls that overlap with no other calls, too.

    5.) Simplified the syntax a bit to arrive at this base query:

    SELECT t1.sid, count(*) AS ct
    FROM   calls_nov t1
    LEFT   JOIN calls_nov t2 ON t1.starttime <= t2.endtime
                            AND t1.endtime >= t2.starttime
    WHERE  t1.starttime >= '2011-11-02 0:0'::timestamp
    AND    t1.starttime <  '2011-11-03 0:0'::timestamp
    GROUP  BY 1
    ORDER  BY 2 DESC;
    

    This query is extremely slow for a big table, because every row starting on ‘2011-11-02’ has to be compared to every row in the whole table, which leads to (almost) O(n²) cost.


    Faster

    We can drastically cut down the cost by pre-selecting possible candidates. Only select columns and rows you need. I do this with two CTE.

    1. Select calls starting on the day in question. -> CTE x
    2. Calculate the latest end of those calls. (subquery in CTE y)
    3. Select only calls that overlap with the total range of CTE x. -> CTE y
    4. The final query is much faster than querying the huge underlying table.
    WITH x AS (
        SELECT sid, starttime, endtime
        FROM   calls_nov
        WHERE  starttime >= '2011-11-02 0:0'
        AND    starttime <  '2011-11-03 0:0'
        ), y AS (
        SELECT starttime, endtime
        FROM   calls_nov
        WHERE  endtime >= '2011-11-02 0:0'
        AND    starttime <= (SELECT max(endtime) As max_endtime FROM x)
        )
    SELECT x.sid, count(*) AS count_overlaps
    FROM   x
    LEFT   JOIN y ON x.starttime <= y.endtime
                 AND x.endtime >= y.starttime
    GROUP  BY 1
    ORDER  BY 2 DESC;
    

    Faster yet

    I have a real life table of 350.000 rows with overlapping start / end timestamps similar to yours. I used that for a quick benchmark. PostgreSQL 8.4, scarce resources because it is a test DB. Indexes on start and end. (Index on ID column is irrelevant here.) Tested with EXPLAIN ANALYZE, best of 5.

    Total runtime: 476994.774 ms

    CTE variant:
    Total runtime: 4199.788 ms — that’s > factor 100.

    After adding a multicolumn index of the form:

    CREATE INDEX start_end_index on calls_nov (starttime, endtime);
    

    Total runtime: 4159.367 ms


    Ultimate Speed

    If that is not enough, there is a way to speed it up yet another order of magnitude. Instead of the CTEs above, materialize the temp tables and – this is the crucial point – create an index on the second one. Could look like this:

    Execute as one transaction:

    CREATE TEMP TABLE x ON COMMIT DROP AS   
        SELECT sid, starttime, endtime
        FROM   calls_nov
        WHERE  starttime >= '2011-11-02 0:0'
        AND    starttime <  '2011-11-03 0:0';
    
    CREATE TEMP TABLE y ON COMMIT DROP AS
        SELECT starttime, endtime
        FROM   calls_nov
        WHERE  endtime >= '2011-11-02 0:0'
        AND    starttime <= (SELECT max(endtime) FROM x);
    
    CREATE INDEX y_idx ON y (starttime, endtime); -- this is where the magic happens
    
    SELECT x.sid, count(*) AS ct
    FROM   x
    LEFT   JOIN y ON x.starttime <= y.endtime
                 AND x.endtime >= y.starttime
    GROUP  BY 1
    ORDER  BY 2 DESC;
    

    Read about temporary tables in the manual.


    Ultimate solution

    • Create a plpgsql function that encapsulates the magic.

    • Diagnose the typical size of your temp tables. Create them standalone and measure:

        SELECT pg_size_pretty(pg_total_relation_size('tmp_tbl'));
      
    • If they are bigger than your setting for temp_buffers then temporarily set them high enough in your function to hold both your temporary tables in RAM. It is a major speedup if you don’t have to swap to disc. (Must be first use of temp tables in session to have effect.)

    CREATE OR REPLACE FUNCTION f_call_overlaps(date)
      RETURNS TABLE (sid varchar, ct integer) AS
    $BODY$
    DECLARE
        _from timestamp := $1::timestamp;
        _to   timestamp := ($1 +1)::timestamp;
    BEGIN
    
    SET temp_buffers = 64MB'; -- example value; more RAM for temp tables;
    
    CREATE TEMP TABLE x ON COMMIT DROP AS   
        SELECT c.sid, starttime, endtime  -- avoid naming conflict with OUT param
        FROM   calls_nov c
        WHERE  starttime >= _from
        AND    starttime <  _to;
    
    CREATE TEMP TABLE y ON COMMIT DROP AS
        SELECT starttime, endtime
        FROM   calls_nov
        WHERE  endtime >= _from
        AND    starttime <= (SELECT max(endtime) FROM x);
    
    CREATE INDEX y_idx ON y (starttime, endtime);
    
    RETURN QUERY
    SELECT x.sid, count(*)::int -- AS ct
    FROM   x
    LEFT   JOIN y ON x.starttime <= y.endtime AND x.endtime >= y.starttime
    GROUP  BY 1
    ORDER  BY 2 DESC;
    
    END;
    $BODY$   LANGUAGE plpgsql;
    

    Call:

    SELECT * FROM f_call_overlaps('2011-11-02') -- just name your date
    

    Total runtime: 138.169 ms — that’s factor 3000


    What else can you do to speed it up?

    General performance optimization.

    CLUSTER calls_nov USING starttime_index; -- this also vacuums the table fully
    
    ANALYZE calls_nov;
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a table that holds availability status for workers. Here is the structure:
I have a table that holds username, updated date, and status as below: name
I have a table that holds the card, which is described by the following
I have oracle table that holds more than 30 million records, I need to
I have a table that holds rows representing votes. The fields are an id
For the question, Let's say I have a table that holds the following data
Given that I have a table that holds vehicle information and one of those
i have a table that holds my users information like username , password ,
I have a MySQL table that holds many entries with repeated IDs (for various
Hi I have a datatables based table that holds data. I need to be

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.