Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 810297
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 15, 20262026-05-15T00:51:40+00:00 2026-05-15T00:51:40+00:00

A query (see below) that extracts climate data from weather stations within a given

  • 0

A query (see below) that extracts climate data from weather stations within a given radius of a city using the dates for which those weather stations actually have data. The query uses the table’s only index, rather effectively:

CREATE UNIQUE INDEX measurement_001_stc_idx
  ON climate.measurement_001
  USING btree
  (station_id, taken, category_id);

Reducing the server’s configuration value for random_page_cost from 2.0 to 1.1 had a massive performance improvement for the given range (nearly an order of magnitude) because it suggested to PostgreSQL that it should use the index. While the results now return in 5 seconds (down from ~85 seconds), problematic lines remain. Bumping the query’s end date by a single year causes a full table scan:

sc.taken_start >= '1900-01-01'::date AND
sc.taken_end <= '1997-12-31'::date AND

How do I persuade PostgreSQL to use the indexes regardless of years between the two dates? (A full table scan against 43 million rows is probably not the best plan.) Find the EXPLAIN ANALYSE results below the query.

Thank you!

Query

  SELECT
    extract(YEAR FROM m.taken) AS year,
    avg(m.amount) AS amount
  FROM
    climate.city c,
    climate.station s,
    climate.station_category sc,
    climate.measurement m
  WHERE
    c.id = 5182 AND
    earth_distance(
      ll_to_earth(c.latitude_decimal,c.longitude_decimal),
      ll_to_earth(s.latitude_decimal,s.longitude_decimal)) / 1000 <= 30 AND
    s.elevation BETWEEN 0 AND 3000 AND
    s.applicable = TRUE AND
    sc.station_id = s.id AND
    sc.category_id = 1 AND
    sc.taken_start >= '1900-01-01'::date AND
    sc.taken_end <= '1996-12-31'::date AND
    m.station_id = s.id AND
    m.taken BETWEEN sc.taken_start AND sc.taken_end AND
    m.category_id = sc.category_id
  GROUP BY
    extract(YEAR FROM m.taken)
  ORDER BY
    extract(YEAR FROM m.taken)

1900 to 1996: Index

"Sort  (cost=1348597.71..1348598.21 rows=200 width=12) (actual time=2268.929..2268.935 rows=92 loops=1)"
"  Sort Key: (date_part('year'::text, (m.taken)::timestamp without time zone))"
"  Sort Method:  quicksort  Memory: 32kB"
"  ->  HashAggregate  (cost=1348586.56..1348590.06 rows=200 width=12) (actual time=2268.829..2268.886 rows=92 loops=1)"
"        ->  Nested Loop  (cost=0.00..1344864.01 rows=744510 width=12) (actual time=0.807..2084.206 rows=134893 loops=1)"
"              Join Filter: ((m.taken >= sc.taken_start) AND (m.taken <= sc.taken_end) AND (sc.station_id = m.station_id))"
"              ->  Nested Loop  (cost=0.00..12755.07 rows=1220 width=18) (actual time=0.502..521.937 rows=23 loops=1)"
"                    Join Filter: ((sec_to_gc(cube_distance((ll_to_earth((c.latitude_decimal)::double precision, (c.longitude_decimal)::double precision))::cube, (ll_to_earth((s.latitude_decimal)::double precision, (s.longitude_decimal)::double precision))::cube)) / 1000::double precision) <= 30::double precision)"
"                    ->  Index Scan using city_pkey1 on city c  (cost=0.00..2.47 rows=1 width=16) (actual time=0.014..0.015 rows=1 loops=1)"
"                          Index Cond: (id = 5182)"
"                    ->  Nested Loop  (cost=0.00..9907.73 rows=3659 width=34) (actual time=0.014..28.937 rows=3458 loops=1)"
"                          ->  Seq Scan on station_category sc  (cost=0.00..970.20 rows=3659 width=14) (actual time=0.008..10.947 rows=3458 loops=1)"
"                                Filter: ((taken_start >= '1900-01-01'::date) AND (taken_end <= '1996-12-31'::date) AND (category_id = 1))"
"                          ->  Index Scan using station_pkey1 on station s  (cost=0.00..2.43 rows=1 width=20) (actual time=0.004..0.004 rows=1 loops=3458)"
"                                Index Cond: (s.id = sc.station_id)"
"                                Filter: (s.applicable AND (s.elevation >= 0) AND (s.elevation <= 3000))"
"              ->  Append  (cost=0.00..1072.27 rows=947 width=18) (actual time=6.996..63.199 rows=5865 loops=23)"
"                    ->  Seq Scan on measurement m  (cost=0.00..25.00 rows=6 width=22) (actual time=0.000..0.000 rows=0 loops=23)"
"                          Filter: (m.category_id = 1)"
"                    ->  Bitmap Heap Scan on measurement_001 m  (cost=20.79..1047.27 rows=941 width=18) (actual time=6.995..62.390 rows=5865 loops=23)"
"                          Recheck Cond: ((m.station_id = sc.station_id) AND (m.taken >= sc.taken_start) AND (m.taken <= sc.taken_end) AND (m.category_id = 1))"
"                          ->  Bitmap Index Scan on measurement_001_stc_idx  (cost=0.00..20.55 rows=941 width=0) (actual time=5.775..5.775 rows=5865 loops=23)"
"                                Index Cond: ((m.station_id = sc.station_id) AND (m.taken >= sc.taken_start) AND (m.taken <= sc.taken_end) AND (m.category_id = 1))"
"Total runtime: 2269.264 ms"

1900 to 1997: Full Table Scan

"Sort  (cost=1370192.26..1370192.76 rows=200 width=12) (actual time=86165.797..86165.809 rows=94 loops=1)"
"  Sort Key: (date_part('year'::text, (m.taken)::timestamp without time zone))"
"  Sort Method:  quicksort  Memory: 32kB"
"  ->  HashAggregate  (cost=1370181.12..1370184.62 rows=200 width=12) (actual time=86165.654..86165.736 rows=94 loops=1)"
"        ->  Hash Join  (cost=4293.60..1366355.81 rows=765061 width=12) (actual time=534.786..85920.007 rows=139721 loops=1)"
"              Hash Cond: (m.station_id = sc.station_id)"
"              Join Filter: ((m.taken >= sc.taken_start) AND (m.taken <= sc.taken_end))"
"              ->  Append  (cost=0.00..867005.80 rows=43670150 width=18) (actual time=0.009..79202.329 rows=43670079 loops=1)"
"                    ->  Seq Scan on measurement m  (cost=0.00..25.00 rows=6 width=22) (actual time=0.001..0.001 rows=0 loops=1)"
"                          Filter: (category_id = 1)"
"                    ->  Seq Scan on measurement_001 m  (cost=0.00..866980.80 rows=43670144 width=18) (actual time=0.008..73312.008 rows=43670079 loops=1)"
"                          Filter: (category_id = 1)"
"              ->  Hash  (cost=4277.93..4277.93 rows=1253 width=18) (actual time=534.704..534.704 rows=25 loops=1)"
"                    ->  Nested Loop  (cost=847.87..4277.93 rows=1253 width=18) (actual time=415.837..534.682 rows=25 loops=1)"
"                          Join Filter: ((sec_to_gc(cube_distance((ll_to_earth((c.latitude_decimal)::double precision, (c.longitude_decimal)::double precision))::cube, (ll_to_earth((s.latitude_decimal)::double precision, (s.longitude_decimal)::double precision))::cube)) / 1000::double precision) <= 30::double precision)"
"                          ->  Index Scan using city_pkey1 on city c  (cost=0.00..2.47 rows=1 width=16) (actual time=0.012..0.014 rows=1 loops=1)"
"                                Index Cond: (id = 5182)"
"                          ->  Hash Join  (cost=847.87..1352.07 rows=3760 width=34) (actual time=6.427..35.107 rows=3552 loops=1)"
"                                Hash Cond: (s.id = sc.station_id)"
"                                ->  Seq Scan on station s  (cost=0.00..367.25 rows=7948 width=20) (actual time=0.004..23.529 rows=7949 loops=1)"
"                                      Filter: (applicable AND (elevation >= 0) AND (elevation <= 3000))"
"                                ->  Hash  (cost=800.87..800.87 rows=3760 width=14) (actual time=6.416..6.416 rows=3552 loops=1)"
"                                      ->  Bitmap Heap Scan on station_category sc  (cost=430.29..800.87 rows=3760 width=14) (actual time=2.316..5.353 rows=3552 loops=1)"
"                                            Recheck Cond: (category_id = 1)"
"                                            Filter: ((taken_start >= '1900-01-01'::date) AND (taken_end <= '1997-12-31'::date))"
"                                            ->  Bitmap Index Scan on station_category_station_category_idx  (cost=0.00..429.35 rows=6376 width=0) (actual time=2.268..2.268 rows=6339 loops=1)"
"                                                  Index Cond: (category_id = 1)"
"Total runtime: 86165.936 ms"
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-15T00:51:41+00:00Added an answer on May 15, 2026 at 12:51 am

    The problem was that the station ID was not sequentially distributed in the measurement tables. The solution:

    CREATE UNIQUE INDEX measurement_001_stc_index
      ON climate.measurement_001
      USING btree
      (station_id, taken, category_id);
    ALTER TABLE climate.measurement_001 CLUSTER ON measurement_001_stc_index;
    

    By forcing a CLUSTER on the columns, the station IDs were aligned physically on disk with the table’s natural order. This gave a performance increase of an order of magnitude.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a SELECT statement (see below) that gets a few items from the
How to see this format hh:mm from hh:mm:ss in SQL Server 2008 query ?
I am trying to concatenate results of some string data in a query using
I am using Symfony 2.0.10 with Doctrine 2.1 and have rather simple query (see
I suppose that I work with these MySQL control structures (see below) in absolutely
The following query returns data right away: SELECT time, value from data order by
I have a Pl/SQL query that pull all records from the 3 tables. That
Say I have a query like the one below: SELECT t1.id, t1.Name FROM Table1
My main issue is the number of times I query the database (see below).
I have problem with session and query execution,please see code below. class A implements

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.