The Archive Base

Editorial Team
Asked: May 25, 2026

This recent question had me thinking about optimizing a category filter.

Suppose we wish to create a database referencing a huge number of audio tracks, with their release date and a list of world locations from which the audio track is downloadable.

The requests we wish to optimize for are:

  • Give me the 10 most recent tracks downloadable from location A.
  • Give me the 10 most recent tracks downloadable from locations A or B.
  • Give me the 10 most recent tracks downloadable from locations A and B.

How would one go about structuring that database? I have a hard time coming up with a simple solution that doesn’t require reading through all the tracks for at least one location…

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 11:58 am

    To optimise these queries, you need to slightly de-normalise the data.

    For example, you may have a track table that contains the track’s id, name and release date, and a map_location_to_track table that describes where those tracks can be downloaded from. To answer “10 most recent tracks for location A” you need to get ALL of the tracks for location A from map_location_to_track, then join them to the track table to order them by release date, and pick the top 10.

    If instead all the data is in a single table, the ordering step can be avoided. For example…

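    CREATE TABLE map_location_to_track (
      location_id   INT,
      track_id      INT,
      release_date  DATETIME,
      -- location first, then date: each location's rows are indexed in release-date order
      PRIMARY KEY (location_id, release_date, track_id)
    );
    
    -- A stands for the id of the location being queried
    SELECT * FROM map_location_to_track
    WHERE location_id = A
    ORDER BY release_date DESC LIMIT 10;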
    

    Having location_id as the first column of the primary key means the WHERE clause is simply an index seek. There is then no need to re-order the data: within a location it is already ordered by release_date in the primary key, so the database can just pick the 10 records at the end.
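
    One way to check this claim is to run the query against a small test database and look at the query plan. The sketch below uses Python's built-in sqlite3 module purely for convenience. It assumes SQLite, where WITHOUT ROWID is what makes the table be stored in primary-key order; the sample data and variable names are made up, and other RDBMS will report their plans differently.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")

# WITHOUT ROWID makes SQLite store the rows in primary-key order,
# which is the closest SQLite equivalent of a clustered primary key.
conn.execute("""
    CREATE TABLE map_location_to_track (
        location_id  INT,
        track_id     INT,
        release_date TEXT,
        PRIMARY KEY (location_id, release_date, track_id)
    ) WITHOUT ROWID
""")

# Invented sample data: 50 tracks released on consecutive days.
# All of them are in location 1; the even-numbered ones are also in location 2.
rows = []
for track_id in range(1, 51):
    released = (date(2024, 1, 1) + timedelta(days=track_id)).isoformat()
    rows.append((1, track_id, released))
    if track_id % 2 == 0:
        rows.append((2, track_id, released))
conn.executemany(
    "INSERT INTO map_location_to_track (location_id, track_id, release_date) "
    "VALUES (?, ?, ?)",
    rows,
)

QUERY = """
    SELECT track_id, release_date FROM map_location_to_track
    WHERE location_id = ?
    ORDER BY release_date DESC LIMIT 10
"""

latest = conn.execute(QUERY, (1,)).fetchall()

# The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
plan = " | ".join(
    row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,))
)

print([track_id for track_id, _ in latest])
print(plan)
```

    If the printed plan mentions a temporary B-tree for the ORDER BY, the rows are being sorted after the fact; with the primary key above, SQLite should be able to read them straight off the index instead.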

    You may indeed still join on to the track table to get the name, price, etc, but you now only have to do that for 10 records, not everything at that location.

    To solve the same query for “locations A OR B”, there are a couple of options that can perform differently depending on the RDBMS you are using.

    The first is simple, though some RDBMS don’t play nice with IN…

    SELECT track_id, release_date FROM map_location_to_track
    WHERE location_id IN (A, B)
    GROUP BY track_id, release_date
    ORDER BY release_date DESC LIMIT 10
    

    The next option is nearly identical, but still some RDBMS don’t play nice with OR logic being applied to INDEXes.

    SELECT track_id, release_date FROM map_location_to_track
    WHERE location_id = A OR location_id = B
    GROUP BY track_id, release_date
    ORDER BY release_date DESC LIMIT 10
    

    In either case, the algorithm used to reduce the list of records down to 10 is hidden from you. It’s a matter of trying it and seeing; the index is still available, so this CAN be performant.

    An alternative is to explicitly determine part of the approach in your SQL statement…

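    SELECT
      *
    FROM
    (
      -- Each branch is parenthesised so that its ORDER BY / LIMIT applies to that
      -- branch alone; MySQL and PostgreSQL, for example, require this
      (
        SELECT track_id, release_date FROM map_location_to_track
        WHERE location_id = A
        ORDER BY release_date DESC LIMIT 10
      )
    
      UNION
    
      (
        SELECT track_id, release_date FROM map_location_to_track
        WHERE location_id = B
        ORDER BY release_date DESC LIMIT 10
      )
    )
      AS data
    ORDER BY
      release_date DESC
    LIMIT 10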
    
    -- NOTE: This is a UNION and not a UNION ALL
    --       The same track can be available in both locations, but should only count once
    --       It's in place of the GROUP BY in the previous 2 examples
    

    It is still possible for an optimiser to realise that these two unioned data sets are ordered, and so make the external order by very quick. Even if not, however, ordering 20 items is pretty quick. More importantly, it’s a fixed overhead: it doesn’t matter if you have a billion tracks in each location, we’re just merging two lists of 10.
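
    To make the "merging two lists of 10" point concrete, here is a minimal Python sketch of what that final step amounts to. It is only an illustration of the idea, not what any particular RDBMS does internally; the function name top_n_either and the sample rows are invented, and both inputs are assumed to be sorted newest first.

```python
import heapq
from itertools import islice


def top_n_either(list_a, list_b, n=10):
    """Merge two newest-first lists of (release_date, track_id) rows.

    A track available in both locations appears in both lists, so
    duplicates are dropped, mirroring UNION rather than UNION ALL.
    """
    # heapq.merge with reverse=True expects inputs sorted largest first.
    merged = heapq.merge(list_a, list_b, reverse=True)

    def unique(rows):
        previous = None
        for row in rows:
            # After merging sorted inputs, equal rows are always adjacent.
            if row != previous:
                yield row
            previous = row

    return list(islice(unique(merged), n))


# Invented rows: track 5 is available in both locations.
location_a = [("2024-03-10", 7), ("2024-03-08", 5), ("2024-03-01", 2)]
location_b = [("2024-03-09", 6), ("2024-03-08", 5), ("2024-02-20", 1)]

print(top_n_either(location_a, location_b, n=4))
```

    The work done here depends only on the length of the two input lists, never on how many tracks each location holds in total.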

    The hardest to optimise is the AND condition, but even then the existence of the “TOP 10” constraint can help work wonders.

    Adding a HAVING clause to the IN or OR based approaches can solve this, but, again, depending on your RDBMS, may run less than optimally.

    SELECT track_id, release_date FROM map_location_to_track
    WHERE location_id = A OR location_id = B
    GROUP BY track_id, release_date
    HAVING COUNT(*) = 2
    ORDER BY release_date DESC LIMIT 10
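
    The HAVING approach can be tried out on a toy data set. The sketch below uses Python's sqlite3 module and the IN form of the query; the helper name tracks_in_all and the sample availability data are made up. It relies on the primary key making each (location, date, track) row unique, so a track seen once per requested location must be in all of them. How well this performs on a large table is exactly the RDBMS-dependent part and is not shown here.

```python
import sqlite3


def tracks_in_all(conn, location_ids, limit=10):
    """Return the newest tracks available in every one of location_ids.

    location_ids must not contain duplicates, otherwise the count check fails.
    """
    placeholders = ", ".join("?" for _ in location_ids)
    sql = f"""
        SELECT track_id, release_date FROM map_location_to_track
        WHERE location_id IN ({placeholders})
        GROUP BY track_id, release_date
        HAVING COUNT(*) = ?
        ORDER BY release_date DESC LIMIT ?
    """
    params = (*location_ids, len(location_ids), limit)
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE map_location_to_track (
        location_id  INT,
        track_id     INT,
        release_date TEXT,
        PRIMARY KEY (location_id, release_date, track_id)
    )
""")

# Invented availability: location id -> track ids downloadable there.
availability = {1: [1, 2, 3, 4, 5, 6], 2: [2, 4, 6], 3: [4, 5, 6]}
for location_id, track_ids in availability.items():
    for track_id in track_ids:
        conn.execute(
            "INSERT INTO map_location_to_track VALUES (?, ?, ?)",
            (location_id, track_id, f"2024-01-{track_id:02d}"),
        )

print(tracks_in_all(conn, [1, 2]))
print(tracks_in_all(conn, [1, 2, 3]))
```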
    

    The alternative is to try the “two queries” approach…

    SELECT
      location_a.*
    FROM
    (
      SELECT track_id, release_date FROM map_location_to_track
      WHERE location_id = A
    )
      AS location_a
    INNER JOIN  
    (
      SELECT track_id, release_date FROM map_location_to_track
      WHERE location_id = B
    )
      AS location_b
        ON  location_a.release_date = location_b.release_date
        AND location_a.track_id     = location_b.track_id
    ORDER BY
      location_a.release_date DESC
    LIMIT 10
    

    This time we can’t restrict the two sub-queries to just 10 records; for all we know the most recent 10 in location A don’t appear in location B at all. The primary key rescues us again though. The two data sets are organised by release date, so the RDBMS can just start at the top record of each set and merge the two until it has 10 records, then stop.
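
    The "merge until it has 10 records, then stop" idea can be sketched in Python as a two-pointer walk over two newest-first streams. This is an illustration of the technique only; whether a given RDBMS actually chooses such a merge join is up to its optimiser. The names newest_in_both and counting, and the integer stand-ins for release dates, are invented for the example.

```python
from itertools import islice


def newest_in_both(stream_a, stream_b):
    """Yield (release_date, track_id) rows present in both newest-first streams."""
    iter_a, iter_b = iter(stream_a), iter(stream_b)
    a, b = next(iter_a, None), next(iter_b, None)
    while a is not None and b is not None:
        if a == b:
            yield a
            a, b = next(iter_a, None), next(iter_b, None)
        elif a > b:
            # a is newer than anything still left in b, so it cannot match.
            a = next(iter_a, None)
        else:
            b = next(iter_b, None)


def counting(rows, reads, name):
    """Pass rows through unchanged while counting how many were read."""
    for row in rows:
        reads[name] += 1
        yield row


# Invented data: integers stand in for release dates, newest first.
# Location A has 1000 tracks; location B has every second one.
location_a = [(day, day) for day in range(1000, 0, -1)]
location_b = [(day, day) for day in range(1000, 0, -2)]

reads = {"a": 0, "b": 0}
top_3 = list(islice(
    newest_in_both(counting(location_a, reads, "a"),
                   counting(location_b, reads, "b")),
    3,
))

print(top_3)
print(reads)
```

    Because the walk stops as soon as enough matches are found, only a handful of rows are read from each stream in this example, even though each location holds hundreds of tracks. In the worst case (few or no common tracks) the walk still has to read everything.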

    NOTE: Because the release_date is in the primary key, and before the track_id, one should ensure that it is used in the join.

    Depending on the RDBMS, you don’t even need the sub-queries. You may be able to just self-join the table without altering the RDBMS’ plan…

    SELECT
      location_a.*
    FROM
      map_location_to_track AS location_a
    INNER JOIN  
      map_location_to_track AS location_b
        ON  location_a.release_date = location_b.release_date
        AND location_a.track_id     = location_b.track_id
    WHERE
          location_a.location_id = A
      AND location_b.location_id = B
    ORDER BY
      location_a.release_date DESC
    LIMIT 10
    

    All in all, the combination of three things makes this pretty efficient:
    – Partially De-Normalising the data to ensure it’s in a friendly order for our needs
    – Knowing we only ever need the first 10 results
    – Knowing we’re only ever dealing with 2 locations at the most

    There are variations that can optimise for any number of records and any number of locations, but these are significantly less performant than the solutions to the problem stated in this question.
