Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8158849
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 6, 20262026-06-06T17:44:40+00:00 2026-06-06T17:44:40+00:00

I have two tables: servers and stats servers has a column called id that

  • 0

I have two tables: “servers” and “stats”

servers has a column called “id” that auto-increments.
stats has a column called “server” that corresponds to a row in the servers table, a column called “time” that represents the time it was added, and a column called “votes” that I would like to get the average of.

I would like to fetch all the servers (SELECT * FROM servers) along with the average votes of the 24 most recent rows that correspond to each server. I believe this is a “greatest-n-per-group” question.

This is what I tried to do, but it gave me 24 rows total, not 24 rows per group:

SELECT servers.*,
       IFNULL(AVG(stats.votes), 0) AS avgvotes
FROM servers
LEFT OUTER JOIN
  (SELECT server,
          votes
   FROM stats
   GROUP BY server
   ORDER BY time DESC LIMIT 24) AS stats ON servers.id = stats.server
GROUP BY servers.id

Like I said, I would like to get the 24 most recent rows for each server, not 24 most recent rows total.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-06T17:44:42+00:00Added an answer on June 6, 2026 at 5:44 pm

    This is another approach.

    This query is going to suffer the same performance problems as other queries here that return correct results, because the execution plan for this query is going to require a SORT operation on EVERY row in the stats table. Since there is no predicate (restriction) on the time column, EVERY row in the stats table will be considered. For a REALLY large stats table, this is going to blow out all available temporary space before it dies a horrible death. (More notes on performance below.)

    SELECT r.*
         , IFNULL(s.avg_votes,0)
      FROM servers r
      LEFT 
      JOIN ( SELECT t.server
                  , AVG(t.votes) AS avg_votes
               FROM ( SELECT CASE WHEN u.server = @last_server 
                               THEN @i := @i + 1
                               ELSE @i := 1 
                             END AS i
                           , @last_server := u.server AS `server`
                           , u.votes AS votes
                        FROM (SELECT @i := 0, @last_server := NULL) i
                        JOIN ( SELECT v.server, v.votes
                                 FROM stats v
                                ORDER BY v.server DESC, v.time DESC
                             ) u
                    ) t
              WHERE t.i <= 24
              GROUP BY t.server
           ) s
        ON s.server = r.id
    

    What this query is doing is sorting the stats table, by server and by descending order on the time column. (Inline view aliased as u.)

    With the sorted result set, we assign a row numbers 1,2,3, etc. to each row for each server. (Inline view aliased as t.)

    With that result set, we filter out any rows with a rownumber > 24, and we calculate an average of the votes column for the “latest” 24 rows for each server. (Inline view aliased as s.)

    As a final step, we join that to the servers table, to return the requested resultset.


    NOTE:

    The execution plan for this query will be COSTLY for a large number of rows in the stats table.

    To improve performance, there are several approaches we could take.

    The simplest might to be include in the query a predicate the EXCLUDES a significant number of rows from the stats table (e.g. rows with time values over 2 days old, or over 2 weeks old). That would significantly reduce the number of rows that need to be sorted, to determine the “latest” 24 rows.

    Also, with an index on stats(server,time), it’s also possible that MySQL could do a relatively efficient “reverse scan” on the index, avoiding a sort operation.

    We could also consider implementing an index on the stats table on (server,"reverse_time"). Since MySQL doesn’t yet support descending indexes, the implementation would really be a regular (ascending) index on an a derived rtime value (a “reverse time” expression that is ascending for descending values of time (for example, -1*UNIX_TIMESTAMP(my_timestamp) or -1*TIMESTAMPDIFF('1970-01-01',my_datetime).

    Another approach to improve performance would be to keep a shadow table containing the most recent 24 rows for each server. That would be simplest to implement if we can guarantee that “latest rows” won’t be deleted from the stats table. We could maintain that table with a trigger. Basically, whenever a row is inserted into the stats table, we check if the time on the new rows is later than the earliest time stored for the server in the shadow table, if it is, we replace the earliest row in the shadow table with the new row, being sure to keep no more than 24 rows in the shadow table for each server.

    And, yet another approach is to write a procedure or function that gets the result. The approach here would be to loop through each server, and run a separate query against the stats table to get the average votes for the latest 24 rows, and gather all of those results together. (That approach mighty really be more of a workaround to avoiding a sort on huge temporary set, just to enable the resultset to be returned, not necessarily making the return of the resultset blazingly fast.)

    The bottom line for performance of this type of query on a LARGE table is restricting the number of rows considered by the query AND avoiding a sort operation on a large set. That’s how we get a query like this to perform.


    ADDENDUM

    To get a “reverse index scan” operation (to get the rows from stats ordered using an index WITHOUT a filesort operation), I had to specify DESCENDING on both expressions in the ORDER BY clause. The query above previously had ORDER BY server ASC, time DESC, and MySQL always wanted to do a filesort, even specifying the FORCE INDEX FOR ORDER BY (stats_ix1) hint.

    If the requirement is to return an ‘average votes’ for a server only if there are at least 24 associated rows in the stats table, then we can make a more efficient query, even if it is a bit more messy. (Most of the messiness in the nested IF() functions is to deal with NULL values, which do not get included in the average. It can be much less messy if we have a guarantee that votes is NOT NULL, or if we exclude any rows where votes is NULL.)

    SELECT r.*
         , IFNULL(s.avg_votes,0)
      FROM servers r
      LEFT 
      JOIN ( SELECT t.server
                  , t.tot/NULLIF(t.cnt,0) AS avg_votes
               FROM ( SELECT IF(v.server = @last_server, @num := @num + 1, @num := 1) AS num
                           , @cnt := IF(v.server = @last_server,IF(@num <= 24, @cnt := @cnt + IF(v.votes IS NULL,0,1),@cnt := 0),@cnt := IF(v.votes IS NULL,0,1)) AS cnt
                           , @tot := IF(v.server = @last_server,IF(@num <= 24, @tot := @tot + IFNULL(v.votes,0)      ,@tot := 0),@tot := IFNULL(v.votes,0)      ) AS tot
                           , @last_server := v.server AS SERVER
                        -- , v.time
                        -- , v.votes
                        -- , @tot/NULLIF(@cnt,0) AS avg_sofar
                        FROM (SELECT @last_server := NULL, @num:= 0, @cnt := 0, @tot := 0) u
                        JOIN stats v FORCE INDEX FOR ORDER BY (stats_ix1)
                       ORDER BY v.server DESC, v.time DESC
                    ) t
              WHERE t.num = 24
           ) s
        ON s.server = r.id
    

    With a covering index on stats(server,time,votes), the EXPLAIN showed MySQL avoided a filesort operation, so it must have used a “reverse index scan” to return the rows in order. Absent the covering index, and index on ‘(server,time), MySQL used the index if I included an index hint, with theFORCE INDEX FOR ORDER BY (stats_ix1)` hint, MySQL avoided a filesort as well. (But since my table had less than 100 rows, I don’t think MySQL put much emphasis on avoiding a filesort operation.)

    The time, votes, and avg_sofar expressions are commented out (in the inline view aliased as t); they aren’t needed, but they are for debugging.

    The way that query stands, it needs at least 24 rows in stats for each server, in order to return an average. (That may be acceptable.) But I was thinking that in general, we could return a running total, total so far (tot) and a running count (cnt).

    (If we replace the WHERE t.num = 24 with WHERE t.num <= 24, we can see the running average in action.)

    To return the average where there aren’t at least 24 rows in stats, that’s really a matter of identifying the row (for each server) with the maximum value of num that is <= 24.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

In SQL Server 2008 I have two tables. One table for users: id_user -
I have a sql server 2005 table called ZipCode, which has all the US
I have two SQL Server tables with Primary Keys (PK) and a Foreign Key
We have two tables in a SQL Server 2005 database, A and B.There is
In SQL Server 2008: I have two tables, dtlScheme and dtlRenewal, with a one
Good Afternoon All, I have two tables in my SQL Server 2005 DB, Main
I have two identical SQL Server tables ( SOURCE and DESTINATION ) with lots
I have SQL Server 2008 Ent and OLTP database with two big tables. How
i have two tables products and reviews each product has several reviews linked by
I have two tables, testInstance and bugzilla that are associated by a third one,

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.