Editorial Team
Asked: May 18, 2026

I have the following query: SELECT location, step, COUNT(*), AVG(foo), YEAR(start), MONTH(start), DAY(start) FROM

I have the following query:

SELECT location, step, COUNT(*), AVG(foo), YEAR(start), MONTH(start), DAY(start)
FROM table WHERE jobid = 'xxx' AND start BETWEEN '2010-01-01' AND '2010-01-08'
GROUP BY location, step, YEAR(start), MONTH(start), DAY(start)

Originally I had indexes on individual columns, such as jobid and start, but quickly realized that MySQL only really honors one index per table in a select. As such, it would use the jobid index and then do a pretty large scan to filter out by the start range.

Adding an index on (jobid, start) helped quite a bit, but the GROUP BY is still causing performance issues. I’ve read the docs on GROUP BY optimizations and understand that in order to benefit from these optimizations I need an index that contains (location, step, start), but I still have two open questions:

  1. Will the GROUP BY optimizations even work with the time functions (YEAR, MONTH, DAY, etc.)? Or am I going to have to store these values as separate columns? The reason I like using the functions is that I can control the time zone on a per-connection basis and get back results tailored to the end user's time zone. If I have to pre-store the year, month, and day, I'll store them in UTC and then all my users will just get reports in UTC.

  2. Even if I can solve issue #1, can I even do this? The index (jobid, start) helped with the WHERE clause, but the GROUP BY needs a different index to be optimized: (location, step, start) or, depending on the answer to #1, (location, step, year, month, day). The problem is that those two indexes don't share a common left-hand set of columns, so I don't believe my WHERE and GROUP BY can be made compatible such that the same index gets used. So my question is: am I just hosed here?
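
The per-connection time zone behaviour described in question 1 can be sketched as follows. This is only an illustration: it assumes the table is named `step` (as in the SHOW CREATE TABLE output further down), the offset `'-08:00'` is a made-up example value, and `AVG(foo)` is left out because `foo` does not appear in the posted table definition.

```sql
-- Session-level setting: it affects only the current connection.
-- The offset is a hypothetical value for one end user.
SET time_zone = '-08:00';

-- TIMESTAMP values are stored in UTC and converted to the session time zone
-- when read, so YEAR/MONTH/DAY bucket the rows by that user's local day.
SELECT location, step, COUNT(*), YEAR(start), MONTH(start), DAY(start)
FROM step
WHERE jobid = 'xxx' AND start BETWEEN '2010-01-01' AND '2010-01-08'
GROUP BY location, step, YEAR(start), MONTH(start), DAY(start);
```

Because the grouping expressions depend on a session setting, their values cannot simply be read from an index on `start`, which is the heart of question 1.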

Any other thoughts on how to achieve this would be helpful. And, just to preempt a few questions/comments that might come up:

  1. Yes, this is a time-series data set.
  2. Yes, it would benefit from something like RRDtool, but doing so would cause me to lose timezone-specific results.
  3. Yes, pre-calculating rollups would probably be a good idea, but I don’t need awesome performance and so I’m OK with good performance if it lets me customize the results for each user’s timezone.

With the above said, if anyone has any design suggestions on how to do something like rollups or round-robin databases and still get timezone-specific results, I’m all ears!


Update: as requested, here is some more info:

show indexes from output:

Table  Non_unique  Key_name  Seq_in_index  Column_name  Collation  Cardinality  Sub_part  Packed  Null  Index_type
step   0           PRIMARY   1             step_id      A          16           NULL      NULL          BTREE
step   1           start     1             start        A          16           NULL      NULL          BTREE
step   1           step      1             step         A          2            NULL      NULL          BTREE
step   1           foo       1             foo          A          16           NULL      NULL    YES   BTREE
step   1           location  1             location     A          2            NULL      NULL    YES   BTREE
step   1           jobid     1             jobid        A          2            NULL      NULL    YES   BTREE

show create table output:

CREATE TABLE `step` (
  `start` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
  `step` smallint(2) unsigned NOT NULL,
  `step_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
  `location` varchar(12) DEFAULT NULL,
  `jobid` varchar(37) DEFAULT NULL,
  PRIMARY KEY (`step_id`),
  KEY `start_time` (`start`),
  KEY `step` (`step`),
  KEY `location` (`location`),
  KEY `job_id` (`jobid`)
) ENGINE=InnoDB AUTO_INCREMENT=240 DEFAULT CHARSET=utf8
1 Answer

  1. Editorial Team
    Added an answer on May 18, 2026 at 10:29 pm

    Create a single composite index on (jobid, start, location, step).

    Then group by the date parts first, followed by location and step, and sort the result afterwards:

    SELECT location, step, COUNT(*), AVG(foo), YEAR(start), MONTH(start), DAY(start)
    FROM table WHERE jobid = 'xxx' AND start BETWEEN '2010-01-01' AND '2010-01-08'
    GROUP BY YEAR(start), MONTH(start), DAY(start), location, step
    ORDER BY location, step, YEAR(start), MONTH(start), DAY(start)
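
    A minimal sketch of that suggestion, assuming the table is named `step` as in the question's SHOW CREATE TABLE output. The index name `idx_job_start_loc_step` is hypothetical, and `AVG(foo)` is omitted because `foo` is not in the posted table definition.

```sql
-- Composite index covering the WHERE columns first, then the grouping columns.
ALTER TABLE step
  ADD INDEX idx_job_start_loc_step (jobid, start, location, step);

-- EXPLAIN reports the chosen index in the `key` column; "Using temporary"
-- or "Using filesort" in `Extra` means the GROUP BY / ORDER BY is not
-- being resolved from the index alone.
EXPLAIN
SELECT location, step, COUNT(*), YEAR(start), MONTH(start), DAY(start)
FROM step
WHERE jobid = 'xxx' AND start BETWEEN '2010-01-01' AND '2010-01-08'
GROUP BY YEAR(start), MONTH(start), DAY(start), location, step
ORDER BY location, step, YEAR(start), MONTH(start), DAY(start);
```

    Checking the `Extra` column before and after adding the index is the quickest way to see whether the grouping step actually benefits.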
    

    UPDATE

    It looks like MySQL cannot use the index for the GROUP BY when the YEAR, MONTH and DAY functions are used, because:

    1. After removing start from the WHERE clause, EXPLAIN still shows "Using filesort".
    2. Adding three columns (y = YEAR(start), m = MONTH(start), d = DAY(start)), creating an index on (jobid, y, m, d, location, step), and changing the WHERE clause to ... AND y = 2010 AND m = 12 AND d BETWEEN 1 AND 08 does remove "Using temporary; Using filesort".

    Keeping three extra columns seems like a bad idea, though: whether or not the GROUP BY uses a temporary table shouldn't make that much difference to performance.
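
    For reference, the second experiment could look roughly like this. It is a sketch under assumptions: the table name `step` comes from the question, while the column types and the index name `idx_job_ymd_loc_step` are illustrative choices not given in the answer.

```sql
-- Denormalised date parts (types are an assumption).
ALTER TABLE step
  ADD COLUMN y SMALLINT UNSIGNED NULL,
  ADD COLUMN m TINYINT UNSIGNED NULL,
  ADD COLUMN d TINYINT UNSIGNED NULL;

-- The stored values reflect the session time zone in effect when this runs,
-- which is why pre-storing them ties every report to a single time zone.
UPDATE step SET y = YEAR(start), m = MONTH(start), d = DAY(start);

ALTER TABLE step
  ADD INDEX idx_job_ymd_loc_step (jobid, y, m, d, location, step);

-- Compare the Extra column of this plan with the function-based query.
EXPLAIN
SELECT location, step, COUNT(*), y, m, d
FROM step
WHERE jobid = 'xxx' AND y = 2010 AND m = 12 AND d BETWEEN 1 AND 8
GROUP BY y, m, d, location, step;
```

    The trade-off is the one the question already raised: the plan gets simpler, but the per-connection time zone flexibility is lost.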
