The Archive Base

Editorial Team
Asked: May 12, 2026 (2026-05-12T17:15:39+00:00)

How to speed up select count(*) with group by?

The query below is too slow, and it is used very frequently. I'm having big trouble running select count(*) with group by on a table that has more than 3,000,000 rows.

select object_title, count(*) as hot_num
from relations
where relation_title = 'XXXX'
group by object_title

relation_title and object_title are both varchar.
The condition relation_title='XXXX' matches more than 1,000,000 rows, so the index on object_title does not work well.

1 Answer

  1. Editorial Team
    Added an answer on May 12, 2026 at 5:15 pm (2026-05-12T17:15:40+00:00)

    Here are several things I’d try, in order of increasing difficulty:

    (easier) – Make sure you have the right covering index

    CREATE INDEX ix_temp ON relations (relation_title, object_title);
    

    This should maximize perf given your existing schema. Unless your version of MySQL's optimizer is really dumb, it will minimize the number of I/Os needed to satisfy your query: with relation_title first, only the index range for the matching rows has to be read, whereas with the columns in the reverse order the whole index must be scanned. It also covers the query, so you won't have to touch the clustered index.
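
    One way to check whether the index really covers the query is to run EXPLAIN on it. This is only a sketch: it assumes the ix_temp index above has already been created, and the exact plan output depends on your MySQL version and data.

    ```sql
    -- Ask the optimizer how it plans to run the original query.
    EXPLAIN
    SELECT object_title, COUNT(*) AS hot_num
    FROM relations
    WHERE relation_title = 'XXXX'
    GROUP BY object_title;

    -- What to look for in the output:
    --   key   = ix_temp       (the composite index was chosen)
    --   Extra = "Using index" (the query is answered from the index alone)
    ```

    If Extra also shows "Using temporary" or "Using filesort", the GROUP BY is probably not being resolved from index order, and the index definition is worth rechecking.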

    (a little harder) – make sure your varchar fields are as small as possible

    One of the perf challenges with varchar columns on MySQL is that some query-processing steps allocate the full declared size of the field in RAM rather than the actual length of the value; in-memory temporary tables, for example, have traditionally used fixed-length rows. So if you have a varchar(256) but are only using 4 chars, you can still be paying for the full 256-character allocation while the query is being processed. Ouch! So if you can shrink your varchar limits easily, this should speed up your queries.
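
    A rough way to go about this is to measure the longest values actually stored and then tighten the column definitions. The new sizes below (64 and 128) are made-up placeholders, not values taken from the question, so substitute whatever your own data supports.

    ```sql
    -- Measure the longest value currently stored in each column.
    SELECT MAX(CHAR_LENGTH(relation_title)) AS max_relation_len,
           MAX(CHAR_LENGTH(object_title))   AS max_object_len
    FROM relations;

    -- Hypothetical new limits; leave some headroom above the measured maxima.
    -- MODIFY replaces the whole column definition, so restate any attributes
    -- the columns already have (NOT NULL, DEFAULT, character set, and so on).
    ALTER TABLE relations
      MODIFY relation_title VARCHAR(64),
      MODIFY object_title   VARCHAR(128);
    ```

    On a table of this size the ALTER may rebuild the table, so it is worth trying on a copy first.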

    (harder) – Normalize

    Roughly a third of your rows sharing a single string value is a clear cry for normalizing into another table so you're not duplicating strings millions of times. Consider normalizing into three tables (for example, one for relation titles, one for object titles, and one linking the two) and using integer IDs to join them.

    In some cases, you can normalize under the covers and hide the normalization with views which match the name of the current table… then you only need to make your INSERT/UPDATE/DELETE queries aware of the normalization but can leave your SELECTs alone.
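
    A minimal sketch of what such a normalized layout could look like is below. Every table, column, index, and view name here (relation_titles, object_titles, relation_links, relations_view) is hypothetical, as are the column sizes; the original schema is not shown in the question.

    ```sql
    -- Hypothetical lookup tables: one row per distinct string.
    CREATE TABLE relation_titles (
      id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      title VARCHAR(128) NOT NULL,  -- assumed size
      UNIQUE KEY uq_relation_title (title)
    );

    CREATE TABLE object_titles (
      id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      title VARCHAR(128) NOT NULL,  -- assumed size
      UNIQUE KEY uq_object_title (title)
    );

    -- Hypothetical link table holding only integer IDs.
    CREATE TABLE relation_links (
      relation_title_id INT UNSIGNED NOT NULL,
      object_title_id   INT UNSIGNED NOT NULL,
      KEY ix_rel_obj (relation_title_id, object_title_id)
    );

    -- Count on the narrow integer index first, then resolve the titles.
    SELECT o.title AS object_title, c.hot_num
    FROM (
      SELECT l.object_title_id, COUNT(*) AS hot_num
      FROM relation_links AS l
      JOIN relation_titles AS r ON r.id = l.relation_title_id
      WHERE r.title = 'XXXX'
      GROUP BY l.object_title_id
    ) AS c
    JOIN object_titles AS o ON o.id = c.object_title_id;

    -- A view that presents the old column names to existing SELECTs.
    CREATE VIEW relations_view AS
    SELECT r.title AS relation_title, o.title AS object_title
    FROM relation_links AS l
    JOIN relation_titles AS r ON r.id = l.relation_title_id
    JOIN object_titles  AS o ON o.id = l.object_title_id;
    ```

    To hide the change completely, the view could take over the name of the current table once the original has been renamed or dropped. Queries that go through the view will not necessarily get the same plan as the hand-written count above, so check them with EXPLAIN.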

    (hardest) – Hash your string columns and index the hashes

    If normalizing means changing too much code, but you can change your schema a little bit, you may want to consider creating 128-bit hashes for your string columns (using the MD5 function). In this case (unlike normalization) you don't have to change all your queries, only the INSERTs and some of the SELECTs. Hash your string fields, then create an index on the hashes, e.g.

    CREATE INDEX ix_temp ON relations (relation_title_hash, object_title_hash);
    

    Note that you’ll need to play around with the SELECT to make sure you are doing the computation via the hash index and not pulling in the clustered index (required to resolve the actual text value of object_title in order to satisfy the query).

    Also, if relation_title has a small varchar size but object_title has a long one, then you can potentially hash only object_title and create the index on (relation_title, object_title_hash).

    Note that this solution only helps if one or both of these fields is very long relative to the size of the hashes.

    Also note that there are interesting case-sensitivity/collation impacts from hashing, since the hash of a lowercase string is not the same as the hash of an uppercase one. So you'll need to make sure you canonicalize the strings before hashing them. In other words, if you're in a case-insensitive DB, lowercase every string before you hash it. You may also want to trim spaces from the beginning or end, depending on how your DB handles leading/trailing spaces.
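
    Putting the hashing steps together, a sketch might look like the following. It reuses the relation_title_hash and object_title_hash column names and the ix_temp index from above. The BINARY(16) column type and the LOWER/TRIM canonicalization are assumptions about your data and collation, not something taken from the question.

    ```sql
    -- Add hash columns; MD5 is 128 bits, so 16 raw bytes per value.
    ALTER TABLE relations
      ADD COLUMN relation_title_hash BINARY(16) NULL,
      ADD COLUMN object_title_hash   BINARY(16) NULL;

    -- Backfill existing rows, canonicalizing before hashing.
    -- On millions of rows you may prefer to run this in batches.
    UPDATE relations
    SET relation_title_hash = UNHEX(MD5(LOWER(TRIM(relation_title)))),
        object_title_hash   = UNHEX(MD5(LOWER(TRIM(object_title))));

    -- Same index as above, built on the narrow hash columns.
    CREATE INDEX ix_temp ON relations (relation_title_hash, object_title_hash);

    -- Count per hash; apply the same canonicalization to the search value.
    SELECT object_title_hash, COUNT(*) AS hot_num
    FROM relations
    WHERE relation_title_hash = UNHEX(MD5(LOWER(TRIM('XXXX'))))
    GROUP BY object_title_hash;
    ```

    This query returns hashes rather than titles. Resolving each hash back to its text is the step that touches the clustered index, so it is best done only for the grouped result rather than for every matching row. New INSERTs and UPDATEs would also need to populate the hash columns in the same way.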
