Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8932907
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 15, 20262026-06-15T09:24:45+00:00 2026-06-15T09:24:45+00:00

I have the following query: select t.Chunk as LeftChunk, t.ChunkHash as LeftChunkHash, q.Chunk as

  • 0

I have the following query:

select 
    t.Chunk as LeftChunk,
    t.ChunkHash as LeftChunkHash,
    q.Chunk as RightChunk,
    q.ChunkHash as RightChunkHash,
    count(t.ChunkHash) as ChunkCount
from
    chunks as t
    join
    chunks as q
    on
        t.ID = q.ID
group by LeftChunkHash, RightChunkHash

And the following explain table:

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  t   ALL IDIndex NULL    NULL    NULL    17796190    "Using temporary; Using filesort"
1   SIMPLE  q   ref IDIndex IDIndex 4   sotero.t.Id 12  

note the “using temporary; using filesort”.

When this query is run, I quickly run out of RAM (presumably b/c of the temp table), and then the HDD kicks in, and the query slows to a halt.

I thought it might be an index issue, so I started adding a few that sort of made sense:

Table   Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
chunks  0   PRIMARY 1   ChunkId A   17796190    NULL    NULL        BTREE       
chunks  1   ChunkHashIndex  1   ChunkHash   A   243783  NULL    NULL        BTREE       
chunks  1   IDIndex 1   Id  A   1483015 NULL    NULL        BTREE       
chunks  1   ChunkIndex  1   Chunk   A   243783  NULL    NULL        BTREE       
chunks  1   ChunkTypeIndex  1   ChunkType   A   2   NULL    NULL        BTREE       
chunks  1   chunkHashByChunkIDIndex 1   ChunkHash   A   243783  NULL    NULL        BTREE       
chunks  1   chunkHashByChunkIDIndex 2   ChunkId A   17796190    NULL    NULL        BTREE       
chunks  1   chunkHashByChunkTypeIndex   1   ChunkHash   A   243783  NULL    NULL        BTREE       
chunks  1   chunkHashByChunkTypeIndex   2   ChunkType   A   261708  NULL    NULL        BTREE       
chunks  1   chunkHashByIDIndex  1   ChunkHash   A   243783  NULL    NULL        BTREE       
chunks  1   chunkHashByIDIndex  2   Id  A   17796190    NULL    NULL        BTREE       

But still using the temporary table.

The db engine is MyISAM.

How can I get rid of the using temporary; using filesort in this query?

Just changing to InnoDB w/o explaining the underlying cause is not a particularly satisfying answer. Besides, if the solution is to just add the proper index, then that’s much easier than migrating to another db engine.

I am new to relational databases. So I’m hoping that the solution is something obvious to the experts.

EDIT1:

ID is not the primary key. ChunkID is. There are approximately 40 ChunkIDs for each ID. So adding an additional ID to the table adds about 40 rows. Each unique chunk has a unique chunkHash associated with it.

EDIT2:

Here’s the schema:

Field   Type    Null    Key Default Extra
ChunkId int(11) NO  PRI NULL    
ChunkHash   int(11) NO  MUL NULL    
Id  int(11) NO  MUL NULL    
Chunk   varchar(255)    NO  MUL NULL    
ChunkType   varchar(255)    NO  MUL NULL    

EDIT 3:

The end objective of the query is to create a table of word co-occurrences across documents. ChunkIDs are word instances. Each instance is a word that is associated with a particular document (ID). About 40 words present per document. About 1 million documents. So the resulting table of co-occurrences is highly compressed compared to the full cross-product temporary table that is (apparently) being created. That is, the full cross-product temp table is 1 mil * 40 * 40 = 1.6 billion rows. The compressed resulting table is estimated at about 40 million rows.

EDIT 4:

Adding postgresql tag to see if any postgresql users can get a better execution plan on that SQL implementation. If that’s the case, I’ll switch over.

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-15T09:24:45+00:00Added an answer on June 15, 2026 at 9:24 am

    I migrated from MySQL to PostgreSQL, and query execution time went from ~1.5 days to ~10 mins.

    Here’s the PostgreSQL query execution plan:

    enter image description here

    I am no longer using MySQL.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have the following query: SELECT COUNT(*) FROM FirstTable ft INNER JOIN SecondTable st
I have the following query: SELECT DISTINCT w.name, count(*) FROM widgets AS w JOIN
I have the following query: SELECT dev.DeviceName, Count(dom.DomainID) AS CountOfDomains FROM tblDevices dev JOIN
i have following query SELECT *, count(jx_commissions.commission_amount) AS summe FROM jx_members INNER JOIN jx_commissions
I have following SQL query: SELECT Count(*) AS CountOfRecs FROM tblAccount INNER JOIN tblAccountOwner
I have this following query: select teams.`team_name`, count(matches.`played_team`) from teams join matches on teams.`team_number`
I have the following query SELECT e.topicShortName, d.catalogFileID, e.topicID FROM catalog_topics a LEFT JOIN
I have the following query SELECT * FROM attend RIGHT OUTER JOIN noattend ON
I have the following query SELECT COUNT(*) FROM Table1 WHERE Column1 IN (SELECT Column1
I have the following query: SELECT `people`.`surename`, calcWeight(`people`.`surname`,'kiera',6) as `weight` FROM `people` LEFT JOIN

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.