The Archive Base

Editorial Team
Asked: May 14, 2026

I have two tables, both having more than 20 million records; table1 is a list of terms, and table2 is a list of keywords that may or may not appear in those terms. I need to identify the terms that contain a keyword.
The ‘term’ field is a VARCHAR(320) and the ‘keyword’ field is a VARCHAR(64).

My current strategy is:

SELECT table1.term, table2.keyword
FROM table1
INNER JOIN table2
  ON table1.term LIKE CONCAT('%', table2.keyword, '%');

This does not work in practice: it takes forever.
As far as I can tell, the server is not the bottleneck (see notes).

How might I rewrite this so that it runs in under a day?

I have considered in-memory tables, or switching to InnoDB and making the buffer pool big enough to hold both tables. Unfortunately, MySQL runs each query on a single thread, so the join uses only one CPU, but I have 4 cores (well, “8” with hyperthreading); if I could distribute the workload across them, that would be fantastic.

Notes:

  1. Regarding server optimization: both tables are MyISAM and have unique indexes on the matching fields. The MyISAM key buffer is larger than the sum of both index file sizes, and it is not even being fully used (key_blocks_unused is … large). The server is a 2x dual-core Xeon 2U machine with fast SAS drives and 8 GB of RAM, tuned for the MySQL workload.

  2. I just remembered that I only index the first 80 characters of the ‘term’ field (to save disk space); not sure if this is hurting or helping.

  3. MySQL 5.0.32, Debian Lenny x86_64


1 Answer

  1. Editorial Team
    Added an answer on May 14, 2026 at 6:22 pm

    You want to set up a full-text index, then search against that. Right now, your unique index probably isn’t helping the search at all: because of the leading ‘%’ in the pattern, the match can start anywhere in the term, so the index cannot be used to narrow it down.

    That means it’s almost certainly running a full scan of table1 for each row in table2. Calling that grossly inefficient is putting it nicely. Building a full-text index is somewhat slow (though probably faster than what you’re doing right now), but once that’s done, the searching should go a lot faster.
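To put a rough number on that full-scan cost, here is a minimal sketch in Python rather than SQL. It assumes tiny in-memory lists standing in for table1 and table2; the function name and the sample data are made up for illustration. It mimics what the LIKE join has to do and counts the comparisons.

```python
def naive_substring_join(terms, keywords):
    """Mimic the LIKE '%keyword%' join: test every keyword against every term."""
    comparisons = 0
    matches = []
    for keyword in keywords:
        for term in terms:
            comparisons += 1
            # Plain substring test; unlike MySQL's default collation,
            # this is case-sensitive.
            if keyword in term:
                matches.append((term, keyword))
    return matches, comparisons


# Hypothetical toy data, not taken from the real tables.
terms = ["cheap flights to paris", "paris hotels", "used cars"]
keywords = ["paris", "cars", "boat"]

matches, comparisons = naive_substring_join(terms, keywords)
print(matches)
print(comparisons)  # 3 keywords x 3 terms = 9

# The same nested loop at the sizes given in the question
# (roughly 20 million rows in each table).
estimated_comparisons = 20_000_000 * 20_000_000
print(f"{estimated_comparisons:.1e}")  # 4.0e+14
```

At around 4 x 10^14 substring tests, even a very fast server would need far longer than a day, which is why tuning buffers alone is unlikely to rescue the original query.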

    As to what to use for the full-text indexing: while MySQL has a built-in full-text indexing capability, I doubt it’ll help you much; with 20 million rows, its performance is pretty poor (at least in my experience). Sphinx is a bit more work to set up, but is a lot more likely to give you adequate performance.
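To illustrate why a word-level index helps, here is a small sketch of the general inverted-index idea. It is not how MySQL or Sphinx actually implement it, and every name in it is hypothetical. It assumes terms can be split on whitespace and that keywords should match whole words, which is the usual behaviour of a full-text index and a real difference from LIKE '%keyword%'.

```python
from collections import defaultdict


def build_word_index(terms):
    """Map each lowercased word to the set of term positions containing it."""
    index = defaultdict(set)
    for term_id, term in enumerate(terms):
        for word in term.lower().split():
            index[word].add(term_id)
    return index


def match_keywords(terms, keywords):
    """Find (term, keyword) pairs via index lookups instead of full scans."""
    index = build_word_index(terms)
    matches = []
    for keyword in keywords:
        words = keyword.lower().split()
        if not words:
            continue
        # Candidate terms must contain every word of the keyword.
        candidate_ids = set.intersection(*(index.get(w, set()) for w in words))
        for term_id in sorted(candidate_ids):
            # Confirm multi-word keywords appear as a contiguous phrase.
            if keyword.lower() in terms[term_id].lower():
                matches.append((terms[term_id], keyword))
    return matches


# Hypothetical toy data, not taken from the real tables.
terms = ["Cheap flights to Paris", "paris hotels", "used cars", "carpool lanes"]
keywords = ["paris", "car", "used cars", "boat"]

for term, keyword in match_keywords(terms, keywords):
    print(f"{keyword!r} found in {term!r}")
```

Each keyword now costs a few dictionary lookups instead of a pass over every term. Note that "car" finds nothing here, even though LIKE '%car%' would match both "used cars" and "carpool lanes"; if you really need matches inside words, a word-based index on its own will not reproduce the original query's results.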
