The Archive Base

Editorial Team
Asked: June 9, 2026

Currently I have around 900,000 entries in the data_article_key_terms table to associate key terms

Currently I have around 900,000 entries in the data_article_key_terms table to associate key terms to their respective articles. The goal is to be able to select an arbitrary date range and display the top 15 key terms based on the articles in that date range.

The problem that I’m running into is that the query takes almost 6 seconds, and I need it to be faster than that. I realize that this depends on the system I’m running on, and I could use a more powerful machine, but I’m trying to optimize the query as far as I can before I go that route.

I’m using InnoDB as the MySQL storage engine to preserve data integrity. As I understand it MyISAM is faster with count(*), but using that engine is also not an option.

I’ve also considered storing the key term counts in a table based on fixed time ranges, but that ends up being a lot of data to store and keep track of.

Does anyone have a good suggestion on how to optimize this query?

I have the following tables:

This table stores article information:

CREATE TABLE `data_article` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `connection_id` int(11) NOT NULL,
  `folder_id` int(11) NOT NULL,
  `user_id` int(11) NOT NULL,
  `uid` varchar(100) NOT NULL,
  `date` date NOT NULL,
  `influencer_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `data_article_5930b15a` (`connection_id`),
  KEY `data_article_4e5f642` (`folder_id`),
  KEY `data_article_fbfc09f1` (`user_id`),
  KEY `data_article_43ae76a1` (`influencer_id`),
  KEY `data_article_date` (`date`),
  CONSTRAINT `connection_id_refs_id_b2ae9152` FOREIGN KEY (`connection_id`) REFERENCES `account_connection` (`id`),
  CONSTRAINT `folder_id_refs_id_e343586a` FOREIGN KEY (`folder_id`) REFERENCES `account_folder` (`id`),
  CONSTRAINT `influencer_id_refs_id_45cd3615` FOREIGN KEY (`influencer_id`) REFERENCES `data_influencer` (`id`),
  CONSTRAINT `user_id_refs_id_aca13cc9` FOREIGN KEY (`user_id`) REFERENCES `auth_user` (`id`)
)

This table stores key terms:

CREATE TABLE `data_keyterm` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `term` varchar(100) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `data_keyterm_term` (`term`)
)

This table stores the relationship between articles and key terms:

CREATE TABLE `data_article_key_terms` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `article_id` int(11) NOT NULL,
  `keyterm_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `article_id` (`article_id`,`keyterm_id`),
  KEY `data_article_key_terms_30525a19` (`article_id`),
  KEY `data_article_key_terms_1d848ca4` (`keyterm_id`),
  CONSTRAINT `article_id_refs_id_d87be8f5` FOREIGN KEY (`article_id`) REFERENCES `data_article` (`id`),
  CONSTRAINT `keyterm_id_refs_id_50d233f8` FOREIGN KEY (`keyterm_id`) REFERENCES `data_keyterm` (`id`)
)

This table stores influencers that are associated with the articles:

CREATE TABLE `data_influencer` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) NOT NULL,
  `title` varchar(100) NOT NULL,
  `email` varchar(100) NOT NULL,
  `active` tinyint(1) NOT NULL,
  `user_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `data_influencer_fbfc09f1` (`user_id`),
  KEY `data_influencer_name` (`name`),
  CONSTRAINT `user_id_refs_id_b1bb5d4f` FOREIGN KEY (`user_id`) REFERENCES `auth_user` (`id`)
)

This is the SQL statement I’m using to pull the keywords based on a time range, group them, and order them by frequency:

SELECT dk.id, dk.term as term, COUNT(dk.id) as count
FROM data_keyterm dk
INNER JOIN data_article_key_terms dakt ON dakt.keyterm_id = dk.id
INNER JOIN data_article da ON da.id = dakt.article_id
INNER JOIN data_influencer di ON di.id = da.influencer_id
WHERE da.user_id = 1
AND da.date between '2010-08-07' AND '2012-08-07'
AND di.active = True
GROUP BY dk.id
ORDER BY count DESC
LIMIT 15;
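
One thing that may be worth checking before changing the design: the query filters data_article on both user_id and date, but the schema above only has separate single-column indexes on those columns. The following is a minimal sketch, assuming a composite index helps on this data. The index name data_article_user_date_infl is hypothetical, and the actual gain depends on the data distribution, so it is worth comparing the EXPLAIN output before and after.

```sql
-- Hypothetical composite index: equality column first, then the range column.
-- influencer_id is appended so the data_article part of the query can be
-- answered from the index alone; InnoDB secondary indexes already carry the
-- primary key (id).
ALTER TABLE data_article
  ADD INDEX data_article_user_date_infl (user_id, `date`, influencer_id);

-- Aggregate on the join table first and look up the term text only for the
-- 15 winning rows. The counts should match the original query, although
-- terms with equal counts may come back in a different order.
EXPLAIN
SELECT dk.id, dk.term, t.count
FROM (
  SELECT dakt.keyterm_id, COUNT(*) AS count
  FROM data_article da
  INNER JOIN data_influencer di ON di.id = da.influencer_id
  INNER JOIN data_article_key_terms dakt ON dakt.article_id = da.id
  WHERE da.user_id = 1
    AND da.date BETWEEN '2010-08-07' AND '2012-08-07'
    AND di.active = True
  GROUP BY dakt.keyterm_id
  ORDER BY count DESC
  LIMIT 15
) t
INNER JOIN data_keyterm dk ON dk.id = t.keyterm_id
ORDER BY t.count DESC;
```

Removing the EXPLAIN keyword runs the query itself. The rewrite relies on the foreign key from data_article_key_terms.keyterm_id to data_keyterm.id, which guarantees that every grouped keyterm_id has a matching term row.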

1 Answer

  1. Editorial Team
    Added an answer on June 9, 2026 at 5:13 am

    Running this query against a table with 900,000 records and three inner joins will take some time to execute. I think you should try an external search engine such as Solr to get the results quickly.
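
If an external engine is more than the problem needs, the pre-aggregation idea mentioned in the question can be kept manageable by storing counts per day instead of per fixed range, because daily rows can be summed for any arbitrary date range. What follows is a rough sketch, assuming a hypothetical table data_keyterm_daily that is refreshed periodically. The table name, its columns, and the refresh strategy are illustrative and not part of the original schema. It only pays off when many articles share the same user, day, and influencer.

```sql
-- Hypothetical rollup table: one row per user, day, influencer, and key term.
-- influencer_id is kept so the "active" filter can still be applied at query
-- time instead of being frozen into the stored counts.
CREATE TABLE data_keyterm_daily (
  user_id int(11) NOT NULL,
  `date` date NOT NULL,
  influencer_id int(11) NOT NULL,
  keyterm_id int(11) NOT NULL,
  article_count int(11) NOT NULL,
  PRIMARY KEY (user_id, `date`, influencer_id, keyterm_id)
) ENGINE=InnoDB;

-- Refresh step (run from a scheduled job, for example). It inserts new rows
-- and overwrites existing counts; it does not delete rows whose count has
-- dropped to zero, so a full rebuild would need to empty the table first.
INSERT INTO data_keyterm_daily
  (user_id, `date`, influencer_id, keyterm_id, article_count)
SELECT da.user_id, da.date, da.influencer_id, dakt.keyterm_id, COUNT(*)
FROM data_article da
INNER JOIN data_article_key_terms dakt ON dakt.article_id = da.id
GROUP BY da.user_id, da.date, da.influencer_id, dakt.keyterm_id
ON DUPLICATE KEY UPDATE article_count = VALUES(article_count);

-- Top 15 key terms for an arbitrary range, summed from the daily rows.
SELECT dk.id, dk.term, SUM(d.article_count) AS count
FROM data_keyterm_daily d
INNER JOIN data_influencer di ON di.id = d.influencer_id
INNER JOIN data_keyterm dk ON dk.id = d.keyterm_id
WHERE d.user_id = 1
  AND d.date BETWEEN '2010-08-07' AND '2012-08-07'
  AND di.active = True
GROUP BY dk.id, dk.term
ORDER BY count DESC
LIMIT 15;
```

Summing daily counts gives the same totals as counting rows in data_article_key_terms because each article has exactly one date and the unique key on (article_id, keyterm_id) prevents a term from being counted twice for one article.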
