Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 9157403
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 17, 20262026-06-17T13:00:30+00:00 2026-06-17T13:00:30+00:00

I have a relatively large 4-deep relational data setup like this: ClientApplication has_many =>

  • 0

I have a relatively large 4-deep relational data setup like this:

ClientApplication         has_many => ClientApplicationVersions
ClientApplicationVersions has_many => CloudLogs
CloudLogs                 has_many => Logs

client_applications: (potentially 1,000’s of records)
   – ...
   – account_id
   – public_key
   – deleted_at

client_application_versions: (potentially 10,000’s of records)
   – ...
   – client_application_id
   – public_key
   – deleted_at

cloud_logs: (potentially 1,000,000’s of records)
   – …
   – client_application_version_id
   – public_key
   – deleted_at

logs: (potentially 1,000,000,000’s of records)
   – ...
   – cloud_log_id
   – public_key
   – time_stamp
   – deleted_at


I am still in development so the structure and setup is not set in stone, but I hope it is setup ok. Using Rails 3.2.11 and InnoDB MySQL. The database is currently filled with a small (compared to the eventual db size) set of data (logs only has 500,000 rows) I have 4 scoped queries, 3 of which are problematic, to retrieve logs.

  1. Grab first page of logs, ordered by timestamp, limited by account_id, client_application.public_key, client_application_version.public_key (Over 100 seconds)
  2. Grab first page of logs, ordered by timestamp, limited by account_id, client_application.public_key (Over 100 seconds)
  3. Grab first page of logs, ordered by timestamp, limited by account_id (Over 100 seconds)
  4. Grab first page of logs, ordered by timestamp (~2 seconds)

I am using rails scopes to help make these calls:

  scope :account_id, proc {|account_id| joins(:client_application).where("client_applications.account_id = ?", account_id) }
  scope :client_application_key, proc {|client_application_key| joins(:client_application).where("client_applications.public_key = ?", client_application_key) }
  scope :client_application_version_key, proc {|client_application_version_key| joins(:client_application_version).where("client_application_versions.public_key = ?", client_application_version_key) }

  default_scope order('logs.timestamp DESC')

I have indices on each table on public_key. I have several indices on the logs table including the one that the optimizer prefers to use (index_logs_on_cloud_log_id), but the queries are still taking eons to run.


Here is how I am calling the method in rails console:

Log.account_id(1).client_application_key('p0kZudG0').client_application_version_key('0HgoJRyE').page(1)

… here is what rails turns it into:

SELECT `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `cloud_logs`.`id` = `logs`.`cloud_log_id` INNER JOIN `client_application_versions` ON `client_application_versions`.`id` = `cloud_logs`.`client_application_version_id` INNER JOIN `client_applications` ON `client_applications`.`id` = `client_application_versions`.`client_application_id` INNER JOIN `cloud_logs` `cloud_logs_logs_join` ON `cloud_logs_logs_join`.`id` = `logs`.`cloud_log_id` INNER JOIN `client_application_versions` `client_application_versions_logs` ON `client_application_versions_logs`.`id` = `cloud_logs_logs_join`.`client_application_version_id` WHERE (logs.deleted_at IS NULL) AND (client_applications.account_id = 1) AND (client_applications.public_key = 'p0kZudG0') AND (client_application_versions.public_key = '0HgoJRyE') ORDER BY logs.timestamp DESC LIMIT 100 OFFSET 0

… and here is the EXPLAIN statement for that query.

+----+-------------+----------------------------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+------------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table                            | type   | possible_keys                                                                                                                                         | key                                               | key_len | ref                                                                    | rows | Extra                                        |
+----+-------------+----------------------------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+------------------------------------------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | client_application_versions      | ref    | PRIMARY,index_client_application_versions_on_client_application_id,index_client_application_versions_on_public_key                                    | index_client_application_versions_on_public_key   | 768     | const                                                                  |    1 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | client_applications              | eq_ref | PRIMARY,index_client_applications_on_account_id,index_client_applications_on_public_key                                                               | PRIMARY                                           | 4       | cloudlog_production.client_application_versions.client_application_id  |    1 | Using where                                  |
|  1 | SIMPLE      | cloud_logs                       | ref    | PRIMARY,index_cloud_logs_on_client_application_version_id                                                                                             | index_cloud_logs_on_client_application_version_id | 5       | cloudlog_production.client_application_versions.id                     |  481 | Using where; Using index                     |
|  1 | SIMPLE      | cloud_logs_logs_join             | eq_ref | PRIMARY,index_cloud_logs_on_client_application_version_id                                                                                             | PRIMARY                                           | 4       | cloudlog_production.cloud_logs.id                                      |    1 |                                              |
|  1 | SIMPLE      | client_application_versions_logs | eq_ref | PRIMARY                                                                                                                                               | PRIMARY                                           | 4       | cloudlog_production.cloud_logs_logs_join.client_application_version_id |    1 | Using index                                  |
|  1 | SIMPLE      | logs                             | ref    | index_logs_on_cloud_log_id_and_deleted_at_and_timestamp,index_logs_on_cloud_log_id_and_deleted_at,index_logs_on_cloud_log_id,index_logs_on_deleted_at | index_logs_on_cloud_log_id                        | 5       | cloudlog_production.cloud_logs.id                                      |    4 | Using where                                  |
+----+-------------+----------------------------------+--------+-------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------------------------+---------+------------------------------------------------------------------------+------+----------------------------------------------+

This question has 3 parts to it:

  1. Can I optimize my DB with additional indices to help these type of join-dependent sort queries become more performant?
  2. Can I optimize the rails code to help this type of find run in a more performant way?
  3. Am I simply approaching this scoped find the wrong way for large datasets?

UPDATE 1/24/12
As suggested by Geoff and J_MCCaffrey in the answers, I have split the query up into 3 different sections to try and isolate the problem. As expected, it is a problem dealing with the largest table. The MYSQL optimizer handles this differently by using different indices, but the delay persists. Here is the EXPLAIN for this approach.

ClientApplication.find_by_account_id_and_public_key(1, 'p0kZudG0').versions.select{|cav| cav.public_key = '0HgoJRyE'}.first.logs.page(2)
  ClientApplication Load (165.9ms)  SELECT `client_applications`.* FROM `client_applications` WHERE `client_applications`.`account_id` = 1 AND `client_applications`.`public_key` = 'p0kZudG0' AND (client_applications.deleted_at IS NULL) ORDER BY client_applications.id LIMIT 1
  ClientApplicationVersion Load (105.1ms)  SELECT `client_application_versions`.* FROM `client_application_versions` WHERE `client_application_versions`.`client_application_id` = 3 AND (client_application_versions.deleted_at IS NULL) ORDER BY client_application_versions.created_at DESC, client_application_versions.id DESC
  Log Load (57295.0ms)  SELECT `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `logs`.`cloud_log_id` = `cloud_logs`.`id` WHERE `cloud_logs`.`client_application_version_id` = 49 AND (logs.deleted_at IS NULL) AND (cloud_logs.deleted_at IS NULL) ORDER BY logs.timestamp DESC, cloud_logs.received_at DESC LIMIT 100 OFFSET 100
  EXPLAIN (214.5ms)  EXPLAIN SELECT `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `logs`.`cloud_log_id` = `cloud_logs`.`id` WHERE `cloud_logs`.`client_application_version_id` = 49 AND (logs.deleted_at IS NULL) AND (cloud_logs.deleted_at IS NULL) ORDER BY logs.timestamp DESC, cloud_logs.received_at DESC LIMIT 100 OFFSET 100
EXPLAIN for: SELECT  `logs`.* FROM `logs` INNER JOIN `cloud_logs` ON `logs`.`cloud_log_id` = `cloud_logs`.`id` WHERE `cloud_logs`.`client_application_version_id` = 49 AND (logs.deleted_at IS NULL) AND (cloud_logs.deleted_at IS NULL) ORDER BY logs.timestamp DESC, cloud_logs.received_at DESC LIMIT 100 OFFSET 100
+----+-------------+------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+---------+-----------------------------------+------+-------------------------------------------------------------------------------------------------------------------------------------------------+
| id | select_type | table      | type        | possible_keys                                                                                                                                         | key                                                                              | key_len | ref                               | rows | Extra                                                                                                                                           |
+----+-------------+------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+---------+-----------------------------------+------+-------------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | SIMPLE      | cloud_logs | index_merge | PRIMARY,index_cloud_logs_on_client_application_version_id,index_cloud_logs_on_deleted_at                                                              | index_cloud_logs_on_client_application_version_id,index_cloud_logs_on_deleted_at | 5,9     | NULL                              | 1874 | Using intersect(index_cloud_logs_on_client_application_version_id,index_cloud_logs_on_deleted_at); Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | logs       | ref         | index_logs_on_cloud_log_id_and_deleted_at_and_timestamp,index_logs_on_cloud_log_id_and_deleted_at,index_logs_on_cloud_log_id,index_logs_on_deleted_at | index_logs_on_cloud_log_id                                                       | 5       | cloudlog_production.cloud_logs.id |    4 | Using where                                                                                                                                     |
+----+-------------+------------+-------------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------+---------+-----------------------------------+------+-------------------------------------------------------------------------------------------------------------------------------------------------+

UPDATE 1/25/12
Here are the indices for all relevant tables:

CLIENT_APPLICATIONS:
  PRIMARY KEY  (`id`),
  UNIQUE KEY `index_client_applications_on_key` (`key`),
  KEY `index_client_applications_on_account_id` (`account_id`),
  KEY `index_client_applications_on_deleted_at` (`deleted_at`),
  KEY `index_client_applications_on_public_key` (`public_key`)

CLIENT_APPLICATION_VERSIONS:
  PRIMARY KEY  (`id`),
  KEY `index_client_application_versions_on_client_application_id` (`client_application_id`),
  KEY `index_client_application_versions_on_deleted_at` (`deleted_at`),
  KEY `index_client_application_versions_on_public_key` (`public_key`)

CLOUD_LOGS:
  PRIMARY KEY  (`id`),
  KEY `index_cloud_logs_on_api_client_version_id` (`api_client_version_id`),
  KEY `index_cloud_logs_on_client_application_version_id` (`client_application_version_id`),
  KEY `index_cloud_logs_on_deleted_at` (`deleted_at`),
  KEY `index_cloud_logs_on_device_id` (`device_id`),
  KEY `index_cloud_logs_on_public_key` (`public_key`),
  KEY `index_cloud_logs_on_received_at` (`received_at`)

LOGS:
  PRIMARY KEY  (`id`),
  KEY `index_logs_on_class_name` (`class_name`),
  KEY `index_logs_on_cloud_log_id_and_deleted_at_and_timestamp` (`cloud_log_id`,`deleted_at`,`timestamp`),
  KEY `index_logs_on_cloud_log_id_and_deleted_at` (`cloud_log_id`,`deleted_at`),
  KEY `index_logs_on_cloud_log_id` (`cloud_log_id`),
  KEY `index_logs_on_deleted_at` (`deleted_at`),
  KEY `index_logs_on_file_name` (`file_name`),
  KEY `index_logs_on_method_name` (`method_name`),
  KEY `index_logs_on_public_key` (`public_key`),
  KEY `index_logs_on_timestamp` USING BTREE (`timestamp`)
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-17T13:00:32+00:00Added an answer on June 17, 2026 at 1:00 pm

    I am writing this a possible solution to my own question, in the hopes that a better answer will come by. Currently, the DB is setup completely and by-the-book relational.

    ClientApplication         has_many => ClientApplicationVersions
    ClientApplicationVersions has_many => CloudLogs
    CloudLogs                 has_many => Logs
    

    This means that when I need to find Logs that belong to a Client Application, I have to do 3 extra joins just to get it. By introducing some foreign_key denormalization to the Logs table, I can skip all the joins:

    ClientApplication         has_many => ClientApplicationVersions
    ClientApplication         has_many => Logs
    ClientApplicationVersions has_many => CloudLogs
    ClientApplicationVersions has_many => Logs
    CloudLogs                 has_many => Logs
    

    The end result is that I would have a few extra columns in my Logs table: client_application_key, client_application_version_key, and cloud_log_key.

    Although I risk inconsistent data, I can avoid the 3 joins here that are helping to kill the performance of the queries. Somebody please talk me out of this.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have a relatively large table (5,208,387 rows, 400mb data/670mb index), all columns i
I have this relatively large numerical application code that may run for a few
I am relatively new to cassandra and its data model. I have a large
I have a relatively large set of data that lends itself very naturally to
I have this relatively large table in a separate filegroup (2 GB, well, it's
I have some relatively large legacy method that I would like to refactor. It
I have a relatively large database (130.000+ rows) of weather data, which is accumulating
I have quite a relatively large number of data, it has 80 columns and
I have a relatively large dictionary in Python and would like to be able
I have a relatively large csv/text data file (33mb) that I need to do

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.