I’m having a hard time figuring how to query/index a database. The situation is

Asked: June 17, 2026 (2026-06-17T06:33:52+00:00)

I’m having a hard time figuring out how to query/index a database.

The situation is pretty simple. Each time a user visits a category, his/her visit date is stored. My goal is to list the categories in which elements have been added after the user’s latest visit.

Here are the two tables:

CREATE TABLE `elements` (
  `category_id` int(11) NOT NULL,
  `element_id` int(11) NOT NULL,
  `title` varchar(255) NOT NULL,
  `added_date` datetime NOT NULL,
  PRIMARY KEY (`category_id`,`element_id`),
  KEY `index_element_id` (`element_id`)
);

CREATE TABLE `categories_views` (
  `member_id` int(11) NOT NULL,
  `category_id` int(11) NOT NULL,
  `view_date` datetime NOT NULL,
  PRIMARY KEY (`member_id`,`category_id`),
  KEY `index_element_id` (`category_id`)
);

Query:

SELECT
    categories_views.*,
    elements.category_id
FROM
    elements
    INNER JOIN categories_views ON (categories_views.category_id = elements.category_id)
WHERE
    categories_views.member_id = 1
    AND elements.added_date > categories_views.view_date
GROUP BY elements.category_id

Explained:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: elements
         type: ALL
possible_keys: PRIMARY
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 89057
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: categories_views
         type: eq_ref
possible_keys: PRIMARY,index_element_id
          key: PRIMARY
      key_len: 8
          ref: const,convert.elements.category_id
         rows: 1
        Extra: Using where

With about 100k rows in each table, the query is taking around 0.3s, which is too long for something that should be executed for every user action in a web context.

If possible, what indexes should I add, or how should I rewrite this query in order to avoid using filesorts and temporary tables?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T06:33:54+00:00

If each member has a relatively small number of rows in categories_views, I suggest testing a different query:

SELECT v.*
  FROM categories_views v
 WHERE v.member_id = 1
   AND EXISTS 
       ( SELECT 1
           FROM elements e
          WHERE e.category_id = v.category_id
            AND e.added_date > v.view_date
       )

For optimum performance of that query, you’d want to ensure you had indexes:

... ON elements (category_id, added_date)

... ON categories_views (member_id, category_id)

NOTE: It looks like the primary key on the categories_views table may be (member_id, category_id), which means an appropriate index already exists.
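
As a rough sketch, the two index suggestions above could be written out as follows. The index names are hypothetical (the answer elides them), and the second statement is left commented out on the assumption that the primary key shown in the question already covers (member_id, category_id).

```sql
-- Index names are hypothetical; pick names that fit your own conventions.

-- Lets the EXISTS subquery seek on category_id and then range-scan added_date.
CREATE INDEX idx_elements_category_added
    ON elements (category_id, added_date);

-- Only needed if (member_id, category_id) is NOT already the primary key:
-- CREATE INDEX idx_views_member_category
--     ON categories_views (member_id, category_id);
```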

I’m assuming, as best I can tell from the original query, that the categories_views table contains only the “latest” view of each category for a user, that is, that (member_id, category_id) is unique. That has to be the case if the original query returns a correct result set, i.e. only the categories that have had “new” elements added since the user’s “last view” of that category. Otherwise, any “older” view_date value in the categories_views table would trigger the inclusion of the category, even if there were a newer view_date later than the latest element (MAX(added_date)) in that category.

If that’s not the case, i.e. (member_id,category_id) is not unique, then the query would need to be changed.
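
One possible shape for that changed query, assuming a member can have several rows per category and only the most recent view_date should count. The alias lv and the column alias last_view_date are made up for illustration.

```sql
-- Sketch for the non-unique case: reduce each category to the member's
-- latest view first, then test for newer elements against that date only.
SELECT lv.category_id,
       lv.last_view_date
  FROM ( SELECT v.category_id,
                MAX(v.view_date) AS last_view_date
           FROM categories_views v
          WHERE v.member_id = 1
          GROUP BY v.category_id
       ) lv
 WHERE EXISTS
       ( SELECT 1
           FROM elements e
          WHERE e.category_id = lv.category_id
            AND e.added_date > lv.last_view_date
       );
```

With the schema as posted in the question, (member_id, category_id) is the primary key, so this variant should not be needed there.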

The query in the original question is a bit puzzling: it references element_views as a table name or table alias, but that name doesn’t appear in the EXPLAIN output. I’m going under the assumption that element_views is meant to be a synonym for categories_views.

For the original query, add a covering index on the elements table:

 ... ON elements (category_id, added_date)

The goal there is to get the EXPLAIN output to show “Using index” for the elements table.
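
To check whether that goal is met, the original query can be run through EXPLAIN again once the index is in place. This sketch assumes an index on elements (category_id, added_date) has already been created; whether the optimizer actually picks it depends on your data and MySQL version.

```sql
-- Re-check the plan after adding the index on elements (category_id, added_date).
-- Look at the Extra column of the "elements" row for "Using index".
EXPLAIN
SELECT
    categories_views.*,
    elements.category_id
FROM
    elements
    INNER JOIN categories_views ON (categories_views.category_id = elements.category_id)
WHERE
    categories_views.member_id = 1
    AND elements.added_date > categories_views.view_date
GROUP BY elements.category_id;
```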

You might also try adding an index:

 ... ON categories_views (member_id, category_id, view_date)
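
Written out in full, that index might look like the sketch below. The index name is hypothetical, and the third column is assumed to be view_date, since categories_views has no added_date column in the schema from the question.

```sql
-- Hypothetical index name; third column assumed to be view_date.
-- The table has only these three columns, so this index contains
-- every column the query selects from categories_views.
CREATE INDEX idx_views_member_category_viewdate
    ON categories_views (member_id, category_id, view_date);
```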

To get all the columns from the categories_views table (for the select list), the query is going to have to visit the pages in the table, unless there’s an index that contains all of those columns. The goal would be to reduce the number of rows that need to be visited on data pages to find the row, by having all (or most) of the predicates satisfied from the index.

Is it necessary to return the category_id column from the elements table? Don’t we already know that this is the same value as in the category_id column from the categories_views table, due to the inner join predicate?
