The Archive Base

Editorial Team
Asked: May 11, 2026

Let’s say I have a large database of products organized into groups. Say there are 5 groups, each with 100,000 products. The product IDs are random integers (and so are the group IDs).

I need to find a product in a specific group. My question is which primary key is more efficient:

  1. (sid, pid)
  2. (pid, sid)

(sid, pid) is the intuitive order, but when searching in this order MySQL would seem to have to isolate 100,000 of the 500,000 rows first and then find a single value among those 100,000. On the other hand, (pid, sid) sounds more efficient to me, since MySQL would not have to build the large 100,000-row group in the first stage but could go directly to the right item (or up to 5 items, if the same pid appears in different groups).

Is #2 indeed faster?

UPDATE:
OK. I made two copies of a real table: items0 has primary key (sid, pid) and items1 has primary key (pid, sid).

Results of the queries:

explain select * from items0 where sid = 22746 and pid = 2109418034
1, 'SIMPLE', 'items0', 'ref', 'PRIMARY', 'PRIMARY', '8', 'const,const', 14, ''

explain select * from items1 where sid = 22746 and pid = 2109418034
1, 'SIMPLE', 'items1', 'ref', 'PRIMARY', 'PRIMARY', '8', 'const,const', 11, ''

Yet another update:
I also added both keys to the same table and ran EXPLAIN. I got this:
(PRIMARY starts with sid, pid1; index_2 starts with pid1, sid)

1, 'SIMPLE', 'items', 'ref', 'PRIMARY,index_2', 'index_2', '8', 'const,const', 13, ''

I’m not sure what conclusions I can draw from this test.

1 Answer

  1. Editorial Team
    Added an answer on May 11, 2026 at 10:15 pm

    The performance of a query in a SQL DBMS depends heavily on a large number of factors: how fragmented the table (or index) is, how fresh and how complete the data and index statistics are, the size of the data caches, the available CPU and memory, how many rows are in the table, how the query is constructed, and so on.

    Although profiling queries is a necessary part of performance tuning, it is not sufficient on its own; it must be part of a larger query optimization strategy. Saying “test it and see” is not very helpful (and in my opinion sometimes dangerous!) in the general case, because the query optimization process is non-deterministic. One day a query can run just fine and the next day it can be slow (or vice versa).

    Without an understanding of the fundamentals of MySQL index construction, of which queries will be run, and of how those queries will use indexes, any ad hoc tests are at best lucky guesses and at worst ticking time bombs.

    In this case there IS a rule of thumb, which follows from how MySQL B-trees are constructed. From the MySQL internals page ( http://forge.mysql.com/wiki/MySQL_Internals_MyISAM#The_.MYI_file ) you can see that for a non-unique BTREE index on two columns, MySQL stores the concatenated values in the order that you specify. That specific example stores ASCII (or Unicode) values, but for integer values it will do something similar (open a hex editor and decode the actual values if you are intrepid enough!). This is also referenced here: http://dev.mysql.com/doc/refman/5.0/en/multiple-column-indexes.html
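As a rough illustration of "concatenated values in the order that you specify", a two-column index can be modeled as a sorted list of tuples. This is only a minimal sketch in Python, not MySQL's actual on-disk format; the row values and the `lookup` helper are made up for the example.

```python
from bisect import bisect_left

# Hypothetical rows as (sid, pid) pairs; the values are made up for illustration.
rows = [(3, 70), (1, 42), (2, 42), (1, 7), (3, 5), (2, 99)]

# A two-column index keeps its entries sorted by the concatenated key,
# in the column order given when the index was declared.
index_sid_pid = sorted(rows)                              # ordered by (sid, pid)
index_pid_sid = sorted((pid, sid) for sid, pid in rows)   # ordered by (pid, sid)


def lookup(index, key):
    """Binary-search a sorted list of tuples for an exact two-column key."""
    pos = bisect_left(index, key)
    return pos < len(index) and index[pos] == key


print(index_sid_pid)
print(index_pid_sid)
print(lookup(index_sid_pid, (2, 42)), lookup(index_pid_sid, (42, 2)))
```

The same rows appear in both lists; only the sort order differs, which is what the column order in the index declaration controls.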

    So, the rule of thumb is to put the most selective value first ( ref http://www.akadia.com/services/ora_index_selectivity.html ), because that gives the query processor the most information for narrowing down the number of rows to be processed. Placing a less selective key FIRST will force the optimizer to consider more rows and, unless that is EXACTLY what you want, will be suboptimal by design.
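Selectivity here means the number of distinct values divided by the number of rows. The sketch below estimates it for a scaled-down stand-in for the question's table (5 groups of 1,000 products rather than 100,000). The data and the `selectivity` helper are assumptions for illustration only, and the pids are generated so that each one is unique.

```python
def selectivity(values):
    """Distinct values divided by total rows; 1.0 means every value is unique."""
    values = list(values)
    return len(set(values)) / len(values)


# Scaled-down, made-up data: 5 groups (sid 1..5) of 1,000 products each.
# Each pid is constructed to be unique across the whole table.
rows = [(sid, sid * 10_000 + n) for sid in range(1, 6) for n in range(1_000)]

sid_selectivity = selectivity(sid for sid, _ in rows)
pid_selectivity = selectivity(pid for _, pid in rows)

print(sid_selectivity, pid_selectivity)
```

With only 5 distinct group ids, sid is far less selective than pid in this data set, which is the situation the rule of thumb is about.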

    Also, to piggyback on what Eric said: MySQL (or any other DBMS) can use the leading columns of a key, in order, to help narrow down the search. For example, if you place an index on (A, B, C), then queries with WHERE A = ... AND B = ... can use it (depending), queries with WHERE A = ... can use it, but queries with only WHERE C = ... cannot (usually).
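The leading-column behavior can be observed with a small experiment. The sketch below uses SQLite (through Python's built-in `sqlite3` module) as a stand-in, because it runs without a server. MySQL's optimizer and its EXPLAIN output differ, so treat this as an analogy rather than a statement about MySQL itself. The table `t`, the index `idx_abc`, and the `plan` helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("CREATE INDEX idx_abc ON t (a, b, c)")


def plan(where_clause):
    """Return SQLite's plan description for a query with the given WHERE clause."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN SELECT * FROM t WHERE {where_clause}")
    # The human-readable description is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


plan_a = plan("a = 1")             # leading column only
plan_ab = plan("a = 1 AND b = 2")  # leading two columns
plan_c = plan("c = 3")             # trailing column only

for p in (plan_a, plan_ab, plan_c):
    print(p)
```

In SQLite's plan output, SEARCH means the index was used to seek directly to the matching entries, while SCAN means every entry had to be visited. The first two queries should report a SEARCH on `idx_abc`; the query on `c` alone should report a SCAN.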

    So it also depends on the nature of your queries. If you always ask for WHERE pid = ... AND sid = ..., then the most selective column (the product ID) should go first, but if you often ask for WHERE sid = XXXX by itself, then sid should go first (OR just create another index for that situation if the mix of queries varies). The trade-off here is time versus space: an additional index will satisfy a different class of queries at the expense of additional disk space and increased write I/O.

    Finally, if you are using InnoDB, the primary key acts as a “clustered” index that actually orders the rows on disk (MyISAM tables are basically heaps). If you cluster the rows on disk by (sid, pid), rows from the same group are stored together, so you can fetch entire BLOCKS (or pages) of products at a time, which uses vastly less I/O than a BTREE lookup alone ( ref http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/ ).
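To get a feel for why clustering by (sid, pid) helps when a whole group is fetched, the sketch below counts how many fixed-size pages contain at least one row of a given group under each ordering. It is a toy model in Python with made-up data and an arbitrary `PAGE_SIZE`, not a description of InnoDB's real page layout.

```python
PAGE_SIZE = 100  # rows per page; an arbitrary figure for illustration

# Scaled-down, made-up data: 5 groups of 1,000 products. The pid values
# interleave across groups, mimicking ids unrelated to the group.
rows = [(sid, n * 5 + sid) for sid in range(1, 6) for n in range(1_000)]


def pages_touched(sorted_rows, wanted_sid):
    """Count distinct pages holding at least one row of the wanted group."""
    return len({pos // PAGE_SIZE
                for pos, (sid, _) in enumerate(sorted_rows)
                if sid == wanted_sid})


by_sid_pid = sorted(rows)                              # clustered on (sid, pid)
by_pid_sid = sorted(rows, key=lambda r: (r[1], r[0]))  # clustered on (pid, sid)

print(pages_touched(by_sid_pid, 3), pages_touched(by_pid_sid, 3))
```

In this toy model, reading all of group 3 touches 10 pages when rows are ordered by (sid, pid) and all 50 pages when they are ordered by (pid, sid). This matters for queries that read a whole group; a single-row lookup touches one data page either way.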

    So you can see why “test it and see” is useful, but without an understanding of MySQL index fundamentals you miss out on a whole class of optimizations.
