The Archive Base

Editorial Team
Asked: May 13, 2026

What would be the performance penalty of using strings as primary keys instead of bigints?

What would be the performance penalty of using strings as primary keys instead of bigints etc.? String comparison is much more expensive than integer comparison, but on the other hand I can imagine that internally a DBMS will compute hash keys to reduce the penalty.

An application that I work on uses strings as primary keys in several tables (MySQL). Changing this is not trivial, and I'd like to know what could be gained performance-wise to justify the work.

1 Answer

  1. Editorial Team
    Added an answer on May 13, 2026 at 4:46 pm

    "on the other hand I can imagine that internally a DBMS will compute hash keys to reduce the penalty."

    The DB needs to maintain a B-tree (or a similar structure) over the keys so that they are kept in order.

    If the key were hashed and the hash stored in the B-tree, checking uniqueness would still be fast, and an exact key could still be looked up efficiently. But you would not be able to search efficiently for a range of data (e.g. with LIKE), because the B-tree would no longer be ordered by the string value.
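To make the ordering argument concrete, here is a minimal Python sketch. It is not how MySQL implements its indexes: a sorted list stands in for the B-tree, SHA-256 stands in for whatever hash a DBMS might use, and the keys and the helper names (`prefix_range`, `hashed_lookup`, `hashed_prefix_scan`) are made up for illustration.

```python
import bisect
import hashlib

# Hypothetical string primary keys, used only for illustration.
keys = ["apple", "apricot", "banana", "blueberry", "cherry", "apex"]


def prefix_range(sorted_keys, prefix):
    """Find all keys starting with `prefix` using two binary searches.

    Assumes no key contains a character above U+FFFF right after the prefix.
    """
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = bisect.bisect_left(sorted_keys, prefix + "\uffff")
    return sorted_keys[lo:hi]


def digest(key):
    """Stand-in hash function (an assumption, not what any DBMS uses)."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def hashed_lookup(sorted_pairs, key):
    """Exact-match lookup in an index ordered by hash: still a binary search."""
    target = digest(key)
    i = bisect.bisect_left(sorted_pairs, (target,))
    if i < len(sorted_pairs) and sorted_pairs[i][0] == target:
        return sorted_pairs[i][1]
    return None


def hashed_prefix_scan(sorted_pairs, prefix):
    """Prefix search in an index ordered by hash: every entry must be checked."""
    return sorted(k for _, k in sorted_pairs if k.startswith(prefix))


# Index ordered by the string itself: prefix matches are adjacent.
ordered = sorted(keys)
matches = prefix_range(ordered, "ap")

# Index ordered by hash: exact lookups work, prefix neighbours are scattered.
hashed = sorted((digest(k), k) for k in keys)
found = hashed_lookup(hashed, "banana")
scanned = hashed_prefix_scan(hashed, "ap")
```

Both approaches return the same keys for the prefix "ap", but the ordered index gets there with two binary searches, while the hash-ordered one has to visit every entry.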

    So I think most databases really do store the string itself in the B-tree, which can (1) take more space than numeric values and (2) require the B-tree to be rebalanced more often when keys are inserted in arbitrary order (string keys have no notion of a steadily increasing value, unlike a numeric PK).
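The insertion-order point can be illustrated with a small simulation. This is only a sketch: a sorted Python list stands in for the index, the random 12-letter strings are an assumed stand-in for arbitrary string keys, and `count_appends` is a hypothetical helper. It counts how many inserts land at the very end of the index, which is the cheap case for an ordered structure, versus somewhere in the middle.

```python
import bisect
import random


def count_appends(keys):
    """Insert keys one by one into a sorted list.

    Returns how many inserts landed at the end of the list (appends).
    All other inserts had to go somewhere in the middle.
    """
    index = []
    appends = 0
    for key in keys:
        pos = bisect.bisect_right(index, key)
        if pos == len(index):
            appends += 1
        index.insert(pos, key)
    return appends


n = 1000

# Increasing numeric keys, like an auto-increment column.
sequential_ids = list(range(1, n + 1))

# Arbitrary string keys (seeded so the run is reproducible).
rng = random.Random(42)
alphabet = "abcdefghijklmnopqrstuvwxyz"
random_strings = ["".join(rng.choice(alphabet) for _ in range(12)) for _ in range(n)]

seq_appends = count_appends(sequential_ids)
str_appends = count_appends(random_strings)

print(f"sequential ids: {seq_appends}/{n} inserts were appends")
print(f"random strings: {str_appends}/{n} inserts were appends")
```

Every sequential insert goes to the end, while almost all of the random string inserts land in the middle of the structure. In a real B-tree, middle inserts are the ones that tend to touch scattered pages and trigger splits, though how costly that is depends on the engine.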

    The penalty in practice can range from insignificant to huge. It all depends on the usage, the number of rows, the average size of the string key, the queries that join tables, etc.
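For the space side, a back-of-the-envelope estimate can help decide whether the migration is worth it. The sketch below only counts raw key bytes and ignores page and row overhead. It assumes the tables use InnoDB, where each secondary index entry also carries the primary key value. The row count, average key length and number of secondary indexes are made-up example figures, not measurements.

```python
BIGINT_BYTES = 8  # storage size of a MySQL BIGINT


def varchar_key_bytes(avg_len_bytes, max_len_bytes=255):
    """Approximate stored size of one VARCHAR value: data plus length prefix.

    MySQL uses a 1-byte length prefix when the column can hold at most
    255 bytes, and a 2-byte prefix otherwise.
    """
    prefix = 1 if max_len_bytes <= 255 else 2
    return avg_len_bytes + prefix


def raw_key_bytes(rows, bytes_per_key, copies=1):
    """Raw bytes spent on key values, with `copies` stored copies per row."""
    return rows * bytes_per_key * copies


# Hypothetical figures: 10 million rows, 36-byte string keys,
# and 2 secondary indexes (so 3 copies of each primary key value).
rows = 10_000_000
copies = 1 + 2

bigint_total = raw_key_bytes(rows, BIGINT_BYTES, copies)
string_total = raw_key_bytes(rows, varchar_key_bytes(36), copies)

print(f"bigint keys: {bigint_total / 1e6:.0f} MB")
print(f"string keys: {string_total / 1e6:.0f} MB")
```

Plugging in the real row counts and key lengths from the application gives a first idea of how much index space, and therefore buffer pool memory, the string keys cost. Actual measurements on a copy of the data are still the better guide.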
