The Archive Base


Editorial Team
Asked: June 16, 2026 (2026-06-16T00:21:30+00:00)


I have a static database of ~60,000 rows. There is a certain column for which there are ~30,000 unique entries. Given that ratio (60,000 rows/30,000 unique entries in a certain column), is it worth creating a new table with those entries in it, and linking to it from the main table? Or is that going to be more trouble than it’s worth?

To put the question in a more concrete way: Will I gain a lot more efficiency by separating out this field into its own table?

** UPDATE **

We’re talking about a VARCHAR(100) field, but in reality, I doubt any of the entries use that much space, so I could most likely trim it down to VARCHAR(50). Example entries: “The Gas Patch and Little Canada” and “Kora Temple Masonic Bldg. George Coombs”


1 Answer

  1. Editorial Team
    Added an answer on June 16, 2026 at 12:21 am

    If the field is a VARCHAR(255) that normally contains about 30 characters, and the alternative is to store a 4-byte integer in the main table and use a second table with a 4-byte integer and the VARCHAR(255), then you’re looking at some space saving.

    Old scheme:

    T1: 30 bytes * 60 K entries = 1800 KiB.
    

    New scheme:

    T1:  4       bytes * 60 K entries =  240 KiB
    T2: (4 + 30) bytes * 30 K entries = 1020 KiB
    

    So, that’s crudely 1800 – 1260 = 540 KiB space saving. If, as would be necessary, you build an index on the integer column in T2, you lose some more space. If the average length of the data is larger than 30 bytes, the space saving increases. If the ratio of repeated rows ever increases, the saving increases.
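The arithmetic above can be reproduced with a small helper. This is only a sketch: the function name `table_sizes` and its parameters are made up for illustration, and it counts payload bytes alone, ignoring row headers, length prefixes, and indexes, which a real database engine would add.

```python
def table_sizes(rows, distinct, avg_len, key_bytes=4):
    """Return (old_bytes, new_bytes) for the one-table and two-table layouts.

    Only the payload bytes discussed in the answer are counted; per-row
    overhead and index storage are deliberately left out.
    """
    old = rows * avg_len                        # T1 stores the string in every row
    new_t1 = rows * key_bytes                   # T1 stores only the integer key
    new_t2 = distinct * (key_bytes + avg_len)   # T2 stores key + string once per value
    return old, new_t1 + new_t2


old, new = table_sizes(rows=60_000, distinct=30_000, avg_len=30)
print(old // 1000, new // 1000, (old - new) // 1000)   # 1800 1260 540

# Longer strings widen the gap, as noted above.
old50, new50 = table_sizes(rows=60_000, distinct=30_000, avg_len=50)
print((old50 - new50) // 1000)                          # 1140
```

The figures are printed in thousands of bytes to match the rounding used in the estimate.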

    Whether the space saving is significant depends on your context. If you need half a megabyte more memory, you just got it. You could squeeze out more by using 2-byte integers instead of 4-byte integers, provided you are sure you will never need more than 65,535 distinct entries (120 + 960 KiB = 1080 KiB; saving 720 KiB). On the other hand, if you really won’t notice half a megabyte in the multi-gigabyte storage that’s available, then it becomes a more pragmatic problem.

    Maintaining two tables is harder work, but guarantees that the name is the same each time it is used. Maintaining one table means that you have to make sure that every repeated name is entered identically each time. More likely, you ignore the possibility and you end up with unmatched entries where you should have pairs, or with triplets where you should have pairs.
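To make the two-table option concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names `place` and `main`, the column names, and the helper `place_id_for` are all hypothetical; the question does not say which database engine or schema is in use, so treat this as an illustration of the idea and not as the actual design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE place (              -- hypothetical lookup table (T2)
        place_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE
    );
    CREATE TABLE main (               -- hypothetical main table (T1)
        id       INTEGER PRIMARY KEY,
        place_id INTEGER NOT NULL REFERENCES place(place_id)
    );
""")


def place_id_for(name):
    """Return the key for name, inserting it into the lookup table if new."""
    conn.execute("INSERT OR IGNORE INTO place (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT place_id FROM place WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


names = [
    "The Gas Patch and Little Canada",
    "Kora Temple Masonic Bldg. George Coombs",
    "The Gas Patch and Little Canada",   # repeated value reuses one lookup row
]
for name in names:
    conn.execute("INSERT INTO main (place_id) VALUES (?)", (place_id_for(name),))

distinct_names = conn.execute("SELECT COUNT(*) FROM place").fetchone()[0]
joined = conn.execute(
    "SELECT m.id, p.name FROM main AS m JOIN place AS p USING (place_id) "
    "ORDER BY m.id"
).fetchall()
print(distinct_names)   # 2
print(len(joined))      # 3
```

Because each name is stored once and referenced by key, the first and third rows of `main` are guaranteed to show exactly the same text after the join.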

    Clearly, if the type that’s repeated is a 4-byte integer, using two tables will save nothing; it will cost you space.
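The break-even point can be estimated the same way. In this sketch the function names `saving` and `break_even_bytes` are invented for illustration, and the model again counts payload bytes only, so real engines with per-row overhead and indexes will shift the threshold somewhat.

```python
def saving(rows, distinct, value_bytes, key_bytes=4):
    """Bytes saved by the two-table layout; a negative result means it costs space."""
    one_table = rows * value_bytes
    two_tables = rows * key_bytes + distinct * (key_bytes + value_bytes)
    return one_table - two_tables


def break_even_bytes(rows, distinct, key_bytes=4):
    """Average value width above which splitting starts to save space."""
    return key_bytes * (rows + distinct) / (rows - distinct)


print(saving(60_000, 30_000, value_bytes=4))   # -240000
print(break_even_bytes(60_000, 30_000))        # 12.0
```

Under these assumptions, with 60,000 rows and 30,000 distinct values, a 4-byte value loses 240,000 bytes when split out, and the split only pays for itself once the average value is wider than about 12 bytes.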

    A lot, therefore, depends on what you’ve not told us. The type is one key issue. The other is the semantics behind the repetition.

© 2021 The Archive Base. All Rights Reserved