Editorial Team
Asked: June 18, 2026

I am facing the following issue. I have an extremely large table, a legacy from the people who previously worked on the project. The table is in MS SQL Server.

The table has the following properties:

  1. It has about 300 columns. All of them have the “text” type, but some of them should really represent other types (for example, integer or datetime), so these text values have to be converted to the appropriate types before they can be used.
  2. The table has more than 100 million rows. The space used by the table will soon reach 1 terabyte.
  3. The table does not have any indices.
  4. The table does not have any partitioning mechanism implemented.

As you may guess, it is impossible to run any reasonable query against this table. At present people only insert new records into it, and nobody reads from it. So I need to restructure it. I plan to create a new structure and fill it with the data from the old table. Obviously, I will implement partitioning, but that is not the only thing to be done.

One of the most important features of the table is that the fields that are purely textual (i.e. they don’t have to be converted into another type) usually contain frequently repeated values. The number of distinct values in a given column is typically in the range of 5-30. This suggests normalizing the data: for every such textual column I will create an additional table listing all the distinct values that may appear in that column, give this additional table a (tinyint) primary key, and then use a matching foreign key in the main table instead of storing the text values there. Then I will put an index on this foreign key column. About 100 columns would be processed this way.
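To make the plan concrete, here is a minimal sketch for a single column. It is written in Python with SQLite purely so that it runs as-is; SQLite is only a stand-in for SQL Server, and every name in it (`big_table`, `status`, `status_lookup`, `big_table_new`) is hypothetical. On SQL Server the key column would be declared as TINYINT.

```python
import sqlite3

# SQLite is only a runnable stand-in here; the real target is SQL Server.
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Legacy table: the value is stored as repeated text.
cur.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, status TEXT)")
cur.executemany(
    "INSERT INTO big_table (status) VALUES (?)",
    [("open",), ("closed",), ("open",), ("pending",), ("closed",)],
)

# Lookup table: one row per distinct text value, with a small integer key.
cur.execute(
    "CREATE TABLE status_lookup ("
    "status_id INTEGER PRIMARY KEY, status TEXT NOT NULL UNIQUE)"
)
cur.execute(
    "INSERT INTO status_lookup (status) "
    "SELECT DISTINCT status FROM big_table ORDER BY status"
)

# New table: the text column is replaced by a foreign key to the lookup table.
cur.execute(
    "CREATE TABLE big_table_new ("
    "id INTEGER PRIMARY KEY, "
    "status_id INTEGER NOT NULL REFERENCES status_lookup (status_id))"
)
cur.execute(
    "INSERT INTO big_table_new (id, status_id) "
    "SELECT b.id, l.status_id "
    "FROM big_table AS b JOIN status_lookup AS l ON l.status = b.status"
)

# Index on the foreign key column, as described in the plan.
cur.execute("CREATE INDEX ix_big_table_new_status ON big_table_new (status_id)")
conn.commit()

lookup_rows = cur.execute("SELECT COUNT(*) FROM status_lookup").fetchone()[0]
migrated_rows = cur.execute("SELECT COUNT(*) FROM big_table_new").fetchone()[0]
print(lookup_rows, migrated_rows)  # 3 distinct values, 5 migrated rows
```

The same pattern would be repeated for each of the roughly 100 textual columns.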

It raises the following questions:

  1. Would this normalization really speed up queries that impose conditions on some of those 100 fields? Leaving aside the space needed to store those columns, would there be any performance gain from replacing the original text columns with tinyint columns? If I skip the normalization and simply put an index on the original text columns, will the performance be the same as with an index on the planned tinyint column?
  2. If I do the described normalization, then building a view that shows the text values will require joining my main table with some 100 additional tables. On the positive side, all of those joins are on “primary key” = “foreign key” pairs. Still, quite a large number of tables has to be joined. So the question is: will queries against this view perform no worse than queries against the original non-normalized table? Will the SQL Server optimizer really be able to optimize the query in a way that takes advantage of the normalization?

Sorry for such a long text.

Thanks for every comment!

PS
I created a related question about joining 100 tables: Joining 100 tables

1 Answer

  1. Editorial Team
    Added an answer on June 18, 2026 at 11:07 am

    You’ll find other benefits to normalizing the data besides the speed of queries running against it, such as size and maintainability, which alone should justify normalizing it.

    However, it will also likely improve the speed of queries. A single row containing 300 text columns is massive and is almost certainly past the 8,060-byte limit for storing row data in-row on a data page, so the data is instead being stored in the ROW_OVERFLOW_DATA or LOB_DATA allocation units.

    By reducing the size of each row through normalization, such as replacing redundant text data with a TINYINT foreign key and moving columns that aren’t dependent on this large table’s primary key into another table, the data should no longer overflow, and you’ll also be able to store more rows per page.
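As a rough back-of-the-envelope illustration of the rows-per-page effect, the sketch below compares 100 text columns with 100 TINYINT columns. The 20-byte average text length is purely an assumption, and the estimate ignores per-row header and null-bitmap overhead, so treat the numbers as an order of magnitude rather than a measurement. It relies only on the 8 KB page size, the 96-byte page header, and the 2-byte slot-array entry per row.

```python
# Rough estimate of how many rows fit on one 8 KB SQL Server data page.
PAGE_BYTES = 8192        # size of a data page
PAGE_HEADER_BYTES = 96   # page header
SLOT_ENTRY_BYTES = 2     # row-offset (slot array) entry per row

USABLE_BYTES = PAGE_BYTES - PAGE_HEADER_BYTES  # 8096 bytes for rows and slots


def rows_per_page(row_bytes: int) -> int:
    """Estimate rows per page, ignoring per-row header overhead."""
    return USABLE_BYTES // (row_bytes + SLOT_ENTRY_BYTES)


COLUMNS = 100            # number of lookup-style columns from the question
AVG_TEXT_BYTES = 20      # ASSUMPTION: average stored length of a text value
TINYINT_BYTES = 1        # a TINYINT occupies one byte

text_rows = rows_per_page(COLUMNS * AVG_TEXT_BYTES)
tinyint_rows = rows_per_page(COLUMNS * TINYINT_BYTES)
print(text_rows, tinyint_rows)  # 4 vs 79 rows per page under these assumptions
```

Fewer pages per row means fewer pages to read for any scan or range lookup, which is where much of the gain would be expected to come from.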

    As for the overhead added by the JOINs needed to get the normalized data back: if you properly index your tables, this shouldn’t add a substantial amount of overhead. However, if it does add an unacceptable overhead, you can then selectively de-normalize the data as necessary.
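A minimal sketch of such a view over indexed foreign keys is shown below, with two lookup tables instead of 100. It uses Python with SQLite only so that it runs as-is; SQLite is a stand-in for SQL Server, its optimizer behaves differently, and all names (`facts`, `status_lookup`, `region_lookup`, `facts_text`) are hypothetical.

```python
import sqlite3

# SQLite is only a runnable stand-in; names and data are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript(
    """
    CREATE TABLE status_lookup (
        status_id INTEGER PRIMARY KEY,
        status    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE region_lookup (
        region_id INTEGER PRIMARY KEY,
        region    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE facts (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES status_lookup (status_id),
        region_id INTEGER NOT NULL REFERENCES region_lookup (region_id)
    );
    -- Index each foreign key column used for filtering and joining.
    CREATE INDEX ix_facts_status ON facts (status_id);
    CREATE INDEX ix_facts_region ON facts (region_id);

    INSERT INTO status_lookup VALUES (1, 'open'), (2, 'closed');
    INSERT INTO region_lookup VALUES (1, 'north'), (2, 'south');
    INSERT INTO facts VALUES (1, 1, 1), (2, 2, 1), (3, 1, 2);

    -- View that presents the original text values again.
    CREATE VIEW facts_text AS
    SELECT f.id, s.status, r.region
    FROM facts AS f
    JOIN status_lookup AS s ON s.status_id = f.status_id
    JOIN region_lookup AS r ON r.region_id = f.region_id;
    """
)

# Callers filter on the text value exactly as they did on the old table.
open_rows = cur.execute(
    "SELECT id, status, region FROM facts_text WHERE status = ? ORDER BY id",
    ("open",),
).fetchall()
print(open_rows)  # [(1, 'open', 'north'), (3, 'open', 'south')]
```

Whether the real 100-join view performs acceptably is something to measure on SQL Server itself, for example by inspecting the actual execution plan of a representative query.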
