I have a question that was brought up in a recent discussion with a co-worker.
Let’s assume that you have a single table with 100,000,000 rows, and each of those rows has an indexed column (varchar). There are 1000 unique values for this column, so each value has 100,000 rows related to it. I want to find all of the rows that are related to one of the unique values (I will supply said value), but with some additional filtering logic as well (not important).
Would it be faster, slower, or just as fast to split the data into 1000 different tables of 100,000 rows each and search only the table I need, compared with the single indexed table described above?
Assume all of the tables would have the same schema.
Searching just the table you need will be faster. It’s like asking whether it is faster to search a book for Chapter X, or to be handed a book that contains only Chapter X.
This is, however, misleading, because how will you determine which table to query? With 1000 IF statements, or a binary tree of IF statements that gets you there in 10 hops? No matter what you write, I would NOT expect it to be faster than using the index on the unified table.
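To make the comparison concrete, here is a rough sketch using Python's built-in sqlite3 module as a stand-in, since the question does not name a database engine. The data is scaled down to 10 values with 100 rows each, and every name in it (items, category, payload, idx_items_category, items_0 and so on) is made up for illustration. It shows that the split layout still needs its own value-to-table lookup, which is the job the index already does in the unified table. It does not measure timings.

```python
import sqlite3

def build_unified(conn, n_values=10, rows_per_value=100):
    # One table, with an index on the column we filter by.
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT, payload INTEGER)"
    )
    conn.executemany(
        "INSERT INTO items (category, payload) VALUES (?, ?)",
        ((f"cat{v}", r) for v in range(n_values) for r in range(rows_per_value)),
    )
    conn.execute("CREATE INDEX idx_items_category ON items (category)")

def build_split(conn, n_values=10, rows_per_value=100):
    # One table per value; we must maintain the value -> table mapping ourselves.
    table_for = {}
    for v in range(n_values):
        table = f"items_{v}"  # generated here, never taken from user input
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, payload INTEGER)")
        conn.executemany(
            f"INSERT INTO {table} (payload) VALUES (?)",
            ((r,) for r in range(rows_per_value)),
        )
        table_for[f"cat{v}"] = table
    return table_for

def rows_unified(conn, category, min_payload=0):
    # The supplied value is an ordinary bound parameter.
    return conn.execute(
        "SELECT id, payload FROM items WHERE category = ? AND payload >= ?",
        (category, min_payload),
    ).fetchall()

def rows_split(conn, table_for, category, min_payload=0):
    # Table names cannot be bound parameters, so the dispatch happens in our code.
    table = table_for[category]
    return conn.execute(
        f"SELECT id, payload FROM {table} WHERE payload >= ?",
        (min_payload,),
    ).fetchall()

def unified_plan(conn, category):
    # Ask SQLite how it would run the lookup; the last column holds the description.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id, payload FROM items WHERE category = ?",
        (category,),
    ).fetchall()
    return " ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
build_unified(conn)
table_for = build_split(conn)

unified = rows_unified(conn, "cat3", min_payload=50)
split = rows_split(conn, table_for, "cat3", min_payload=50)
plan = unified_plan(conn, "cat3")
print(plan)
```

Both functions return the same payloads, and the printed plan should mention idx_items_category, meaning the engine goes straight to the matching rows rather than scanning the whole table. Other engines report their plans differently, so treat this only as an illustration of the idea.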
Not to mention the clutter of 1000 tables.
There is an argument (and a time & place) for partitioning data, but this is a very poor example.