For a read-only table (100% reads, no writes), which structure is better, and why?
[My table has many columns, but I’ve made an example here with 4 columns for simplicity]
Option 1: One table with multiple columns
ID | Length | Width | Height
-----------------------------------------
1 | 10 | 20 | 30
2 | 100 | 200 | 300
Option 2: Two tables; one storing the values, and the other storing the column headers
Table 1:
ID | Object_ID | Attribute_ID | Attribute_Value
------------------------------------------
1 | 1 | 1 | 10
2 | 1 | 2 | 20
3 | 1 | 3 | 30
4 | 2 | 1 | 100
5 | 2 | 2 | 200
6 | 2 | 3 | 300
Table 2:
ID | Name
-------------------
1 | Length
2 | Width
3 | Height
I will preface this by saying that I’m a relative novice to SQL and database tables; that, however, doesn’t mean that I don’t know my basics.
Unless your example is heavily oversimplified, you really should use the first option. Not only will it be faster and easier to query, but it also models the data more naturally: each row is one object, and each column is one of its properties.
In this example, you don’t need to split your tables at all; your ‘Attribute IDs’ are adequately represented by the table headers. Further, these values have no real meaning by themselves, so they really don’t need to be in another table.
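To make the "easier to query" point concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The table names (`objects_wide`, `attributes`, `attribute_values`) are my own assumptions, since the question doesn't name its tables; the data is the sample data from the question. It shows that getting one row per object out of Option 2 takes a join plus a pivot, while Option 1 is a plain SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option 1: one wide table (table name is an assumption)
    CREATE TABLE objects_wide (
        id INTEGER PRIMARY KEY,
        length INTEGER, width INTEGER, height INTEGER
    );
    INSERT INTO objects_wide VALUES (1, 10, 20, 30), (2, 100, 200, 300);

    -- Option 2: attribute names in one table, one row per value in another
    CREATE TABLE attributes (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO attributes VALUES (1, 'Length'), (2, 'Width'), (3, 'Height');
    CREATE TABLE attribute_values (
        id INTEGER PRIMARY KEY,
        object_id INTEGER,
        attribute_id INTEGER REFERENCES attributes(id),
        attribute_value INTEGER
    );
    INSERT INTO attribute_values VALUES
        (1, 1, 1, 10), (2, 1, 2, 20), (3, 1, 3, 30),
        (4, 2, 1, 100), (5, 2, 2, 200), (6, 2, 3, 300);
""")

# Option 1: the row is already in the shape we want.
wide_rows = conn.execute(
    "SELECT id, length, width, height FROM objects_wide ORDER BY id"
).fetchall()

# Option 2: join to the attribute names, then pivot rows back into columns.
pivot_rows = conn.execute("""
    SELECT v.object_id,
           MAX(CASE WHEN a.name = 'Length' THEN v.attribute_value END),
           MAX(CASE WHEN a.name = 'Width'  THEN v.attribute_value END),
           MAX(CASE WHEN a.name = 'Height' THEN v.attribute_value END)
    FROM attribute_values AS v
    JOIN attributes AS a ON a.id = v.attribute_id
    GROUP BY v.object_id
    ORDER BY v.object_id
""").fetchall()

print(wide_rows)
print(pivot_rows)
```

Both queries return the same rows; the second one just needs far more SQL to get there, and every new attribute means another CASE expression.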
You would generally break out a new table and reference it, as you have, only if there were another kind of object that exists separately and relates to your object in a one-to-many relationship.
A good example (taken from my own database on an O’Reilly server) is blog entries and the comments on them: each comment exists as an object in its own right, and a single entry can have many comments.
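As a rough sketch of that kind of one-to-many layout, again in Python with sqlite3: the table and column names here (`entries`, `comments`, `entry_id`, `title`, `body`) and the sample rows are hypothetical, made up for illustration, not the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one entry can have many comments.
    CREATE TABLE entries (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        entry_id INTEGER NOT NULL REFERENCES entries(id),
        body TEXT NOT NULL
    );
    INSERT INTO entries VALUES (1, 'First post'), (2, 'Second post');
    INSERT INTO comments VALUES
        (1, 1, 'Nice post'), (2, 1, 'Thanks for sharing'), (3, 2, 'Agreed');
""")

# Count the comments attached to each entry via the entry_id reference.
comment_counts = conn.execute("""
    SELECT e.title, COUNT(c.id)
    FROM entries AS e
    LEFT JOIN comments AS c ON c.entry_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(comment_counts)
```

Here the second table earns its keep: the number of comments per entry is open-ended, so it can't be expressed as a fixed set of columns on the entry.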
Think about it from a logical perspective; there’s no reason to artificially inject complexity into this design when it doesn’t need to be there. In your example, length, width, and height aren’t really separate objects; they’re all dimensions of the object described by the table row. Further, length, width, and height each have only one value at a given time.
I hope that made some sense – if I was a bit pedantic in my pedagogy, I apologize. However, if someone else stumbles on this question, hopefully this example will help them.
Good luck.
Edit: I just realized that your question was specifically about performance. That’s a more involved question, and the answer probably depends on the database engine you use. Generally, though, I would expect querying a single table without any joins to be somewhat faster, considering that avoiding joins is the usual reason people cite for denormalizing tables to improve performance.
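If you want to check this on your own setup, a small timing sketch along these lines may help. It assumes SQLite via Python's sqlite3 module, the same made-up table names as before, an arbitrary 2000 generated objects, and no extra indexes; the numbers will differ on other engines, schemas, and data sizes, so treat it as a starting point rather than a benchmark.

```python
import sqlite3
import timeit

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects_wide (
        id INTEGER PRIMARY KEY, length INTEGER, width INTEGER, height INTEGER
    );
    CREATE TABLE attributes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attribute_values (
        id INTEGER PRIMARY KEY, object_id INTEGER,
        attribute_id INTEGER, attribute_value INTEGER
    );
    INSERT INTO attributes VALUES (1, 'Length'), (2, 'Width'), (3, 'Height');
""")

N_OBJECTS = 2000  # arbitrary size, chosen only for illustration
for obj_id in range(1, N_OBJECTS + 1):
    dims = (obj_id, obj_id * 2, obj_id * 3)  # synthetic length, width, height
    conn.execute("INSERT INTO objects_wide VALUES (?, ?, ?, ?)", (obj_id, *dims))
    conn.executemany(
        "INSERT INTO attribute_values (object_id, attribute_id, attribute_value)"
        " VALUES (?, ?, ?)",
        [(obj_id, attr_id, value) for attr_id, value in enumerate(dims, start=1)],
    )

WIDE_SQL = "SELECT id, length, width, height FROM objects_wide ORDER BY id"
PIVOT_SQL = """
    SELECT v.object_id,
           MAX(CASE WHEN a.name = 'Length' THEN v.attribute_value END),
           MAX(CASE WHEN a.name = 'Width'  THEN v.attribute_value END),
           MAX(CASE WHEN a.name = 'Height' THEN v.attribute_value END)
    FROM attribute_values AS v
    JOIN attributes AS a ON a.id = v.attribute_id
    GROUP BY v.object_id
    ORDER BY v.object_id
"""

def fetch(sql):
    """Run a query and materialize every row."""
    return conn.execute(sql).fetchall()

# Time 20 full reads of each layout.
wide_seconds = timeit.timeit(lambda: fetch(WIDE_SQL), number=20)
pivot_seconds = timeit.timeit(lambda: fetch(PIVOT_SQL), number=20)
print(f"wide: {wide_seconds:.4f}s  pivoted: {pivot_seconds:.4f}s")
```

Since the table is read-only, it is worth running something like this against your real column count and row count before committing to either layout.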