I have a table Items which stores fetched book data from Amazon. This Amazon data is inserted into Items as users browse the site, so any INSERT that occurs needs to be efficient.
Here’s the table:
CREATE TABLE IF NOT EXISTS `items` (
  `Item_ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `Item_ISBN` char(13) DEFAULT NULL,
  `Title` varchar(255) NOT NULL,
  `Edition` varchar(20) DEFAULT NULL,
  `Authors` varchar(255) DEFAULT NULL,
  `Year` char(4) DEFAULT NULL,
  `Publisher` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`Item_ID`),
  UNIQUE KEY `Item_Data` (`Item_ISBN`,`Title`,`Edition`,`Authors`,`Year`,`Publisher`),
  KEY `ISBN` (`Item_ISBN`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT AUTO_INCREMENT=1;
Normalizing this table would presumably mean creating tables for Titles, Authors, and Publishers. My concern is that the insert would then become too complex. To insert a single Item, I’d have to:
- Check for the Publisher in Publishers to SELECT Publisher_ID, otherwise insert it and use mysql_insert_id() to get Publisher_ID.
- Check for the Authors in Authors to SELECT Authors_ID, otherwise insert it and use mysql_insert_id() to get Authors_ID.
- Check for the Title in Titles to SELECT Title_ID, otherwise insert it and use mysql_insert_id() to get Title_ID.
- Use those IDs to finally insert the Item (which may in fact be a duplicate, in which case the whole process would have been wasted).
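To make the cost of those steps concrete, here is a rough sketch of the normalized insert path. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the table layout, the `get_or_create` helper, and the `insert_item_normalized` function are hypothetical names invented for illustration, and Edition and Year are left out to keep it short. `lastrowid` plays the role of mysql_insert_id().

```python
import sqlite3

# Hypothetical normalized schema (assumption, not the poster's actual design).
NORMALIZED_SCHEMA = """
CREATE TABLE titles     (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE authors    (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE publishers (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE items (
    id           INTEGER PRIMARY KEY,
    isbn         TEXT NOT NULL,
    title_id     INTEGER NOT NULL REFERENCES titles(id),
    authors_id   INTEGER NOT NULL REFERENCES authors(id),
    publisher_id INTEGER NOT NULL REFERENCES publishers(id),
    UNIQUE (isbn, title_id, authors_id, publisher_id)
);
"""

# Table names are interpolated into SQL below, so restrict them to this set.
LOOKUP_TABLES = {"titles", "authors", "publishers"}


def get_or_create(conn, table, name):
    """Return the id for `name` in a lookup table, inserting it if missing."""
    if table not in LOOKUP_TABLES:
        raise ValueError(f"unexpected lookup table: {table}")
    row = conn.execute(
        f"SELECT id FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,))
    return cur.lastrowid  # analogous to mysql_insert_id()


def insert_item_normalized(conn, isbn, title, authors, publisher):
    """Insert one item; return True if a new row was stored, False if duplicate."""
    title_id = get_or_create(conn, "titles", title)
    authors_id = get_or_create(conn, "authors", authors)
    publisher_id = get_or_create(conn, "publishers", publisher)
    cur = conn.execute(
        "INSERT OR IGNORE INTO items (isbn, title_id, authors_id, publisher_id) "
        "VALUES (?, ?, ?, ?)",
        (isbn, title_id, authors_id, publisher_id),
    )
    return cur.rowcount == 1
```

In this sketch a brand-new item costs up to seven statements (three lookups, three lookup inserts, one item insert), and even a pure duplicate still costs four before the final insert is discarded.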
Does that argue against normalization for this table?
Note: The goal of Items is not to build a comprehensive database of books that could answer requests like “Show me all the books by Publisher X.” The Items table is just used to cache Items for my users’ search results.
Considering your goal, I definitely wouldn’t normalize this.
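For comparison, keeping the table flat lets the unique key do the duplicate check in a single statement. The sketch below again uses Python's sqlite3 module as a runnable stand-in, with a simplified copy of the Items columns; `cache_item` is a hypothetical helper name. In MySQL the closest equivalents would be `INSERT IGNORE` or `INSERT ... ON DUPLICATE KEY UPDATE`, relying on the existing `Item_Data` unique key.

```python
import sqlite3

# Simplified SQLite version of the Items table (types approximated).
FLAT_SCHEMA = """
CREATE TABLE IF NOT EXISTS items (
    Item_ID   INTEGER PRIMARY KEY AUTOINCREMENT,
    Item_ISBN TEXT,
    Title     TEXT NOT NULL,
    Edition   TEXT,
    Authors   TEXT,
    Year      TEXT,
    Publisher TEXT,
    UNIQUE (Item_ISBN, Title, Edition, Authors, Year, Publisher)
)
"""


def cache_item(conn, isbn, title, edition, authors, year, publisher):
    """Store one fetched item; return True if inserted, False if already cached."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO items "
        "(Item_ISBN, Title, Edition, Authors, Year, Publisher) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (isbn, title, edition, authors, year, publisher),
    )
    # rowcount is 0 when the unique key caused the row to be skipped.
    return cur.rowcount == 1
```

One thing to watch with this approach: both SQLite and MySQL treat NULLs as distinct in a unique index, so two rows that differ only by having NULL in, say, Edition will not be detected as duplicates. Storing an empty string instead of NULL for missing fields is one way around that, if it suits the data.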