Recently I have been trying to optimize my tables, mainly because I’ve learned a lot more about database design through some courses at my school. I also chose to do this because I’ve been getting a lot of timeouts on some queries, and I have lately found out that my bad database design was indeed the cause.
So basically, I will be doing SELECT, UPDATE, INSERT and DELETE on this table.
Here is my current database schema:
-- ----------------------------
-- Table structure for `characters_items`
-- ----------------------------
DROP TABLE IF EXISTS `characters_items`;
CREATE TABLE `characters_items` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `master_id` int(10) unsigned NOT NULL DEFAULT '0',
  `item_id` smallint(6) NOT NULL,
  `amount` int(11) NOT NULL,
  `slot_id` smallint(9) NOT NULL DEFAULT '0',
  `type` tinyint(4) NOT NULL DEFAULT '0',
  `extra_data` text,
  PRIMARY KEY (`id`),
  KEY `master_id` (`master_id`),
  CONSTRAINT `characters_items_ibfk_1` FOREIGN KEY (`master_id`) REFERENCES `characters` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=904 DEFAULT CHARSET=latin1;
In my program, I will be manipulating large amounts of data (up to 500 rows at a time; as you can see, this is a table for all character items).
I have also learned that indexing columns will slow your queries down if you are doing data manipulation.
Here are some of the queries that I will be using:
try
{
    StringBuilder query = new StringBuilder();
    client.ClearParameters();
    client.AddParameter("master_id", this.owner.MasterId);
    client.AddParameter("type", (byte)CharacterItemType.Bank);
    client.AddParameter("capacity", this.Capacity);
    // Grab the original items.
    DataRow[] data = client.ReadDataTable("SELECT item_id,amount,slot_id FROM characters_items WHERE master_id=@master_id AND type=@type LIMIT @capacity").Select();
    Item[] originalItems = new Item[this.Capacity];
    if (data != null && data.Length > 0)
    {
        for (short i = 0; i < data.Length; i++)
        {
            DataRow row = data[i];
            short id = (short)row[0];
            int count = (int)row[1];
            short slotId = (short)row[2];
            originalItems[slotId] = new Item(id, count);
        }
    }
    // Now we compare the items to see if anything has been changed.
    Item[] items = this.ToArray();
    for (short i = 0; i < items.Length; i++)
    {
        Item item = items[i];
        Item original = originalItems[i];
        // item was added.
        if (item != null && original == null)
        {
            query.Append("INSERT INTO characters_items (master_id,item_id,amount,slot_id,type,extra_data) ");
            query.Append("VALUES (");
            query.Append(this.owner.MasterId);
            query.Append(",");
            query.Append(item.Id);
            query.Append(",");
            query.Append(item.Count);
            query.Append(",");
            query.Append(i);
            query.Append(",");
            query.Append((byte)CharacterItemType.Bank);
            string extraData = item.SerializeExtraData();
            if (extraData != null)
            {
                query.Append(",'");
                query.Append(extraData);
                query.Append("'");
            }
            else
            {
                query.Append(",null");
            }
            query.Append(");");
        }
        // item was deleted.
        else if (item == null && original != null)
        {
            query.Append("DELETE FROM characters_items WHERE slot_id=");
            query.Append(i);
            query.Append(" AND master_id=");
            query.Append(this.owner.MasterId);
            query.Append(" AND type=");
            query.Append((byte)CharacterItemType.Inventory);
            query.Append(" LIMIT 1;");
        }
        // item was modified.
        else if (item != null && original != null)
        {
            if (item.Id != original.Id || item.Count != original.Count)
            {
                query.Append("UPDATE characters_items SET item_id=");
                query.Append(item.Id);
                query.Append(",amount=");
                query.Append(item.Count);
                string extraData = item.SerializeExtraData();
                if (extraData != null)
                {
                    query.Append(",extra_data='");
                    query.Append(extraData);
                    query.Append("'");
                }
                else
                {
                    query.Append(",extra_data=null");
                }
                query.Append(" WHERE master_id=@master_id AND type=@type AND slot_id=");
                query.Append(i);
                query.Append(";");
            }
        }
    }
    // If a query was actually built, we will execute it.
    if (query.Length > 0)
    {
        client.SetConnectionTimeout(60);
        client.ExecuteUpdate(query.ToString());
        return true;
    }
}
catch (Exception ex)
{
    Program.Logger.PrintException(ex);
}
return false;
As you can see, I am almost always referencing the slot_id, type, and master_id fields. I was wondering: if I made the slot_id and type fields indexed fields, how would it affect my overall data manipulation performance? Will it be affected in a positive way, or in a negative way?
Please give me some advice (except on the C# code, I will be fixing it up later!)
First of all, never construct the SQL text dynamically when you can use bound parameters instead. Binding parameters protects you from SQL injection and can facilitate better performance by allowing the DBMS to prepare the SQL statement once and reuse it many times.
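At the SQL level, a bound-parameter version of the UPDATE could look roughly like the sketch below, which uses MySQL's server-side PREPARE/EXECUTE syntax. The statement name and the sample values are made up for illustration. In the C# code, the corresponding change would presumably be to pass every value through client.AddParameter, as the SELECT already does, instead of appending it to the query string.

```sql
-- Hypothetical statement name; a placeholder (?) stands in for every value.
PREPARE update_item FROM
  'UPDATE characters_items
      SET item_id = ?, amount = ?, extra_data = ?
    WHERE master_id = ? AND type = ? AND slot_id = ?';

-- Sample values, invented for this sketch.
SET @item_id = 995, @amount = 1000, @extra_data = NULL,
    @master_id = 1, @type = 1, @slot_id = 0;

-- The values travel separately from the SQL text, so they cannot alter it.
EXECUTE update_item USING @item_id, @amount, @extra_data,
                          @master_id, @type, @slot_id;

-- The statement can be executed again with other values before it is released.
DEALLOCATE PREPARE update_item;
```

Because the statement text never changes, the same prepared statement can be reused for each modified slot, with only the bound values differing.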
As for indexes… they are generally a trade-off between finding and modifying data: they speed up the former¹ and slow down the latter. However, if data is modified in a way that also incorporates a search², an index can actually end up speeding up the modification as well.
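One way to check whether a modification's search part can use an index is to ask MySQL for its execution plan. The sketch below assumes MySQL 5.6.3 or later, where EXPLAIN also accepts UPDATE and DELETE statements; the literal values are invented for illustration.

```sql
-- EXPLAIN only reports the plan; it does not modify any rows.
-- Sample values are made up for this sketch.
EXPLAIN
UPDATE characters_items
   SET amount = 10
 WHERE master_id = 1 AND type = 1 AND slot_id = 0;
```

In the output, the key column names the index the optimizer chose for locating the rows (NULL if none), and the rows column gives its estimate of how many rows it has to examine.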
Indexes should always be tailor-made to the actual queries your application is doing, which in your case includes these:
SELECT ... WHERE master_id=... AND type=...
INSERT ...
DELETE ... WHERE slot_id=... AND master_id=... AND type=...
UPDATE ... WHERE master_id=... AND type=... AND slot_id=...
All 3 WHERE clauses can be efficiently “served” by a single composite index on {master_id, type, slot_id}. Only the INSERT statement (which by its nature doesn’t have WHERE) will be hurt by this additional index.
Considerations:
If the number of rows per master_id is expected to be low, indexing just on master_id won’t significantly impact the search performance, but will make the index smaller and easier/quicker to maintain.
If we want to cover SELECT item_id, amount, slot_id, we could add item_id and amount to the “trailing edge” of the index (slot_id is already in the index).
As you can see, all this is a pretty elaborate balancing act and even experts can’t always predict the optimal balance. So if in doubt, measure before deciding!
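As a rough sketch, the composite index and its covering variant could be declared as follows. The index names are hypothetical, and the two definitions are alternatives: the covering variant would replace the first index rather than be added alongside it.

```sql
-- Composite index matching all three WHERE clauses (hypothetical name).
ALTER TABLE characters_items
  ADD INDEX idx_master_type_slot (master_id, type, slot_id);

-- Alternative: covering variant with item_id and amount on the trailing edge,
-- so the SELECT can be answered from the index alone.
-- ALTER TABLE characters_items
--   ADD INDEX idx_master_type_slot_cover
--     (master_id, type, slot_id, item_id, amount);

-- Inspect the plan for the SELECT; the literal values are invented.
EXPLAIN
SELECT item_id, amount, slot_id
  FROM characters_items
 WHERE master_id = 1 AND type = 1;
```

With the covering variant in place, the Extra column of the EXPLAIN output should report "Using index". Whether the wider index is worth its extra maintenance cost on INSERT is exactly the kind of question to settle by measuring against realistic data.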
For an excellent introduction to the topic of indexing, and database performance in general, I warmly recommend: Use The Index, Luke!
1 Assuming they are used correctly.
2 Typically: DELETE ... WHERE ... and UPDATE ... WHERE ...