I have a table like this:
CREATE TABLE UserTrans (
`id` int(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`transaction_id` varchar(255) NOT NULL default '0',
`source` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
)
It uses the InnoDB engine.
The transaction_id is a varchar because it can sometimes be alphanumeric.
The id is the primary key.
Here is the thing: I have over 1M records, and there is a query that checks for a duplicate transaction_id within a specified source. Here is my query:
SELECT *
FROM UserTrans
WHERE transaction_id = '212398043'
AND source = 'COMPANY_A';
This query is getting very slow; it now takes about 2 seconds to run. Should I index transaction_id and source?
e.g. KEY join_id (transaction_id, source)
What is the drawback if I do that?
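Obviously the benefit is that it will improve the performance of queries that filter on transaction_id, such as the one you show, because MySQL can look up the matching rows through the index instead of scanning the whole table.
The drawback is that the index takes some space to store and some work for the RDBMS to maintain, since every INSERT, UPDATE, and DELETE on the table must also update the index. The index is especially prone to consuming space because your transaction_id is declared as such a wide string.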
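As a rough sketch, assuming MySQL and the index name join_id from the question, you could add the index, check that the optimizer picks it up, and see what it costs in space:

```sql
-- Add the composite index proposed in the question.
ALTER TABLE UserTrans ADD INDEX join_id (transaction_id, source);

-- The "key" column of the EXPLAIN output should name join_id
-- once the optimizer is using the new index.
EXPLAIN
SELECT *
FROM UserTrans
WHERE transaction_id = '212398043'
  AND source = 'COMPANY_A';

-- For InnoDB, Index_length is the approximate number of bytes
-- allocated to the table's secondary indexes.
SHOW TABLE STATUS LIKE 'UserTrans';
```

Comparing Index_length before and after the ALTER gives a feel for the storage side of the trade-off.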
You might consider whether transaction_id really needs to be up to 255 characters long, or if you could declare its max length to be something shorter.
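One way to decide is to look at what is actually stored. This is only a sketch; the new length of 64 is an invented figure for illustration, so pick one based on your own data:

```sql
-- Find the longest transaction_id currently in the table.
SELECT MAX(CHAR_LENGTH(transaction_id)) AS longest
FROM UserTrans;

-- If the result is comfortably below the new limit, shrink the column.
-- 64 is a hypothetical value, not a recommendation.
ALTER TABLE UserTrans
  MODIFY transaction_id varchar(64) NOT NULL DEFAULT '0';
```

Run the SELECT first: in strict SQL mode the ALTER fails if any existing value is too long, and in non-strict mode the value is truncated with a warning.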
Or you could use a prefix index to index only the first n characters:
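A prefix index along those lines might look like the sketch below. The prefix length of 16 is an assumption chosen for illustration; the first query is one way to check whether a prefix of that length still distinguishes your values well:

```sql
-- Compare how selective a 16-character prefix is against the full column.
-- If the two ratios are close, the shorter prefix loses very little.
SELECT COUNT(DISTINCT LEFT(transaction_id, 16)) / COUNT(*) AS prefix_selectivity,
       COUNT(DISTINCT transaction_id) / COUNT(*)            AS full_selectivity
FROM UserTrans;

-- Index only the first 16 characters of each column (hypothetical lengths).
ALTER TABLE UserTrans
  ADD INDEX join_id (transaction_id(16), source(16));
```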
@Daniel has a good point that you might get the same benefit and save even more space by indexing only one column. Since you're doing SELECT *, you've ruled out the benefit of a covering index.
Also, if you intend transaction_id to be unique, why not constrain it to be unique?
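Both options could be sketched as follows. The index names are hypothetical, and the unique key assumes that uniqueness is meant per source, which is how the duplicate check in the question reads:

```sql
-- Option 1: index only transaction_id. MySQL finds the candidate rows
-- through the index and then filters them on source.
ALTER TABLE UserTrans
  ADD INDEX idx_transaction_id (transaction_id);

-- Option 2: let the database enforce uniqueness.
-- Assumes a transaction_id must not repeat within one source.
ALTER TABLE UserTrans
  ADD UNIQUE KEY uq_transaction_source (transaction_id, source);
```

You would normally pick one of the two. Note that the unique key cannot be created while duplicates exist, and several rows that share a source and keep the default transaction_id of '0' would count as duplicates.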