What’s a common/best practice for database design when it comes to improving performance on count(1) queries? (I’m currently using SQLite)
I’ve normalized my data; it exists in multiple tables. For simple things I want to do on a single table with a good index, queries are acceptably quick for my purposes.
e.g.:
SELECT count(1) from actions where type='3' and area='5' and employee='2533';
But when I start getting into multiple table queries, things get too slow (> 1 second).
SELECT count(1)
from
(SELECT SID from actions
where type='3' and employee='2533'
INTERSECT
SELECT SID from transactions where currency='USD') x;
How should I cache my results? What is a good design?
My natural reaction is to add a table solely for storing rows of cached results per employee. Is that a good design?
Edit
Design patterns like Command Query Responsibility Segregation (CQRS) specifically aim to improve the read-side performance of data access, often in distributed systems and at enterprise scale. Another pattern commonly associated with CQRS is Event Sourcing, which stores commands and then allows them to be ‘replayed’ for various use cases.
The above may be overkill for your scenario, but a very simple implementation of caching at an internal app level could be via a SQLite trigger.
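As a minimal sketch of the trigger idea, using Python’s stdlib sqlite3 module: the `action_counts` cache table and the trigger names are my own invention, and only INSERT/DELETE are covered here (an UPDATE trigger would follow the same pattern).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (SID INTEGER, type TEXT, area TEXT, employee TEXT);

-- Cache table: one precomputed count per (employee, type) pair.
CREATE TABLE action_counts (
    employee TEXT,
    type     TEXT,
    n        INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (employee, type)
);

-- Keep the cached count current on every insert.
CREATE TRIGGER actions_ai AFTER INSERT ON actions
BEGIN
    INSERT OR IGNORE INTO action_counts (employee, type, n)
        VALUES (NEW.employee, NEW.type, 0);
    UPDATE action_counts SET n = n + 1
        WHERE employee = NEW.employee AND type = NEW.type;
END;

-- ...and on every delete.
CREATE TRIGGER actions_ad AFTER DELETE ON actions
BEGIN
    UPDATE action_counts SET n = n - 1
        WHERE employee = OLD.employee AND type = OLD.type;
END;

INSERT INTO actions VALUES (1, '3', '5', '2533');
INSERT INTO actions VALUES (2, '3', '7', '2533');
""")

# Reads now hit the one-row cache instead of scanning actions.
n = conn.execute(
    "SELECT n FROM action_counts WHERE employee = '2533' AND type = '3'"
).fetchone()[0]
print(n)  # 2
```

The writes pay a small extra cost, which is the usual trade-off when reads vastly outnumber them.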
Assuming that there are many more reads than writes to your actions or transactions tables, one cheap (and nasty) way would be to provide INSERT, UPDATE and DELETE triggers on the actions and transactions tables, which would then update the appropriate cache table(s).

In addition to a local relational database like SQLite, NoSQL databases like MongoDB, Cassandra and Redis are frequently used for read-side caching in read-heavy environments (depending on the type and format of the data you need to cache). You would, however, need another way to synchronize data from your ‘master’ (e.g. SQLite) database to these read stores, since triggers obviously won’t cut it there.

Original Answer
If you are 100% sure that you are always repeating exactly the same query for the same employee, sure, persist the result.
However, in most other instances, RDBMS usually handles caching just fine.
The INTERSECT in the query could be problematic if there are a large number of transaction records with USD. Possibly you could replace it with a join?
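A sketch of that rewrite (Python’s stdlib sqlite3, with invented sample data). One caveat: INTERSECT has set semantics, so the join counts DISTINCT SIDs to stay equivalent when a SID can repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (SID INTEGER, type TEXT, employee TEXT);
CREATE TABLE transactions (SID INTEGER, currency TEXT);
INSERT INTO actions VALUES (1,'3','2533'), (2,'3','2533'), (3,'4','2533');
INSERT INTO transactions VALUES (1,'USD'), (2,'EUR'), (3,'USD');
""")

# JOIN version of the INTERSECT query; DISTINCT preserves the
# set semantics of INTERSECT if a SID can appear more than once.
n = conn.execute("""
    SELECT count(DISTINCT a.SID)
    FROM actions a
    JOIN transactions t ON t.SID = a.SID
    WHERE a.type = '3' AND a.employee = '2533'
      AND t.currency = 'USD'
""").fetchone()[0]
print(n)  # 1: only SID 1 matches both filters
```

If SID is unique in each table, `count(1)` over the plain join gives the same answer without the DISTINCT.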
You might just check your indexes, however. For the predicates

type='3' and area='5' and employee='2533'
type='3' and employee='2533'

an index on Actions(Employee, Type) or Actions(Employee, Type, Area) would make sense (assuming Employee has the highest selectivity, and depending on the selectivity of Type and Area). You can also compare this to an index on Actions(Employee, Type, Area, SID) as a covering index for your second query.
And for the join above, you need an index on Transactions(SID, Currency).
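As a sketch, here are both suggested indexes plus an EXPLAIN QUERY PLAN check to confirm SQLite actually uses the covering index (stdlib sqlite3; the index names are my own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (SID INTEGER, type TEXT, area TEXT, employee TEXT);
CREATE TABLE transactions (SID INTEGER, currency TEXT);

-- Covering index: the first count query can be answered from the index alone.
CREATE INDEX idx_actions_emp_type_area_sid
    ON actions (employee, type, area, SID);

-- Lets the join probe transactions by SID and filter on currency in one step.
CREATE INDEX idx_tx_sid_currency
    ON transactions (SID, currency);
""")

# EXPLAIN QUERY PLAN shows which index (if any) the planner picks.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT count(1) FROM actions
    WHERE type = '3' AND area = '5' AND employee = '2533'
""").fetchall()
for row in plan:
    print(row[-1])  # e.g. "SEARCH actions USING COVERING INDEX ..."
```

If the plan output says SCAN instead of SEARCH ... USING COVERING INDEX, the index is not being used and is worth revisiting.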