I have to write a cron job that emails each subscriber weekly, on the weekday they subscribed. For example, if user A subscribed on a Thursday and user B subscribed on a Wednesday, then user A gets mail every Thursday and user B every Wednesday.
My approach is as follows:
1- First, get the day of the week for the current date (today) and assign it to a variable.
2- Run a SELECT query to fetch the IDs of all subscribers whose subscription weekday matches today’s weekday. I am planning to use MySQL’s DAYOFWEEK() to extract the weekday from the subscription date.
3- Once I have all the IDs, email those subscribers their activity from the last 7 days.
The thing that puzzles me is the DAYOFWEEK() function, which is applied to the column for every row and looks costly. What alternative would you suggest? (Assume the table will hold a lot of data.)
Per-row functions rarely scale well as the database table grows.
The first thing you should do is make sure there’s actually a performance problem to solve. Always start with third normal form (3NF) and regress only if you find such a problem; otherwise your effort is wasted. It may be that the speed is not that bad, in which case stick with 3NF.
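One way to check whether the per-row function is really a problem is to look at the query plan and time the query on realistic data before changing the schema. The following is only a rough sketch: it uses SQLite through Python's `sqlite3` module as a stand-in so that it runs anywhere, with `strftime('%w', ...)` (0 = Sunday) playing the role of MySQL's `DAYOFWEEK()` (1 = Sunday). The `subscribers` table, its columns, and the helper names are assumptions made for illustration.

```python
import sqlite3
import time
from datetime import date, timedelta

# Per-row function in the WHERE clause: the engine has to evaluate it for every row.
NAIVE_QUERY = (
    "SELECT id FROM subscribers "
    "WHERE CAST(strftime('%w', subscription_date) AS INTEGER) = ?"
)


def build_sample_db(n_rows=10000):
    """Create an in-memory table with one subscriber per consecutive day (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subscribers ("
        "id INTEGER PRIMARY KEY, subscription_date TEXT NOT NULL)"
    )
    start = date(2024, 1, 1)  # a Monday
    conn.executemany(
        "INSERT INTO subscribers (subscription_date) VALUES (?)",
        (((start + timedelta(days=i)).isoformat(),) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def query_plan(conn, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


def time_query(conn, sql, params, repeats=5):
    """Return the best wall-clock time (seconds) over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        started = time.perf_counter()
        conn.execute(sql, params).fetchall()
        best = min(best, time.perf_counter() - started)
    return best


if __name__ == "__main__":
    conn = build_sample_db()
    print(query_plan(conn, NAIVE_QUERY, (4,)))  # expect a full scan of the table
    print(time_query(conn, NAIVE_QUERY, (4,)))
```

If the measured time is acceptable at your expected table size, there is nothing to fix. In MySQL the equivalent check would be `EXPLAIN` on the real query against a realistic copy of the data.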
If it turns out there is a performance problem, one way to solve it is to add an indexed column called weekday that holds the day of the week on which the user subscribed.
This is technically breaking 3NF since that attribute is dependent on the date of subscription which is unlikely to be part of the key. It may also come to disagree with that subscription date if you update one or the other independently.
But you can mitigate the problem by having an insert/update trigger which forces the weekday column to agree with the subscription date, ensuring that they never disagree.
Then your query simply becomes a plain equality test on the indexed weekday column (select the subscribers whose weekday matches today’s), followed by processing each of those subscribers (or doing it all as one big honkin’ query if you wish).
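The indexed column, the trigger, and the simplified query could be sketched roughly as follows. This is an illustration rather than the exact MySQL statements: it uses SQLite through Python's `sqlite3` module so that it is runnable as-is, and the table, column, index, and trigger names are all assumptions. SQLite cannot modify the new row in a BEFORE trigger, so the sketch uses AFTER triggers that write the derived value back; in MySQL you would more likely set `NEW.weekday = DAYOFWEEK(NEW.subscription_date)` in BEFORE INSERT and BEFORE UPDATE triggers.

```python
import sqlite3
from datetime import date

SCHEMA = """
CREATE TABLE subscribers (
    id                INTEGER PRIMARY KEY,
    email             TEXT NOT NULL,
    subscription_date TEXT NOT NULL,  -- ISO date, e.g. '2024-01-04'
    weekday           INTEGER         -- derived: 0 = Sunday .. 6 = Saturday
);
CREATE INDEX idx_subscribers_weekday ON subscribers (weekday);

-- Fill in weekday whenever a row is inserted.
CREATE TRIGGER subscribers_weekday_ins AFTER INSERT ON subscribers
BEGIN
    UPDATE subscribers
       SET weekday = CAST(strftime('%w', NEW.subscription_date) AS INTEGER)
     WHERE id = NEW.id;
END;

-- Re-derive weekday if either column is changed and they no longer agree.
CREATE TRIGGER subscribers_weekday_upd
AFTER UPDATE OF subscription_date, weekday ON subscribers
WHEN NEW.weekday IS NOT CAST(strftime('%w', NEW.subscription_date) AS INTEGER)
BEGIN
    UPDATE subscribers
       SET weekday = CAST(strftime('%w', NEW.subscription_date) AS INTEGER)
     WHERE id = NEW.id;
END;
"""

DUE_QUERY = "SELECT id, email FROM subscribers WHERE weekday = ? ORDER BY id"


def open_db():
    """Create an in-memory database with the hypothetical schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def subscribers_due(conn, today):
    """Return (id, email) for everyone whose subscription weekday matches today."""
    weekday = today.isoweekday() % 7  # map Mon=1..Sun=7 onto 0 = Sunday .. 6 = Saturday
    return conn.execute(DUE_QUERY, (weekday,)).fetchall()


if __name__ == "__main__":
    conn = open_db()
    conn.execute(
        "INSERT INTO subscribers (email, subscription_date) VALUES (?, ?)",
        ("a@example.com", "2024-01-04"),  # a Thursday
    )
    print(subscribers_due(conn, date.today()))
```

The cron job then only needs to call something like `subscribers_due(conn, date.today())` once a day; the weekday is computed once per write instead of once per row per read.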
The fact that you’re not having to retrieve every row to do a getWeekDay(subscription_date) and filter the rows should massively improve the query speed.
The vast majority of databases are read far more often than written and, by shifting the cost of the calculation to the insert/update, you effectively amortise that cost over all selects.
Assuming your subscribers subscribe for more than a week (since you send out their stuff once a week), that will be more efficient than calculating on the select.
And, although this takes up more space in your table (due to the extra column and index), have a look at the ratio of “My query isn’t fast enough” questions compared to “My database is too big” questions. The former far outweigh the latter.