I’m working on a system that needs to log every view of a page in a MySQL table. A view will only be logged if the visitor hasn’t viewed that page in the last 24 hours. I’m wondering whether this would cause problems in terms of performance and database size.
The site that needs to do this averages about 60,000 unique pageviews a day, so that’s roughly 60,000 new rows added per day (about 0.7 per second on average, or one every 1.4 seconds or so). The table has only 3 columns: i_id, ip_address, timestamp. i_id is a foreign key to another table.
The table will be cleared out at the end of every day using a CRON script.
Will this put any immediate strain on the database? For example, if the site gets a spike in traffic (which happens quite regularly), it could shoot up to over 200,000 pageviews in a day, which means over 2 queries per second on average.
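As a quick sanity check on those rates, here is a minimal sketch of the arithmetic. It assumes traffic is spread evenly over the day, which real traffic is not, so actual peaks will be higher than these averages. The function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def inserts_per_second(rows_per_day: int) -> float:
    """Average insert rate, assuming rows arrive evenly across the day."""
    return rows_per_day / SECONDS_PER_DAY


typical = inserts_per_second(60_000)   # normal day from the question
spike = inserts_per_second(200_000)    # spike day from the question

print(f"typical: {typical:.2f}/s, spike: {spike:.2f}/s")
# prints "typical: 0.69/s, spike: 2.31/s"
```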
The general convention is to not have constraints (primary key, foreign key, etc.) on an audit table, and certainly not indexes, because all of these slow down insertion.
Bulk insertion would be worth considering: batching the inserts lowers the number of connections needed to the database and the total time spent on the operations (one statement versus many). Additionally, if transaction logs are written for this table, minimize writes to them, because the extra I/O needed to be able to restore the database to a point in time can slow the database down.
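To illustrate the batching idea, here is a rough sketch that turns a buffer of pending views into a single multi-row INSERT. The table name `page_views` and the function name are made up for the example; the column names come from the question. It uses the `%s` placeholder style that the common MySQL drivers for Python accept, and it only builds the statement and parameter list, so how the result is executed depends on the driver you use.

```python
from typing import List, Sequence, Tuple

# Hypothetical table name; columns are the three from the question.
INSERT_PREFIX = "INSERT INTO page_views (i_id, ip_address, `timestamp`) VALUES "
ROW_PLACEHOLDER = "(%s, %s, %s)"


def build_bulk_insert(rows: Sequence[Tuple[int, str, str]]) -> Tuple[str, List[object]]:
    """Build one multi-row INSERT and its flat parameter list."""
    if not rows:
        raise ValueError("need at least one row to insert")
    # One placeholder group per buffered row, joined into a single statement.
    sql = INSERT_PREFIX + ", ".join([ROW_PLACEHOLDER] * len(rows))
    # Flatten the rows so the values line up with the placeholders in order.
    params: List[object] = [value for row in rows for value in row]
    return sql, params


pending = [
    (1, "203.0.113.5", "2024-01-01 12:00:00"),
    (2, "203.0.113.9", "2024-01-01 12:00:01"),
]
sql, params = build_bulk_insert(pending)
```

The statement and parameters would then go to the driver in one call, so a buffer of N views costs one round trip instead of N. The trade-off is that views sitting in the buffer are lost if the process dies before flushing.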
I don’t see the point of clearing out the records at the end of the day — what about traffic that occurs across two days? MySQL partitioning would likely be a better idea.
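The cross-midnight problem can be shown with a small sketch. It compares a rolling 24-hour check against what effectively happens when the table is wiped at midnight (the previous record survives only if it is from the same calendar day). The function names and the example dates are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(hours=24)


def is_new_view_rolling(last_seen: Optional[datetime], now: datetime) -> bool:
    """Count the view only if the previous one is at least 24 hours old."""
    return last_seen is None or now - last_seen >= WINDOW


def is_new_view_daily_reset(last_seen: Optional[datetime], now: datetime) -> bool:
    """Model a midnight wipe: yesterday's record no longer exists today."""
    return last_seen is None or last_seen.date() != now.date()


first_visit = datetime(2024, 1, 1, 23, 50)
second_visit = datetime(2024, 1, 2, 0, 10)  # 20 minutes later, next day

rolling = is_new_view_rolling(first_visit, second_visit)
daily = is_new_view_daily_reset(first_visit, second_visit)
print(rolling, daily)  # prints "False True"
```

With the midnight wipe, the same visitor is counted twice within 20 minutes, which is not what the 24-hour rule in the question asks for.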