I’m about to start a new web project and I’m a bit stuck on how to structure the whole thing for optimum performance. I hope someone can help me here. I will write it in PHP with a MySQL database.
The whole thing is going to be an automated booking system, where customers will be able to book for certain days, a bit like airline booking systems. So there will be many scheduled events (e.g. every Mon and Fri from March to Sept) and a few exceptions (e.g. a single event in February). Those scheduled events will have a few different ‘themes’, with different pricing depending on the theme and the date (e.g. cheap in March-April, expensive in May-Sept). Everything should be changeable (e.g. price, schedule, maximum participants).
I thought about creating three tables:
- event [information about the schedule days, price and maximum participants]
- customer [customer information and a reference to the event id/booking id]
- bookings [contains a count of participants for each day (with reference to event and customers) and also possible exceptions like price changes, a different participant limit, etc.]
I hope it’s not too unclear what I’m trying to do here…
I’m mainly focusing on the speed of the calendar, which checks the bookings table [together with the event table] and outputs whether the day is already fully booked and what its current price is. The administration panel should be quite fast too, though, so that it’s easy to list all the bookings or access a certain day.
Now let’s go to the real stuff…
Maybe I’m taking the wrong approach by splitting the data this way, but there are obviously some flaws when it comes to bulk changes: I would have to insert many exceptions into the bookings table, or create a new entry for the event and change the old one, which is a bit messy in my eyes (especially when you have more than a few exceptions). Either way I’ll end up with heaps of entries, while I actually wanted to keep the system more automated, with just a few schedules and PHP doing the rest (I guess that’s faster than querying MySQL).
So what do you think? I can’t come up with any other separation of the data. The easiest thing to code would probably be to create a single DB entry for every day and every event, but I guess my mother could tell that’s lame and slow.
In general, I would agree with Colin Fine’s suggestion of getting the data structure clear and logical, and worrying about performance later – it’s a lot easier to optimize a well-structured, but slow, database than to maintain a prematurely optimized database.
Having said that, your question isn’t entirely clear – but I’d hazard the following ideas.
Firstly, your “events” and “exceptions” concepts sound like they belong together – I would not put them in bookings. In fact, I think “booking” is the intersection between an event and a customer, together with additional information (e.g. status, payment details etc.).
Secondly, you seem to be worrying about the number of events you’re creating; I don’t think you need to. Even if there are 10 events every single day, and you’re scheduling 10 years in advance, you’re still only dealing with a few tens of thousands of records, which is nowhere near MySQL’s limits. Also, a lookup in the database will be a lot easier to maintain (and quite possibly faster) than complex logic in PHP combined with lookups for exceptions.
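As a quick back-of-the-envelope check of that estimate (Python is used here only for brevity; the figures are the ones quoted above, and leap days are ignored):

```python
# Rough upper bound on the number of event rows, using the figures above.
events_per_day = 10
days_per_year = 365   # leap days ignored; this is only an estimate
years_ahead = 10

total_rows = events_per_day * days_per_year * years_ahead
print(total_rows)  # 36500
```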
So, I’d consider creating the following tables:
You can populate the “EventInstance” table by looping through whatever date logic is required, and then customizing the records for “exceptions”.
This way, the “what’s going on 1 Jan” question can be answered by a single database query, rather than logic to check whether 1 Jan is a Monday, or the first day of a month, or in a leap year or whatever.
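To make the idea concrete, here is a minimal sketch of populating and querying such a table. It is only an illustration under assumptions: Python with an in-memory SQLite database stands in for PHP and MySQL, and apart from “EventInstance” every table name, column name and helper function below is hypothetical. The same loop and queries should port to PHP/MySQL with little change.

```python
import sqlite3
from datetime import date, timedelta

# In-memory SQLite stands in for MySQL here; all table and column names
# except "EventInstance" are assumptions made for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE EventInstance (
        id               INTEGER PRIMARY KEY,
        event_id         INTEGER NOT NULL REFERENCES Event(id),
        event_date       TEXT    NOT NULL,   -- ISO date, e.g. 2024-03-04
        price            REAL    NOT NULL,
        max_participants INTEGER NOT NULL,
        UNIQUE (event_id, event_date)
    );
""")
conn.execute("INSERT INTO Event (id, name) VALUES (1, 'Example tour')")


def populate(conn, event_id, start, end, weekdays, price, max_participants):
    """Insert one EventInstance row per matching weekday in [start, end]."""
    day = start
    while day <= end:
        if day.weekday() in weekdays:  # Monday is 0, Friday is 4
            conn.execute(
                "INSERT INTO EventInstance "
                "(event_id, event_date, price, max_participants) "
                "VALUES (?, ?, ?, ?)",
                (event_id, day.isoformat(), price, max_participants),
            )
        day += timedelta(days=1)


# Regular schedule: every Monday and Friday from March to September.
populate(conn, 1, date(2024, 3, 1), date(2024, 9, 30), {0, 4}, 50.0, 20)

# Exceptions are ordinary rows: a one-off date, and a changed price.
conn.execute(
    "INSERT INTO EventInstance (event_id, event_date, price, max_participants) "
    "VALUES (1, '2024-02-14', 40.0, 10)"
)
conn.execute(
    "UPDATE EventInstance SET price = 80.0 "
    "WHERE event_id = 1 AND event_date >= '2024-05-01'"
)


def whats_on(conn, day):
    """Return (name, price, max_participants) for every event on a given day."""
    return conn.execute(
        "SELECT e.name, i.price, i.max_participants "
        "FROM EventInstance AS i JOIN Event AS e ON e.id = i.event_id "
        "WHERE i.event_date = ?",
        (day.isoformat(),),
    ).fetchall()
```

Calling `whats_on(conn, date(2024, 3, 4))` then needs no weekday or season logic at all; bulk changes such as the price update are a single `UPDATE` over a date range.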
I strongly recommend creating a basic data model following the hallowed principles of database design (doffing the cap to the sainted Codd), and if you’re really concerned about performance, populate a database with sample data, and tune the queries you expect to need.
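One way to try this out is sketched below, again as an assumption-laden illustration rather than a recipe: SQLite's `EXPLAIN QUERY PLAN` stands in for MySQL's `EXPLAIN`, and the schema, index name and sample values are made up for the example. It fills a table with roughly ten years of sample rows and compares how the calendar lookup is executed before and after adding an index on the date column.

```python
import sqlite3
from datetime import date, timedelta

# SQLite stands in for MySQL; the schema and index name are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EventInstance ("
    " id INTEGER PRIMARY KEY,"
    " event_id INTEGER NOT NULL,"
    " event_date TEXT NOT NULL,"
    " price REAL NOT NULL,"
    " max_participants INTEGER NOT NULL)"
)

# Sample data: 10 events a day for 3650 days (about 10 years).
start = date(2024, 1, 1)
rows = [
    (event_id, (start + timedelta(days=offset)).isoformat(), 50.0, 20)
    for offset in range(3650)
    for event_id in range(1, 11)
]
conn.executemany(
    "INSERT INTO EventInstance (event_id, event_date, price, max_participants)"
    " VALUES (?, ?, ?, ?)",
    rows,
)

QUERY = "SELECT event_id, price FROM EventInstance WHERE event_date = ?"


def query_plan(conn):
    """Return the plan steps SQLite reports for the calendar lookup."""
    plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("2024-06-03",))
    return [row[3] for row in plan]  # column 3 holds the description


plan_before = query_plan(conn)   # no index on event_date yet
conn.execute("CREATE INDEX idx_instance_date ON EventInstance (event_date)")
plan_after = query_plan(conn)    # same query once the index exists

print(plan_before)
print(plan_after)
```

The exact wording of the plan differs between database engines and versions, but the first plan should report a scan of the whole table and the second a search through the index.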
You’ll get far better information, and far more accurate results, by working on actual data and optimizing it, than by worrying about performance in general at this early part of the project.