I need to run Linux-Apache-PHP-MySQL application (Moodle e-learning platform) for a large number of concurrent users – I am aiming 5000 users. By concurrent I mean that 5000 people should be able to work with the application at the same time. “Work” means not only do database reads but writes as well.
The application is not very typical, since it is doing a lot of inserts/updates on the database, so caching techniques are not helping to much. We are using InnoDB storage engine. In addition application is not written with performance in mind. For instance one Apache thread usually occupies about 30-50 MB of RAM.
I would be greatful for information what hardware is needed to build scalable configuration that is able to handle this kind of load.
We are using right now two HP DLG 380 with two 4 core processors which are able to handle much lower load (typically 300-500 concurrent users). Is it reasonable to invest in this kind of boxes and build cluster using them or is it better to go with some more high-end hardware?
I am particularly curious
- how many and how powerful servers are
needed (number of processors/cores, size of RAM) - what network equipment should
be used (what kind of switches,
network cards) - any other hardware,
like particular disc storage
solutions, etc, that are needed
Another thing is how to put together everything, that is what is the most optimal architecture. Clustering with MySQL is rather hard (people are complaining about MySQL Cluster, even here on Stackoverflow).
Once you get past the point where a couple of physical machines aren’t giving you the peak load you need, you probably want to start virtualising.
EC2 is probably the most flexible solution at the moment for the LAMP stack. You can set up their VMs as if they were physical machines, cluster them, spin them up as you need more compute-time, switch them off during off-peak times, create machine images so it’s easy to system test…
There are various solutions available for load-balancing and automated spin-up.
If you can make your app fit, you can get use out of their non-relational database engine as well. At very high loads, relational databases (and MySQL in particular) don’t scale effectively. The peak load of SimpleDB, BigTable and similar non-relational databases can scale almost linearly as you add hardware.
Moving away from a relational database is a huge step though, I can’t say I’ve ever needed to do it myself.