I’m looking to optimize my setup on EC2: CentOS 6, nginx 1.0.15, PHP 5.4.4 with php-fpm, XCache 2.0.0, MySQL 5.5.24-55-log, and Redis 2.4.10, running on an EC2 High-CPU Extra Large instance (c1.xlarge, 8 cores, 7 GB RAM). It serves a high-traffic site that writes on every request. The response to each web request is very small (a JavaScript snippet).
Basically, it is a 100% dynamic environment (an insert or update on every request). On every web request, I need to do a quick lookup in memcached, then log a few attributes of the page request. I have several EC2 instances around the world helping to serve 600M+ requests per day. The idea is that I log the data and dump it hourly to be processed by some other machines. Each machine has been handling about 20M requests a day. I’ve tried a few data stores; my notes are as follows:
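To put those daily totals in the same units as the rps figures below, here is a quick back-of-the-envelope conversion. It is only a sketch in Python (used purely for illustration; the site itself runs PHP), and it yields daily averages, so real peaks will be higher by some factor not stated here.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(requests_per_day: float) -> float:
    """Average requests per second implied by a daily total."""
    return requests_per_day / SECONDS_PER_DAY


# Figures taken from the post: 600M+/day overall, ~20M/day per machine.
fleet_rps = average_rps(600_000_000)
machine_rps = average_rps(20_000_000)

print(f"fleet average:       {fleet_rps:.0f} rps")
print(f"per-machine average: {machine_rps:.0f} rps")
```

So each machine averages roughly 230 rps over a day, and the whole fleet roughly 6,900 rps.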
MySQL
- Using hourly tables for the data, so I write to log_2012_09_05_11 exclusively for the 11am hour on 9/5/2012.
- Using ephemeral storage
- MyISAM has proved to be faster than InnoDB for me. I’ve played with the buffer pool and I always seem to get better performance with MyISAM. I’m open to any suggestions on tuning here too, but the queries are fast and MyISAM lock times are tiny.
- I profiled the code using Xdebug and, under high load, 98% of the time was spent connecting to MySQL. I was then able to get better performance by using persistent connections with mysqli.
- Max ~2200 rps; beyond that I get gateway timeouts and slow responses.
- Server load maxes out at 1 or 2 (on an 8-core machine).
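For clarity, the hourly table naming works roughly like the sketch below. It is written in Python for illustration only (the production code is PHP), and the function name and `prefix` parameter are hypothetical; only the `log_YYYY_MM_DD_HH` pattern comes from the setup described above.

```python
from datetime import datetime


def hourly_table_name(ts: datetime, prefix: str = "log") -> str:
    """Build the name of the table that receives all writes for ts's hour."""
    # Zero-padded year, month, day and 24-hour clock hour, joined by underscores.
    return f"{prefix}_{ts:%Y_%m_%d_%H}"


# Any request during the 11am hour on 9/5/2012 maps to the same table.
print(hourly_table_name(datetime(2012, 9, 5, 11, 30)))  # log_2012_09_05_11
```

Because the name depends only on the hour, every request in that hour writes to the same table, and the previous hour's table is no longer written to once the hour rolls over.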
Redis
- I really thought this would be awesome, but it seems like PHP is the bottleneck.
- Max ~500-600 rps.
- This is with writing keys like “log_2012_09_05_11_12345”, with 12345 coming from an hourly INCR counter.
- Saving to disk once every 15 minutes (the operation took about 2 minutes, if I remember correctly).
How many requests per second can I realistically expect out of this EC2 machine in a 100% write scenario? Am I bound by EC2’s disk performance, PHP, or MySQL? Can I configure it to use more CPU or make better use of the resources it has?
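The key scheme can be sketched as follows. This is Python for illustration only, and `FakeRedis` is a hypothetical in-memory stand-in that mimics just the INCR and SET commands, so the example runs without a server. The counter key name (`..._counter`) is also an assumption; the post does not say what the real one is called.

```python
from datetime import datetime


class FakeRedis:
    """In-memory stand-in mimicking only Redis INCR and SET (not a real client)."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        # Like INCR: a missing key counts as 0; returns the new value.
        self.data[key] = int(self.data.get(key, 0)) + 1
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        return True


def log_request(store, ts: datetime, payload: str) -> str:
    """Store one log entry under an hourly, counter-suffixed key."""
    hour_prefix = f"log_{ts:%Y_%m_%d_%H}"
    seq = store.incr(f"{hour_prefix}_counter")  # hypothetical counter key name
    key = f"{hour_prefix}_{seq}"
    store.set(key, payload)
    return key


store = FakeRedis()
ts = datetime(2012, 9, 5, 11, 0)
print(log_request(store, ts, "first"))   # log_2012_09_05_11_1
print(log_request(store, ts, "second"))  # log_2012_09_05_11_2
```

As sketched, each logged request issues two commands (one INCR, then one SET), assuming they are not pipelined or batched.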
PHP-FPM
http://pastebin.com/raw.php?i=9n2cpqrq
NGINX (nginx.conf)
http://pastebin.com/raw.php?i=XuVBKr8m
I really think you need to look into splitting your architecture components out onto different systems. For example, you noted that you are running MySQL on ephemeral storage. This seems odd for MySQL, since data on ephemeral storage is lost if the instance stops or fails. Have you considered using Amazon RDS?
Also, instead of Redis, have you considered ElastiCache or SimpleDB for your key-value store?
I guess my main point is that, if you are dealing with the volume of requests that you are, you really should be breaking up your service stack into multiple tiers that can scale independently of each other.