I’m coming at this from a Node.js perspective, where the general consensus seems to be that it shines for I/O-bound use cases. I’m not trying to build the next Facebook/Twitter, but my question is: are social networking sites generally I/O bound or CPU bound? Since social networking can encompass such a wide variety of contexts, I will specify that I am interested in functions such as chatting, instant messaging, following users and status updates. For these types of things, do bottlenecks generally occur on the CPU side or the I/O side?
I was principal engineer for a site with ~20M monthly active users. We were definitely I/O bound and rarely (if ever) worried about app-server performance.
Scaling CPU is trivially easy. You boot a new node up on the network and add it to the load balancers’ available pool. Scaling I/O, on the other hand, is extremely tricky and expensive because the data has to maintain addressability and reasonable consistency.
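To illustrate why adding app servers is the easy part, here is a minimal TypeScript sketch of a round-robin pool. It is not how our load balancers were implemented; `AppServerPool` and the host names are made up for illustration. The point is only that a new node joins the rotation without any data having to move.

```typescript
// Hypothetical round-robin pool of stateless app servers.
// Class and host names are made up for illustration.
class AppServerPool {
  private servers: string[] = [];
  private cursor = 0;

  // Scaling CPU: a freshly booted node just joins the rotation.
  add(host: string): void {
    this.servers.push(host);
  }

  // Hand out servers in turn; any node can serve any request
  // because no node owns a particular slice of the data.
  pick(): string {
    const host = this.servers[this.cursor % this.servers.length];
    if (host === undefined) {
      throw new Error("pool is empty");
    }
    this.cursor = (this.cursor + 1) % this.servers.length;
    return host;
  }
}

const pool = new AppServerPool();
pool.add("app-1");
pool.add("app-2");
const firstThree = [pool.pick(), pool.pick(), pool.pick()];
// firstThree is ["app-1", "app-2", "app-1"]
pool.add("app-3"); // new capacity, no data migration needed
```

Nothing comparable exists for the data tier: a new database node is useless until some of the data lives on it and every reader and writer knows where to find it.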
On the read side, you can scale by supporting replication. This requires writing software that can tolerate replication delay, i.e. the time it takes for data to move from the write DB to the read DB. We had 4 read servers for every 1 write server, and under high load the delay could be seconds. We chose to implement our I/O access with write-back caching, where the cache was powered by memcached and a cluster of high-RAM servers.
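For anyone unfamiliar with the term, the idea behind write-back caching can be sketched roughly as follows. This is a simplified TypeScript illustration, not our production code: plain in-memory `Map`s stand in for both memcached and the write DB, and `WriteBackCache` is a hypothetical name.

```typescript
// Sketch of write-back caching. Assumption: in-memory Maps stand in
// for memcached (the cache) and the write DB (the store).
class WriteBackCache {
  private cache = new Map<string, string>();
  private dirty = new Set<string>();
  private store: Map<string, string>;

  constructor(store: Map<string, string>) {
    this.store = store;
  }

  // Writes land in the cache only; the key is marked dirty for a later flush.
  set(key: string, value: string): void {
    this.cache.set(key, value);
    this.dirty.add(key);
  }

  // Reads prefer the cache, so callers see recent writes
  // even before they have reached the store.
  get(key: string): string | undefined {
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;
    const stored = this.store.get(key);
    if (stored !== undefined) this.cache.set(key, stored);
    return stored;
  }

  // Push all dirty keys to the store in one batch; returns how many were written.
  flush(): number {
    let written = 0;
    this.dirty.forEach((key) => {
      const value = this.cache.get(key);
      if (value !== undefined) {
        this.store.set(key, value);
        written++;
      }
    });
    this.dirty.clear();
    return written;
  }
}
```

Because reads hit the cache first, a user sees their own update immediately even though the database behind it has not caught up yet. The trade-off is that anything still dirty is lost if the cache goes away before `flush()` runs, so the software has to tolerate that too.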
On the write side, data sharding is necessary in order to sustain the number of concurrent writes. This means you can’t use table joins, and you lose atomicity across shards. Again, your software has to tolerate this.
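As a rough sketch of what that looks like in application code, the TypeScript below routes each user to a shard by hashing the user ID and then assembles a timeline by querying shard by shard instead of joining. Everything here is an assumption for illustration: the hash, the `Shard` type, and the function names are hypothetical, and in-memory `Map`s stand in for separate databases.

```typescript
// Assumption: each shard is an in-memory Map standing in for a separate DB.
type Shard = Map<string, string[]>; // userId -> that user's status updates

// Simple string hash (illustrative only) mapped onto the shard count.
function shardFor(userId: string, shardCount: number): number {
  let hash = 0;
  for (let i = 0; i < userId.length; i++) {
    hash = (hash * 31 + userId.charCodeAt(i)) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash % shardCount;
}

// Write path: each status update touches exactly one shard.
function postStatus(shards: Shard[], userId: string, text: string): void {
  const shard = shards[shardFor(userId, shards.length)];
  if (shard === undefined) throw new Error("no shards configured");
  const existing = shard.get(userId) ?? [];
  existing.push(text);
  shard.set(userId, existing);
}

// Read path: what would be a JOIN on a single database becomes one
// lookup per followed user, merged in application code.
function timelineFor(shards: Shard[], following: string[]): string[] {
  const timeline: string[] = [];
  for (const userId of following) {
    const shard = shards[shardFor(userId, shards.length)];
    const statuses = shard === undefined ? [] : (shard.get(userId) ?? []);
    for (const status of statuses) {
      timeline.push(status);
    }
  }
  return timeline;
}
```

Note that nothing in `timelineFor` is atomic: two followed users can live on different shards, so the result is only as consistent as each individual lookup. A plain modulo scheme like this also forces most keys to move when the shard count changes, which is part of why scaling the write side is expensive.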
Also, when dealing with a lot of binary data such as photos, a site at scale typically uses a CDN that is optimized for file-system I/O.
My point is that for typical social networking sites, an order of magnitude more time and money is spent on scaling I/O versus CPU.