— I built a simple app that pulls in data (50 items) from a Redis DB and throws it up at localhost. I did an ApacheBench (c = 100, n = 50000) and I’m getting a semi-decent 150 requests/sec on a dual-core T2080 @ 1.73GHz (my 6 y.o laptop), but the proc usage is very disappointing as shown:

Only one core is used, which is as per design in Node, but I think I can nearly double my requests/sec to ~300, maybe even more, if I can use Node.js clusters. I fiddled around quite a bit but I haven’t been able to figure out how to put the code given here for use with my app which is listed below:
var
express = require( 'express' ),
app = express.createServer(),
redis = require( 'redis' ).createClient();
app.configure( function() {
app.set( 'view options', { layout: false } );
app.set( 'view engine', 'jade' );
app.set( 'views', __dirname + '/views' );
app.use( express.bodyParser() );
} );
function log( what ) { console.log( what ); }
app.get( '/', function( req, res ) {
redis.lrange( 'items', 0, 50, function( err, items ) {
if( err ) { log( err ); } else {
res.render( 'index', { items: items } );
}
});
});
app.listen( 8080 );
I also want to emphasize that the app is I/O intensive (not CPU-intensive, which would’ve made something like threads-a-gogo a better choice than clusters).
Would love some help in figuring this out.
Actually, your workload is not really I/O bound: it is CPU bound due to the cost of jade-based dynamic page generation. I cannot guess the complexity of your jade template, but even with simple templates, generating HTML pages is expensive.
For my tests I used this template:
I added 100 dummy strings to the items key in Redis.
On my box, I get 475 req/s with node.js CPU at 100% (which means 50% CPU consumption on this dual core box). Let’s replace:
by:
Now, the result of the benchmark is close to 2700 req/s. So the bottleneck is clearly due to the formatting of the HTML page.
Using the cluster package in this situation is a good idea, and it is straightforward. The code can be modified as follows:
Now the result of the benchmark is close to 750 req/s with 100 % CPU consumption (to be compared with the initial 475 req/s).