As a hypothetical example, let’s say that I wanted to make an application that displays people’s Twitter networks. I would provide an API that would allow a client to query on a single username. That user’s top x tweets would be sent to the client. Then each person mentioned by the initial user would be scanned, and their top x tweets would be sent to the client. This process would continue recursively, breadth-first, until a predefined depth was reached. The client would receive the data in real time, displaying statistics such as the number of users scanned, the number of known users remaining to scan, and a growing list of the tweet data. None of the processing is complicated (regex over small amounts of text), but many, many network requests would be spawned from a single initial request.
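To make that concrete, here is a rough sketch of the crawl I have in mind. Everything in it is a placeholder of my own: `FAKE_TWEETS` and `fetchTopTweets` stand in for real Twitter requests, and `onUpdate` stands in for whatever pushes data to the client (for example a socket emit). It fetches one user at a time to keep the sketch short; a real version would presumably issue requests concurrently.

```javascript
// Hypothetical in-memory stand-in for the Twitter API: username -> top tweets.
const FAKE_TWEETS = {
  alice: ["hi @bob and @carol", "lunch with @bob"],
  bob: ["replying to @alice", "hello @dave"],
  carol: ["no mentions here"],
  dave: ["thanks @erin"],
  erin: ["hello world"],
};

// Hypothetical fetcher; a real one would make a network request.
async function fetchTopTweets(username, limit) {
  return (FAKE_TWEETS[username] || []).slice(0, limit);
}

// Pull @mentions out of a tweet with a small regex.
function extractMentions(text) {
  return [...text.matchAll(/@(\w+)/g)].map((m) => m[1].toLowerCase());
}

// Breadth-first crawl using a FIFO queue, up to maxDepth hops from the root.
// onUpdate is called once per scanned user with the running statistics.
async function crawl(root, maxDepth, tweetsPerUser, onUpdate) {
  const seen = new Set([root]);
  const queue = [{ user: root, depth: 0 }];
  let scanned = 0;
  while (queue.length > 0) {
    const { user, depth } = queue.shift();
    const tweets = await fetchTopTweets(user, tweetsPerUser);
    scanned++;
    // Only expand mentions while we are still above the depth limit.
    if (depth < maxDepth) {
      for (const name of tweets.flatMap(extractMentions)) {
        if (!seen.has(name)) {
          seen.add(name);
          queue.push({ user: name, depth: depth + 1 });
        }
      }
    }
    onUpdate({ user, depth, tweets, scanned, remaining: queue.length });
  }
  return scanned;
}

// Example: crawl two hops out from "alice", logging progress as it happens.
crawl("alice", 2, 5, (u) => console.log(u.user, u.scanned, u.remaining));
```

The `seen` set is what keeps the crawl from revisiting users who mention each other, and `queue.length` gives the "known users remaining to scan" figure directly.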
I really want the fantastic real-time capabilities of Node.js with socket.io, but I feel like this is an abuse of those technologies: they’re not meant for heavy server-side lifting. Is there a more appropriate toolset for what I am trying to accomplish, or a particular way to use these tools to that end? Milewise is doing something similar-ish, but I think that my application would consume significantly more network resources than theirs.
Thanks.
The best network transport you can get on the web right now is WebSockets, which offer a persistent, bidirectional, real-time connection between server and client. Not every browser supports them, but socket.io gives you a couple of fallback transports. These may, however, perform worse than WebSockets, as stated in this article:
Apart from the network transport, other things may also be important, for example how you fetch, format, and process the data on the server side. In Node.js, heavy CPU-bound computations may block the processing of other asynchronous operations, so these kinds of operations should be dispatched to separate threads or processes to prevent blocking.
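As a minimal sketch of that idea, assuming a Node.js version that ships the built-in `worker_threads` module, the CPU-bound part can be handed to a worker so the event loop stays free for network I/O. The function name `countMentionsInWorker` and the mention-counting task are illustrative stand-ins for whatever processing you actually need, not a recommendation of a specific design.

```javascript
const { Worker } = require("worker_threads");

// Source of the worker, kept inline so the sketch fits in a single file.
// It counts @mentions in each tweet it receives (a stand-in for heavier work).
const workerSource = `
  const { parentPort, workerData } = require("worker_threads");
  const counts = workerData.tweets.map(
    (text) => (text.match(/@\\w+/g) || []).length
  );
  parentPort.postMessage(counts);
`;

// Run the counting in a separate thread and resolve with its result.
function countMentionsInWorker(tweets) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true, // treat the first argument as code rather than a file path
      workerData: { tweets },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
    worker.once("exit", (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// Example: the main thread stays responsive while the worker does the counting.
countMentionsInWorker(["hi @bob and @carol", "no mentions"]).then((counts) =>
  console.log(counts)
);
```

For processing as light as a regex over a few tweets, starting a thread per request likely costs more than it saves; this pattern pays off only when the per-item work is heavy enough to stall the event loop, and a small pool of long-lived workers is usually the better fit then.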