I have a Node server that does the following:
I have a list of URLs on an external server, call it URLServer. When a user hits my Node server, it requests a list of, say, 20 URLs from URLServer. As soon as it has those URLs, I want it to get the title for each one, which means fetching each URL, building a DOM, and extracting the title. I also extract other data, so this is the way it has to be done. Once that is done, I want the URLs and their titles saved in memory and/or a database. So I have a URL-cache and a title-cache (I don’t want to fetch the URLs every time).
I have something like this:
if(URL-cache is empty) get URLS from URLServer and cache these URLs
I then want to check each of those URLs to see if their titles are in my cache, so I do:
for each URL
if title-cache[URL], good
else fetch site, create DOM, extract title + other data and cache
This works great for one user, but when I put the server under heavy load, it hangs. I have concluded that it hangs for the following reason:
User 1 Request – Empty Caches – Fetch URLs and when done fetch Content for URLs
User 2 Request – The caches still look empty to this user because User 1's request has not completed yet! So User 2 triggers another fetch of the URLs and their content.
User 3 Request – User 1 and User 2 requests are not yet completed so the same issue…
So, assuming I have 10 URLs to fetch: instead of opening 10 connections (one per URL) and then caching the data, 20 users hitting the server at the same time will open 200 connections (each user opens 10).
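To make that arithmetic concrete, here is a minimal sketch of the check-then-fetch pattern described above. The names `createNaiveLookup`, `titleCache`, and `fetchTitle` are hypothetical, and `fetchTitle(url, callback)` stands in for whatever fetches the page and extracts the title. The point is only that the cache is written when a fetch finishes, so every request arriving before then misses the cache and starts its own fetch.

```javascript
// Naive check-then-fetch lookup (hypothetical names, for illustration only).
function createNaiveLookup(fetchTitle) {
  const titleCache = new Map(); // url -> title

  return function getTitle(url, callback) {
    if (titleCache.has(url)) {
      callback(null, titleCache.get(url));
      return;
    }
    // The cache is filled only when the fetch completes, so any request
    // that arrives while it is still in flight also ends up here.
    fetchTitle(url, (err, title) => {
      if (!err) titleCache.set(url, title);
      callback(err, title);
    });
  };
}
```

With 10 URLs and 20 simultaneous users, none of the 200 lookups finds anything cached, so `fetchTitle` is called 200 times.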
How can I block User X (where X>1) from causing these events? I basically want the server to close a gate and make every user wait until it has populated the caches, then open the gate once they are populated. Is there any way to do this?
This can be done using Node's EventEmitter class.
You set up an EventEmitter
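A minimal sketch of that setup, assuming one shared emitter for all in-flight lookups (the name `searchEvents` is hypothetical):

```javascript
const { EventEmitter } = require('events');

// Hypothetical name: a single emitter shared by every request.
const searchEvents = new EventEmitter();

// Each waiting request attaches a listener. By default Node prints a
// warning once more than 10 listeners are attached to a single event;
// 0 means "no limit", so many requests can wait on the same URL.
searchEvents.setMaxListeners(0);
```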
Then you handle your incoming requests
And wrap your search function to emit events
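The two steps above might look roughly like the following. This is a sketch under stated assumptions, not a definitive implementation: `createTitleLookup`, `titleCache`, `pending`, and `search` are hypothetical names, and `fetchTitle(url, callback)` stands in for the code that fetches a page, builds the DOM, and extracts the title.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: builds a lookup function around a fetchTitle(url, cb).
function createTitleLookup(fetchTitle) {
  const events = new EventEmitter();
  events.setMaxListeners(0);     // many requests may wait on one URL
  const titleCache = new Map();  // url -> title
  const pending = new Set();     // URLs whose fetch is already in flight

  // Wrapped search: starts at most one fetch per URL, then emits the result.
  function search(url) {
    if (pending.has(url)) return; // another request already started it
    pending.add(url);
    fetchTitle(url, (err, title) => {
      pending.delete(url);
      if (!err) titleCache.set(url, title);
      // Prefix the event name so a URL can never collide with 'error'.
      events.emit('title:' + url, err, title);
    });
  }

  // Request side: answer from the cache, or wait for the event.
  return function getTitle(url, callback) {
    if (titleCache.has(url)) {
      callback(null, titleCache.get(url));
      return;
    }
    // Register the listener first, in case fetchTitle calls back synchronously.
    events.once('title:' + url, callback);
    search(url);
  };
}
```

A request handler would call `getTitle(url, callback)` for each URL it needs and send the response once every callback has fired. Failed fetches are not cached in this sketch, so a later request retries them.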
This method answers a request either immediately, if the results are in the cache, or makes it wait for the search to finish and then returns the results.
Since we keep a record of which searches have been started, we won’t start searching for the same URL more than once, and every request gets its response as soon as the results become available.