I’m building a poll widget using ASP.NET controls and Linq-to-Sql for a high-traffic site. The widget is already built, but it does not use caching yet.
This poll can work in a multi-poll mode, which means that on each page load the control hits the database to find any polls that the current user has not taken. There are also several database hits on the postback: a check to make sure the user has not taken the poll, a hit to write the result to the database, and a final series of hits to tally the results.
Update: I’ve reworded these questions:
- Would it be appropriate for a control such as a poll to hit the database on every page hit? How would this scale up to, say, 20,000 users? Assume a setup with 2 servers behind a load balancer, each with a modern multi-core CPU and 2 GB of RAM.
- What type of caching would you look to employ for this scenario? Take into account that any number of people could take the poll over any interval of time, and that the total number of people who have taken the poll is needed to compute the results. More problematically, on every load the code must hit the database to find the polls that the user hasn’t taken.
I have some ideas but would like to get some additional expert feedback. Thanks.
Update:
So, let us go over a scenario for caching. One could cache the Polls (the questions) but would probably still need to hit the database for the PollsTaken (the users’ responses). One possibility would be to create a shadow, writing each response both to in-memory storage and to the database.
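To make the shadow idea concrete, here is a minimal sketch in Python (used only as language-neutral pseudocode, not ASP.NET code). The class name `ShadowedPollsTaken` and the dict standing in for the PollsTaken table are assumptions for illustration, not part of the actual widget.

```python
class ShadowedPollsTaken:
    """Write-through shadow: every write goes to the database and to memory."""

    def __init__(self, database):
        # Hypothetical stand-in for the PollsTaken table: user id -> set of poll ids.
        self.database = database
        # In-memory shadow, filled lazily per user.
        self.memory = {}

    def record(self, user_id, poll_id):
        # Write to the database first so the shadow never holds data the database lacks.
        self.database.setdefault(user_id, set()).add(poll_id)
        # Only update the shadow if this user is already loaded into memory.
        if user_id in self.memory:
            self.memory[user_id].add(poll_id)

    def taken_by(self, user_id):
        # First read for a user loads from the database; later reads stay in memory.
        if user_id not in self.memory:
            self.memory[user_id] = set(self.database.get(user_id, set()))
        return self.memory[user_id]
```

With this shape, the per-page-load lookup of polls taken is served from memory after the first read, while the database remains the source of truth. It only works as-is on a single server, since each process would hold its own shadow.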
One could use a refresh scheme to dump the cache when a user submits a successful poll (when it changes). A cookie could be used to prevent multiple-takes, although it would be susceptible to gaming.
I would like to see more detail on the scheme offered: for example, how you would use output caching, caching of the linq-to-sql results, etc., not just generalities.
I would cache the following:
On page load, get the list of all poll IDs, filter out the IDs of the polls this user has completed, and then request the remaining polls from the cache by ID.
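The page-load step could look roughly like the sketch below. It is Python used as pseudocode, with a plain dict standing in for the cache; the key names (`polls:ids`, `user:<id>:taken`, `poll:<id>`) and the function `polls_to_show` are made up for illustration.

```python
def polls_to_show(cache, user_id):
    """Return the cached polls that this user has not completed yet."""
    # List of all poll ids (hypothetical key name).
    all_ids = cache["polls:ids"]
    # Ids of the polls this user has completed; empty if the user has taken none.
    taken = cache.get(f"user:{user_id}:taken", set())
    # Fetch each remaining poll from the cache by its id.
    return [cache[f"poll:{poll_id}"] for poll_id in all_ids if poll_id not in taken]


# Example data: three polls, and user 42 has already completed poll 2.
cache = {
    "polls:ids": [1, 2, 3],
    "poll:1": "Favourite editor?",
    "poll:2": "Tabs or spaces?",
    "poll:3": "Preferred database?",
    "user:42:taken": {2},
}
remaining = polls_to_show(cache, 42)
```

In a real application each cache read would need a fallback to the database when the key is missing; that part is left out here to keep the filtering logic visible.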
On postback I would submit the vote to the database and expire both the cache key for the poll and the key for the user. On the next request the keys are missing from the cache, so they are recreated from the database with the updated results. You can also update the results in the cache directly on the postback if you want to, but the normal solution is to just expire the keys.
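Here is a minimal sketch of that expire-on-postback flow, again in Python as pseudocode. The class `PollCache`, its dict-based "database" and the key names are assumptions for illustration; the `db_reads` counter is only there to show when the database is actually hit.

```python
class PollCache:
    """Cache-aside reads, with keys expired whenever a vote is submitted."""

    def __init__(self, db_results, db_taken):
        self.db_results = db_results  # stand-in table: poll id -> {answer: count}
        self.db_taken = db_taken      # stand-in table: user id -> set of poll ids
        self.cache = {}
        self.db_reads = 0             # counts database reads, for demonstration only

    def results(self, poll_id):
        key = f"poll:{poll_id}:results"
        if key not in self.cache:
            # Cache miss: rebuild the entry from the database.
            self.db_reads += 1
            self.cache[key] = dict(self.db_results[poll_id])
        return self.cache[key]

    def submit_vote(self, user_id, poll_id, answer):
        # 1. Write the vote to the database.
        tally = self.db_results[poll_id]
        tally[answer] = tally.get(answer, 0) + 1
        self.db_taken.setdefault(user_id, set()).add(poll_id)
        # 2. Expire the poll key and the user key; the next read recreates them.
        self.cache.pop(f"poll:{poll_id}:results", None)
        self.cache.pop(f"user:{user_id}:taken", None)
```

Repeated calls to `results` between votes cost one database read in total; each vote costs one write plus one read on the next request.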
The main problem with this is that you are using two web servers, so you cannot just cache these items in each server's memory. A poll updated on one web server would not be expired on the other unless you employ some form of communication between the servers to synchronize their caches.
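The stale-cache problem is easy to reproduce in a small simulation. This is a Python sketch with made-up names (`WebServer`, `database`), where each server object keeps its own private in-memory cache:

```python
# Shared stand-in database: cache key -> {answer: count}.
database = {"poll:1": {"yes": 2}}


class WebServer:
    """A web server with its own private in-memory cache."""

    def __init__(self):
        self.local_cache = {}

    def results(self, key):
        # Cache-aside read against this server's private cache.
        if key not in self.local_cache:
            self.local_cache[key] = dict(database[key])
        return self.local_cache[key]

    def vote(self, key, answer):
        # Update the database, then expire the key on THIS server only.
        database[key][answer] = database[key].get(answer, 0) + 1
        self.local_cache.pop(key, None)


server_a, server_b = WebServer(), WebServer()
server_a.results("poll:1")          # both servers cache the current tally
server_b.results("poll:1")
server_a.vote("poll:1", "yes")      # the load balancer sends the vote to server A

fresh = server_a.results("poll:1")  # A expired its key, so it re-reads the database
stale = server_b.results("poll:1")  # B never heard about the vote
```

After the vote, `fresh` holds the new tally while `stale` still holds the old one, which is exactly the inconsistency described above.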
I would recommend using an external cache; in cases like this I use memcached myself. If you install a memcached server on each of the web server hosts and configure the application on both hosts to use both memcached instances, the client maps each key to one of the instances, so both web servers read, write and expire the same entry and the cache stays synchronized.
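The reason this stays in sync is that the client, not the server, decides which memcached instance owns a key. The sketch below simulates that in Python with a simple hash-modulo rule; the host names, the `CacheClient` class and the dicts standing in for the memcached instances are all assumptions, and real clients typically use a more elaborate consistent-hashing scheme.

```python
import hashlib

# Hypothetical memcached instances, one per web server host.
MEMCACHED_NODES = ["web1:11211", "web2:11211"]
# Stand-in for the memory of each memcached instance.
node_storage = {node: {} for node in MEMCACHED_NODES}


def node_for(key):
    """Pick the instance that owns a key; identical on every web server."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(MEMCACHED_NODES)
    return MEMCACHED_NODES[index]


class CacheClient:
    """Client configured with the full node list, as on each web server."""

    def set(self, key, value):
        node_storage[node_for(key)][key] = value

    def get(self, key):
        return node_storage[node_for(key)].get(key)

    def delete(self, key):
        node_storage[node_for(key)].pop(key, None)


client_on_web1 = CacheClient()
client_on_web2 = CacheClient()

client_on_web1.set("poll:1:results", {"yes": 3})
seen_by_web2 = client_on_web2.get("poll:1:results")   # same node, same entry
client_on_web2.delete("poll:1:results")               # expire from the other host
after_delete = client_on_web1.get("poll:1:results")   # gone for web1 as well
```

Because both clients share the same node list and hashing rule, an expiry issued from either web server removes the one and only copy of the entry.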
For C# you can use the Enyim Memcached Client (http://memcached.enyim.com/) to connect to the server, and the Northscale Memcached Server (http://www.northscale.com/products/memcached.html) for the servers.
Both the Enyim and Northscale tools are free (open source) and both are very stable and very usable in production. And, no, I’m not employed by either company 🙂