I’m doing a research project on a popular dating site, OkCupid, and I would like to discuss how its database is used.
After reading this explanation from the co-founder of the site, I became very confused.
These are the statements that confused me:
when a user performs a match search on OkCupid, we have to do the following:
- Retrieve (from somewhere not the DB) their question answers, their ideal match’s answers, and their question importances. On average, each user on OkCupid has 250 questions answered in 3 parts.
- Figure out who qualifies for their search, typically a very complicated query across a few million users. On average, tens of thousands of people qualify, and we need to figure out who they are without hitting the DB.
How do they do all this without consulting a database?
Here is a link to the post
I would appreciate any explanation of how they do things there.
Frequently accessed information like that may be backed by a database but also stored in a distributed memory cache (e.g. Memcached).
When a user changes an answer or answers a new question, the application would update both the cache and the database, so reads can be served from the cache and the database rarely, if ever, has to be queried.
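To make the idea concrete, here is a minimal sketch of that write-through pattern. It is only an illustration of the guess above, not how OkCupid actually does it: the `AnswerStore` class, the `answers:<user_id>` key format, and the table layout are all made up, a plain dict stands in for Memcached, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3


class AnswerStore:
    """Write-through store: every write goes to both the database and the cache."""

    def __init__(self):
        # In-memory SQLite stands in for the real database (assumption).
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE answers (user_id INTEGER, question_id INTEGER, "
            "answer INTEGER, PRIMARY KEY (user_id, question_id))"
        )
        # A plain dict stands in for a distributed cache such as Memcached.
        self.cache = {}
        self.db_reads = 0  # counts how often a read falls through to the DB

    def save_answer(self, user_id, question_id, answer):
        # Write the durable copy first so the database is never behind the cache.
        self.db.execute(
            "INSERT OR REPLACE INTO answers VALUES (?, ?, ?)",
            (user_id, question_id, answer),
        )
        self.db.commit()
        # Then update the cached copy under a per-user key.
        self.cache.setdefault(f"answers:{user_id}", {})[question_id] = answer

    def get_answers(self, user_id):
        key = f"answers:{user_id}"
        if key not in self.cache:
            # Cache miss (e.g. after a restart): reload from the database once.
            self.db_reads += 1
            rows = self.db.execute(
                "SELECT question_id, answer FROM answers WHERE user_id = ?",
                (user_id,),
            ).fetchall()
            self.cache[key] = dict(rows)
        return self.cache[key]
```

As long as the cache stays warm, `get_answers` never touches the database; the only reads that reach it are the ones that repopulate the cache after a miss.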
For a match search, they could look up each user’s answers by cache key, fetch the answers of many candidate users asynchronously, and compare them in memory.
Just a guess though.
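Here is a rough sketch of what that lookup-and-compare step might look like. Everything in it is an assumption for illustration: the `CACHE` dict stands in for the distributed cache, the key format and function names are invented, and the `agreement` score is a toy measure, not OkCupid's real match formula.

```python
from concurrent.futures import ThreadPoolExecutor

# A plain dict stands in for the distributed cache (assumption).
CACHE = {
    "answers:1": {10: 2, 11: 0, 12: 1},
    "answers:2": {10: 2, 11: 1, 12: 1},
    "answers:3": {10: 0, 11: 1},
}


def fetch_answers(user_id):
    # Look up one user's answers by cache key; empty dict if nothing is cached.
    return CACHE.get(f"answers:{user_id}", {})


def agreement(mine, theirs):
    # Fraction of shared questions answered identically (toy score only).
    shared = mine.keys() & theirs.keys()
    if not shared:
        return 0.0
    return sum(mine[q] == theirs[q] for q in shared) / len(shared)


def rank_candidates(user_id, candidate_ids):
    mine = fetch_answers(user_id)
    # Fetch all candidates concurrently; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        fetched = list(pool.map(fetch_answers, candidate_ids))
    scores = [
        (cid, agreement(mine, theirs))
        for cid, theirs in zip(candidate_ids, fetched)
    ]
    # Best matches first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_candidates(1, [2, 3])` scores user 1 against users 2 and 3 using only cached data, with no database query involved.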