I’ve long been perplexed by the speed of Stack Overflow and how quickly the questions and comments load on the page. It seems like the backend db that stores all of this info would be humongous. How is it possible for a question and all of its associated answers to get loaded so quickly?
I’ve never worked in a large-scale db environment before (my background is small-business databases like Access and some MySQL), but I’d imagine the backend db for Stack Overflow (simplified) is something like two tables linked by an indexed key, right? Something akin to:
Question Table:
Question_PrimaryKey | QuestionText
Answer Table:
Answer_PrimaryKey | Question_ForeignKey | AnswerText
(linked at Question_PrimaryKey & Question_ForeignKey).
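For illustration, here is a minimal sketch of that two-table layout using Python's built-in sqlite3 module. SQLite is only a convenient stand-in here, the index name idx_answer_question and the helper fetch_answers are hypothetical, and the sample rows are made up. It is not meant to show how the real site is built, only how an indexed foreign key lets one question's answers be looked up directly.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Question (
        Question_PrimaryKey INTEGER PRIMARY KEY,
        QuestionText        TEXT NOT NULL
    );
    CREATE TABLE Answer (
        Answer_PrimaryKey   INTEGER PRIMARY KEY,
        Question_ForeignKey INTEGER NOT NULL
            REFERENCES Question (Question_PrimaryKey),
        AnswerText          TEXT NOT NULL
    );
    -- Hypothetical index name. The index lets the engine locate the
    -- answers for one question without scanning the whole Answer table.
    CREATE INDEX idx_answer_question ON Answer (Question_ForeignKey);
""")

# Made-up sample data.
conn.execute("INSERT INTO Question VALUES (1, 'How is the site so fast?')")
conn.executemany(
    "INSERT INTO Answer (Question_ForeignKey, AnswerText) VALUES (?, ?)",
    [(1, "Indexes."), (1, "Caching.")],
)


def fetch_answers(question_id):
    """Return the answer texts for one question, oldest first."""
    rows = conn.execute(
        "SELECT AnswerText FROM Answer "
        "WHERE Question_ForeignKey = ? "
        "ORDER BY Answer_PrimaryKey",
        (question_id,),
    ).fetchall()
    return [row[0] for row in rows]


answers = fetch_answers(1)
```

The lookup is keyed on the indexed column, so the cost depends mostly on how many answers the question has rather than on how many answers exist in total.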
Am I way off in thinking this is how a site like Stack Overflow is set up? If not, how on earth are the answers to these questions fetched so quickly and put through to the browser? (It blows my mind, because when I build small intranet sites that use Access as a backend, the performance really starts to deteriorate when the db grows.)
Any input would be greatly appreciated. Thanks for your time!
In an extremely simplified way, yes, the backend will be like that, though the real schema will be more complex, involving more tables and relationships.
Stack Overflow uses SQL Server 2008 which, while similar in many respects to Access, is on a whole new level in terms of sophistication and performance. As far as databases go, they don’t get much worse than Access (somebody will probably correct me on that).
The performance is very good, but that will be the result of a lot of performance tuning: carefully optimised queries, indexes, schema design, partitioning and so on, along with a lot of caching.
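To give a rough idea of what caching buys you, here is a minimal Python sketch of a time-limited in-memory cache sitting in front of a database call. Everything in it is an assumption made for illustration: the function names, the 60-second lifetime and the fake load_question_page_from_db stand-in are hypothetical and say nothing about how Stack Overflow actually implements its caching.

```python
import time

CACHE_TTL_SECONDS = 60.0   # assumed lifetime of a cached page
_cache = {}                # question_id -> (time stored, page data)
stats = {"db_calls": 0}    # counts how often the "database" is hit


def load_question_page_from_db(question_id):
    """Hypothetical stand-in for the expensive database query."""
    stats["db_calls"] += 1
    return {"id": question_id, "answers": ["placeholder answer"]}


def get_question_page(question_id, now=None):
    """Return the page from the cache if still fresh, else reload it."""
    if now is None:
        now = time.monotonic()
    entry = _cache.get(question_id)
    if entry is not None and now - entry[0] < CACHE_TTL_SECONDS:
        return entry[1]                      # cache hit: no database work
    page = load_question_page_from_db(question_id)
    _cache[question_id] = (now, page)        # cache miss: store for later
    return page
```

Calling get_question_page(1) repeatedly within the lifetime touches the database only once; popular questions are therefore mostly served from memory, and the database only sees the first request after each expiry.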