I’m building a system that may need to support 500+ concurrent users, each making dozens of queries (selects, inserts and updates) per minute. Given these requirements and tables with many millions of rows, I suspect that database replication will be needed in the future to reduce some of the query load.
Having not used replication in the past, I am wondering if there is anything I need to consider in the schema design?
For instance, I was once told that it is necessary to use GUIDs for primary keys to enable replication. Is this true?
What special considerations or best practices for database design are there for a database that will be replicated?
Due to time constraints on the project I don’t want to waste any time by implementing replication when it may not be needed. (I have enough definite problems to overcome at the moment without worrying about having to solve possible ones.) However, I don’t want to have to make potentially avoidable schema changes when/if replication is required in the future.
Any other advice on this subject, including good places to learn about implementing replication, would also be appreciated.
While every row must have a rowguid column, you are not required to use a GUID for your primary key. In reality, you aren’t even required to have a primary key (though you will be stoned to death for failing to create one). Even if you define your primary key as a GUID, not making it the rowguid column will result in Replication Services creating an additional column for you. You definitely can do this, and it’s not a bad idea, but it is by no means necessary nor particularly advantageous.
Here are some tips: