I have a project which generates snapshots of a database, converts it to XML and then stores the XML inside a separate database. Unfortunately, these snapshots are becoming huge files, and are now about 10 megabytes each. Fortunately, I only have to store them for about a month before they can be discarded again but still, a month of snapshots turn out to become real bad for it’s performance…
I think there is a way to improve performance a lot. No, not by storing the XML in a separate folder somewhere, because I don’t have write access to any location on that server. The XML must stay within the database. But somehow, the field [Content] might be optimized somehow so things will speed up…
I won’t need any full-text search options on this field. I will never do any searching based on this field. So perhaps by disabling this field for search instructions or whatever?
The table has no references to other tables, but the structure is fixed. I cannot rename things, or change the field types. So I wonder if optimizations is still possible.
Well, is it?
The structure, as generated by SQL Server:
CREATE TABLE [dbo].[Snapshots](
[Identity] [int] IDENTITY(1,1) NOT NULL,
[Header] [varchar](64) NOT NULL,
[Machine] [varchar](64) NOT NULL,
[User] [varchar](64) NOT NULL,
[Timestamp] [datetime] NOT NULL,
[Comment] [text] NOT NULL,
[Content] [text] NOT NULL,
CONSTRAINT [PK_SnapshotLog]
PRIMARY KEY CLUSTERED ([Identity] ASC)
WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON,
FILLFACTOR = 90) ON [PRIMARY],
CONSTRAINT [IX_SnapshotLog_Header]
UNIQUE NONCLUSTERED ([Header] ASC)
WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON,
FILLFACTOR = 90)
ON [PRIMARY],
CONSTRAINT [IX_SnapshotLog_Timestamp]
UNIQUE NONCLUSTERED ([Timestamp] ASC)
WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON,
FILLFACTOR = 90)
ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Performance isn’t just slow when selecting data from this table but also when selecting or inserting data in one of the other tables in this database! When I delete all records from this table, the whole system is fast. When I start adding snapshots, performance starts to decrease. After about 30 snapshots, performance becomes bad and the risk of connection timeouts increase.
Maybe the problem isn’t in the database itself, although it’s still slow when used through the management tool. (Fast when Snapshots is empty.) I mainly use ASP.NET 3.5 and the Entity Framework to connect to this database and then read the multiple tables. Maybe some performance can be gained here, although that wouldn’t explain why the database is also slow from the management tools and when used through other applications with a direct connection…
The whole system became a lot faster when I replaced the TEXT datatype with the NVARCHAR(MAX) datatype. HLGEM pointed out to me that the TEXT datatype is outdated, thus troublesome. It’s still a question if the datatype of these columns could be replaced this easy with the more modern datatype, though. (Translated: I need to test if the code will work with the altered datatype…)
So, if i would alter the datatype from TEXT to NVARCHAR(MAX), is there anything that would break because of this? Problems that I can expect?
Right now, this seems to solve the problem but I need to do some lobbying before I’m allowed to make this change. So I need to be real sure it won’t cause any (unexpected) problems.