I’m working on a search stored procedure for our existing forums.
I’ve written the following code which uses standard SQL full text indexes, however I’m sure there is a better way of doing it and would like a point in the right direction.
To give some info on how it needs to work, The page has 1 search text box which when clicked will search thread titles, thread descriptions and post text and should return the results with the title matches first, then descriptions then post data.
Below is what I’ve written so far which works but is not elegant or as fast as I would like. To give an example of performance with 20K threads and 80K posts it takes about 12 seconds to search for 5 random words.
ALTER PROCEDURE [dbo].[SearchForums] ( --Input Params @SearchText VARCHAR(200), @GroupId INT = -1, @ClientId INT, --Paging Params @CurrentPage INT, @PageSize INT, @OutTotalRecCount INT OUTPUT ) AS --Create Temp Table to Store Query Data CREATE TABLE #SearchResults ( Relevance INT IDENTITY, ThreadID INT, PostID INT, [Description] VARCHAR(2000), Author BIGINT ) --Create and populate table of all GroupID's This search will return from CREATE TABLE #GroupsToSearch ( GroupId INT ) IF @GroupId = -1 BEGIN INSERT INTO #GroupsToSearch SELECT GroupID FROM SNetwork_Groups WHERE ClientId = @ClientId END ELSE BEGIN INSERT INTO #GroupsToSearch VALUES(@GroupId) END --Get Thread Titles INSERT INTO #SearchResults SELECT SNetwork_Threads.[ThreadId], (SELECT NULL) AS PostId, SNetwork_Threads.[Description], SNetwork_Threads.[OwnerUserId] FROM SNetwork_Threads INNER JOIN SNetwork_Groups ON SNetwork_Groups.GroupId = SNetwork_Threads.GroupId WHERE FREETEXT(SNetwork_Threads.[Description], @SearchText) AND Snetwork_Threads.GroupID IN (SELECT GroupID FROM #GroupsToSearch) AND SNetwork_Groups.ClientId = @ClientId --Get Thread Descriptions INSERT INTO #SearchResults SELECT SNetwork_Threads.[ThreadId], (SELECT NULL) AS PostId, SNetwork_Threads.[Description], SNetwork_Threads.[OwnerUserId] FROM SNetwork_Threads INNER JOIN SNetwork_Groups ON SNetwork_Groups.GroupId = SNetwork_Threads.GroupId WHERE FREETEXT(SNetwork_Threads.[Name], @SearchText) AND Snetwork_Threads.GroupID IN (SELECT GroupID FROM #GroupsToSearch) AND SNetwork_Groups.ClientId = @ClientId --Get Posts INSERT INTO #SearchResults SELECT SNetwork_Threads.ThreadId, SNetwork_Posts.PostId, SNetwork_Posts.PostText, SNetwork_Posts.[OwnerUserId] FROM SNetwork_Posts INNER JOIN SNetwork_Threads ON SNetwork_Threads.ThreadId = SNetwork_Posts.ThreadId INNER JOIN SNetwork_Groups ON SNetwork_Groups.GroupId = SNetwork_Threads.GroupId WHERE FREETEXT(SNetwork_Posts.PostText, @SearchText) AND Snetwork_Threads.GroupID IN (SELECT GroupID FROM #GroupsToSearch) AND SNetwork_Groups.ClientId = @ClientId --Return Paged Result Sets SELECT @OutTotalRecCount = COUNT(*) FROM #SearchResults SELECT #SearchResults.[ThreadID], #SearchResults.[PostID], #SearchResults.[Description], #SearchResults.[Author] FROM #SearchResults WHERE #SearchResults.[Relevance] >= (@CurrentPage - 1) * @PageSize + 1 AND #SearchResults.[Relevance] <= @CurrentPage*@PageSize ORDER BY Relevance ASC --Clean Up DROP TABLE #SearchResults DROP TABLE #GroupsToSearch
I know its a bit long winded but just a nudge in the right direction would be well appreciated.
Incase it helps 80% of the query time is taken up when search posts and according to teh query plan is spent on ‘Clustered Index Scan’ on the posts table. I cant see anyway around this.
Thanks
Gavin
I’d really have to see an explain plan to know where the slow parts were, as I don’t see anything particularly nasty in your code. Very first thing – make sure all your indexes are in good shape, they are being used, statistics are up to date, etc.
One other idea would be to do the search on thread title first, then use the results from that to prune the searches on thread description and post text. Similarly, use the results from the thread description search to prune the post text search.
The basic idea here is that if you find the keywords in the thread title, why bother searching the description and posts? I realize this may not work depending on how you are presenting the search results to the user, and it may not make a huge difference, but it’s something to think about.