We have a large document of text (stored in an MSSQL database) and we need to provide dictionary-like lookups for words when they are hovered over.
For instance, given the following sentence:
“The quick dog jumped over the brown fox”, our users could create a “definition” for any of those words or phrases, such as “quick”, “dog”, or “jumped over”. We need to highlight such text and, on mouseover, show the text that has been set in the definition.
Currently we have an implementation that does the job, but it suffers from incredibly bad performance. It uses regex to parse the text and, whenever a word matches a definition, inserts a snippet of Javascript right after that word. Since we can have 400 or more definitions and the text can be several paragraphs long or more, this hangs the entire server and makes the app non-responsive.
I have tried to optimise the code by fiddling with compiled regex, but it doesn’t help much; the request still times out before returning anything.
I’m curious as to what other options I have to achieve this.
I have considered:
- Writing a service that sits in the background, polls the definitions, and updates the text at idle times
- Some form of caching; however, this isn’t really going to fix the root cause of the problem, and since the site won’t load at all, the page probably won’t get cached
- Implementing the regex client side. I think the page would load then, but I doubt doing this client side would be any better than doing it server side; it may even lock up the browser
The app is an ASP.NET website (.NET 3.5 currently, moving to 4 soonish), using SQL Server 2005/2008 (depending on client site) and NHibernate.
Just throwing out ideas:
Possible algorithm:
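- Split the text into text[]
- Load the definitions into words[]
- Use a System.Collections.Generic.HashSet since it has a really fast lookup
- Loop through text[] and tag each word with a <span class='known'> if it is in words[]
To handle compound words:
- Check that the component words in words[] exist before doing a regex search for the compound word.
AJAX mouseover event:
- Hook the event to the <span class='known'> elements.
- Compound words may produce nested tags such as <span class='known'><span class='known'>house</span> plan</span>, which is fine. Your jQuery will send the outermost span tag to the server and the server can return all the words that are matched.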
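For what it's worth, here is a rough C# sketch of the HashSet idea above, not a drop-in implementation. The names DefinitionTagger and Tag are made up for illustration, and the compound-word step reflects one reading of the suggestion: only run a regex for a phrase when every one of its component words occurs in the text. It assumes the text is plain (no existing markup, no HTML encoding needed) and sticks to features available in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class DefinitionTagger
{
    // Private-use characters act as temporary markers, so later passes
    // can never match inside markup inserted by an earlier pass.
    private const char OpenMark = '\uE000';
    private const char CloseMark = '\uE001';

    public static string Tag(string text, IEnumerable<string> definitions)
    {
        StringComparer cmp = StringComparer.OrdinalIgnoreCase;
        var singles = new HashSet<string>(cmp);   // single-word definitions
        var compounds = new List<string>();       // multi-word definitions

        foreach (string d in definitions)
        {
            string term = d.Trim();
            if (term.Length == 0) continue;
            if (term.Any(c => char.IsWhiteSpace(c))) compounds.Add(term);
            else singles.Add(term);
        }

        // Every distinct word in the text, for the cheap compound pre-check.
        var textWords = new HashSet<string>(
            Regex.Matches(text, @"\w+").Cast<Match>().Select(m => m.Value), cmp);

        foreach (string phrase in compounds)
        {
            // Skip the regex entirely unless all component words are present.
            bool allPresent = Regex.Matches(phrase, @"\w+").Cast<Match>()
                                   .All(m => textWords.Contains(m.Value));
            if (!allPresent) continue;

            string pattern = @"\b" + Regex.Escape(phrase) + @"\b";
            text = Regex.Replace(text, pattern, OpenMark + "$0" + CloseMark,
                                 RegexOptions.IgnoreCase);
        }

        // One pass over the words; each lookup is a hash probe, not a regex.
        text = Regex.Replace(text, @"\w+", m =>
            singles.Contains(m.Value) ? OpenMark + m.Value + CloseMark : m.Value);

        // Swap the temporary markers for the real tags.
        return text.Replace(OpenMark.ToString(), "<span class='known'>")
                   .Replace(CloseMark.ToString(), "</span>");
    }
}

public static class Program
{
    public static void Main()
    {
        var defs = new[] { "quick", "dog", "jumped over", "house", "house plan" };

        Console.WriteLine(DefinitionTagger.Tag(
            "The quick dog jumped over the brown fox", defs));
        // The <span class='known'>quick</span> <span class='known'>dog</span> <span class='known'>jumped over</span> the brown fox

        Console.WriteLine(DefinitionTagger.Tag("A house plan", defs));
        // A <span class='known'><span class='known'>house</span> plan</span>
    }
}
```

With only single-word definitions this is a single pass over the text no matter how many definitions exist; a regex runs only for the compound phrases that survive the pre-check. Overlapping compound phrases are not handled here: whichever phrase is processed first wins.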