I’m going to store some articles (with XHTML formatting) in an MS SQL Server database. They will be displayed on an ASP.NET page, and I’d like to provide a search feature.
I have a few questions:
- Which column type would be best for storing the text?
- How should I search within this field (is LIKE good enough for long text)?
- How can I perform the search without matching text inside the formatting tags? For instance, when a user searches for “spa”, it shouldn’t match span elements (which a simple LIKE would do).
To insert formatted data I will use an ASP.NET control, but I haven’t chosen one yet; their output is usually XHTML. Maybe you can also recommend a “package” of such a control and a DB table structure?
Thanks in advance.
1) If you’re going to be storing text of arbitrary length, I would use NTEXT. Be aware, though, that NTEXT has been deprecated since SQL Server 2005 in favor of NVARCHAR(MAX), so NVARCHAR(MAX) is the safer choice for new work.
With NVARCHAR(MAX) there are a number of pros and cons; a big one is performance, because where the data is stored depends on its size (if it’s less than 8000 bytes it is kept in the table row; if it’s more, it goes to LOB storage).
2) You can use LIKE with NTEXT, but NTEXT also gives you the ability to use full-text indexing (which works on NVARCHAR(MAX) columns as well).
3) If you use full-text indexing, you can avoid matching HTML markup; here’s another SO answer with the details:
How to ignore html tags in Sql Server 2008 Full Text Search
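
If full-text indexing isn’t an option, one alternative for point 3 is to strip the markup in application code and search only the visible text. Below is a minimal sketch in Python; it is not the full-text approach from the linked answer. The names VisibleText and visible_text and the sample article are made up for illustration, and the same idea should carry over to whatever you use on the ASP.NET side.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects only the text nodes of a document and ignores every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; tag names and attributes never get here.
        self.parts.append(data)


def visible_text(xhtml):
    """Return the text content of an XHTML fragment with whitespace collapsed.

    Text nodes are joined without a separator so that inline tags inside a
    word do not split it; adjacent block elements may therefore run together.
    """
    parser = VisibleText()
    parser.feed(xhtml)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Hypothetical article body: "spa" occurs only inside the <span> tag name.
article = '<p>Relax in our <span class="hl">hotel</span> lobby.</p>'

naive_hit = "spa" in article.lower()                # search over raw markup
clean_hit = "spa" in visible_text(article).lower()  # search over visible text
```

Here a plain substring search on the raw markup matches “spa” because of the span tag, while the same search on the extracted text does not. Keeping the extracted text in its own column would let LIKE or a full-text index run against clean text, at the cost of storing each article twice.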