I have a client testing the full-text search (example below) on a new Oracle UCM site.
The random text string they chose to test was ‘test only’, which failed. From my testing it seems ‘only’ is a reserved word: it is never returned from a full-text search, although it is returned from metadata searches.
I’ve spent the morning searching oracle.com and found this, which seems pretty comprehensive, yet it does not include ‘only’.
So my question is: is ‘only’ a reserved word, and where can I find a complete list of reserved words for Oracle full-text search (10g)?
Full-text search string example:
(<ftx>test only</ftx>)
Update.
I have done some more testing. It seems to ignore words that indicate places or times:
only, some, until, when, while, where, there, here, near, that, who, about, this, them.
Can anyone confirm this? I can’t find it documented by Oracle anywhere.
Update 2. Post Answer
I should have been looking for ‘stop’ words, not ‘reserved’ words.
Updated the question title and tags to reflect.
I bet the system is automatically ignoring frequently occurring words. That would explain why you cannot find ‘only’ but can find ‘onnly’. Can you search for ‘a’, ‘an’, …?
The words you listed as not working look like very common words that are rarely the primary words in a sentence. Given this, they are not likely to be the words you are searching for in a full-text search.
What are the odds that you are looking for an article that includes the word ‘that’ and the inclusion of that word is the only fact you have on the article?
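To make the idea concrete, here is a minimal sketch of how a stop-word filter behaves in a toy inverted index. It is not how Oracle Text or UCM is actually implemented; the function names are made up for illustration, and the stoplist is just the set of words from your update, not Oracle's real default list.

```python
# Hypothetical stoplist: only the words the question reported as unsearchable.
# This is NOT Oracle's actual default stoplist.
STOPWORDS = {
    "only", "some", "until", "when", "while", "where", "there",
    "here", "near", "that", "who", "about", "this", "them",
}


def tokenize(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


def build_index(docs):
    """Map each remaining (non-stop) word to the ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, word):
    """Return the ids of documents indexed under the given word."""
    return index.get(word.lower(), set())


docs = {1: "test only", 2: "test onnly"}
index = build_index(docs)

print(search(index, "test"))   # both documents are indexed under 'test'
print(search(index, "only"))   # stop word: never indexed, so nothing matches
print(search(index, "onnly"))  # the misspelling is not a stop word, so it matches
```

Because stop words are dropped before the index is built, no query can ever match them, which is consistent with ‘only’ failing in full-text search while still working in a metadata search that does not go through the text index.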
I think I found your list. Ironically, it comes from the wiki page of the last company I started: http://www.sugarcrm.com/wiki/index.php?title=Overview_of_Full_Text_Stop_Words#Default_Stop_Words_.28for_English.29
Default stopword list:
Update: A nice whitepaper from Oracle that explains how full-text searching works can be downloaded from http://www.oracle.com/technology/products/text/pdf/text_techwp.pdf. It mentions stopwords and the fact that there is a default list, but does not list the words themselves.