Acronyms are a pain in my database, especially when doing a search. I haven’t decided if I should accept periods during search queries. These are the problems I face when searching:
- ‘IRQ’ will not find ‘I.R.Q.’
- ‘I.R.Q’ will not find ‘IRQ’
- ‘IRQ.’ or ‘IR.Q’ will not find ‘IRQ’ or ‘I.R.Q.’
etc…
The same problem goes for ellipses (…) or a series of three periods.
I just need to know which direction I should take with this issue:
- Is it better to remove all periods when inserting the string into the database?
- If so, what regex can I use to identify the periods that need to be removed (as opposed to ellipses or a series of three periods)?
- If it is possible to keep the periods in acronyms, how can it be scripted in a query to find ‘I.R.Q’ if I input ‘IRQ’ in the search field, through MySQL using regex or maybe a MySQL function I don’t know about?
My responses for each question:
Yes and no. You want the database to have the original text. If you want, create a separate field that is “cleaned up” to search against. Here, you can remove periods, make everything lowercase, etc.
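As a rough illustration of what that cleanup could look like in application code, here is a minimal Python sketch. The function name `search_key` is hypothetical, and the exact rules (drop periods and the single-character ellipsis, collapse whitespace, lowercase) are assumptions you would adjust to your own data.

```python
import re

def search_key(text: str) -> str:
    """Build a cleaned-up key to search against (hypothetical helper)."""
    # Drop ASCII periods and the single-character ellipsis (U+2026).
    stripped = re.sub(r"[.\u2026]", "", text)
    # Collapse runs of whitespace, trim the ends, and lowercase the result.
    return re.sub(r"\s+", " ", stripped).strip().lower()

# Every spelling of the acronym collapses to the same key.
for variant in ["IRQ", "I.R.Q.", "I.R.Q", "IRQ.", "IR.Q"]:
    print(variant, "->", search_key(variant))
```

The original text stays untouched in its own field; only the key is normalized.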
If so, what regex can I use to identify the periods that need to be removed (as opposed to ellipses or a series of three periods)?
/\.+/
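That finds one or more consecutive periods in a given spot, so it also matches a series of three periods. If ellipses must be left alone, match only a period that has no other period next to it. Either way, you’ll want to integrate it with your search formula.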
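If ellipses should survive while single periods are removed, one option is a pattern with lookarounds. This is a sketch in Python rather than in MySQL itself, because lookarounds are not available in every regex engine, so treat doing this step in application code as an assumption. The names `LONE_PERIOD` and `strip_lone_periods` are made up for the example.

```python
import re

# A period that is neither preceded nor followed by another period.
LONE_PERIOD = re.compile(r"(?<!\.)\.(?!\.)")

def strip_lone_periods(text: str) -> str:
    """Remove single periods but leave runs such as '...' alone."""
    return LONE_PERIOD.sub("", text)

print(strip_lone_periods("I.R.Q. wait... what"))
```

The single-character ellipsis (…) is a different character from the period, so this pattern never touches it.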
Note: regex matching in a database isn’t known for high performance, because a pattern match generally can’t use an ordinary index and has to scan every row. Be cautious with this.
Other note: you may want to use FULLTEXT search in MySQL. This also isn’t known for high performance on data sets with more than about 1,000 entries. If you have big data and need full-text search, use Sphinx (available as a MySQL plug-in and as a RAM-based indexing system).
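Yes, by having the two fields I described in my answer to the first question: run the search input through the same cleanup, then compare it against the cleaned-up field.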
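Here is a small sketch of the two-field idea end to end. It uses Python's built-in `sqlite3` module purely as a stand-in so the example runs anywhere; the same shape should carry over to MySQL. The table name `terms`, the column names, and the helper functions are all hypothetical.

```python
import sqlite3

def search_key(text: str) -> str:
    """Cleaned-up form used only for matching (assumed rules)."""
    return text.replace(".", "").lower()

conn = sqlite3.connect(":memory:")
# 'original' keeps the text as entered; 'search_key' is what queries hit.
conn.execute("CREATE TABLE terms (original TEXT, search_key TEXT)")
conn.execute("CREATE INDEX idx_terms_search_key ON terms (search_key)")

def add_term(original: str) -> None:
    conn.execute(
        "INSERT INTO terms (original, search_key) VALUES (?, ?)",
        (original, search_key(original)),
    )

def find_terms(query: str) -> list:
    # Clean the user's input the same way, then do a plain equality match.
    rows = conn.execute(
        "SELECT original FROM terms WHERE search_key = ? ORDER BY original",
        (search_key(query),),
    )
    return [row[0] for row in rows]

add_term("I.R.Q.")
add_term("IRQ")
add_term("DMA")
print(find_terms("IR.Q"))
```

Because the lookup is a plain equality test on an indexed column, no regex is needed at query time.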