I have a database table with words from a dictionary.
Now I want to select words for an anagram. For example, if I give the string SEPIAN, it should fetch values like apes, pain, pains, pies, pines, sepia, etc.
For this I used the query:
SELECT * FROM words WHERE word REGEXP '^[SEPIAN]{1,6}$'
But this query also returns words like anna and essen, which repeat a character more often than it occurs in the supplied string. E.g. anna has two n's, but there is only one n in the search string SEPIAN.
How can I write my regular expression to achieve this? Also, if a character is repeated in my search string, it should be allowed to appear that many times in the results.
Since MySQL does not support back-referencing capturing groups, the typical solution of (\w).*\1 will not work. This means that any solution given will need to enumerate all possible doubles. Furthermore, as far as I can tell, back-references are not valid in look-aheads or look-behinds, and MySQL does not support look-aheads or look-behinds anyway.
However, you can split this into two expressions and use the following query:
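A sketch of such a query, assuming the words table and word column from the question. The first expression restricts the alphabet and the length; the second enumerates every possible double and rejects any word that contains one. Plain .* is used rather than a lazy quantifier, on the assumption that the query should also run on MySQL's older POSIX-style regex engine.

```sql
SELECT *
FROM words
-- only letters from SEPIAN, at most six of them
WHERE word REGEXP '^[SEPIAN]{1,6}$'
  -- reject any word in which one of those letters appears twice
  AND NOT word REGEXP 'S.*S|E.*E|P.*P|I.*I|A.*A|N.*N';
```

With this, anna is rejected by the A.*A and N.*N alternatives and essen by S.*S and E.*E, while apes and sepia pass both tests.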
Not very pretty, but it works and it should be fairly efficient as well.
To support a set limit of repeated characters, use the following pattern for your secondary expression:
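One way to write that pattern, shown here as a fragment of the WHERE clause with placeholders rather than a runnable statement. It matches the character followed by X further occurrences of itself, so the NOT rejects words that use the character more than X times.

```sql
-- fragment for a single letter; join several with | as in the main query
NOT word REGEXP 'A(.*A){X}'
```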
Where A is your character and X is the number of times it's allowed.
So if you're adding another N to your string, giving SEPIANN (for a total of 2 Ns), your query would become:
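A sketch under the same assumptions as before (words table, word column, plain .* instead of a lazy quantifier). The length limit grows to 7, and only the alternative for N changes, since every other letter is still allowed once.

```sql
SELECT *
FROM words
-- seven letters are available now
WHERE word REGEXP '^[SEPIAN]{1,7}$'
  -- N may appear twice, so only a third N is rejected
  AND NOT word REGEXP 'S.*S|E.*E|P.*P|I.*I|A.*A|N(.*N){2}';
```

Note that anna is still excluded here: its two n's are now fine, but it uses a twice.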