I am trying to write a simple Java function that will take a list of Language Inputs and see if what I obtained from a database query matches. All of the strings in my database have been normalized to make searches easier. Here is an example.
Research Lab A wants participants that have any of the following language inputs (they are separated by the pipe character | ):
{English | English, Spanish | Spanish}
In other words, this lab can take participants that are either monolingual English, monolingual Spanish, or bilingual English and Spanish. This is very straightforward – if the database result returns "English" or "English, Spanish" or "Spanish", my function will find a match.
HOWEVER, my database also marks if a participant only has minimal language input for some language (using the ~ character).
"English, ~Spanish" = participant hears English and a little Spanish
"English, ~Spanish, Russian" = participant hears English, Russian, and a little Spanish
This is where I am having trouble. I want to match something like "English, ~Spanish" with both "English" and "English, Spanish".
I was thinking of just removing/hiding the languages marked with a ~, but if there is a research lab that wants only {English, Spanish}, then "English, ~Spanish" will not match, even though it should.
I also cannot think of how I could use regular expressions to do this task. Any help will be greatly appreciated!
Try this
Code
Explanation
UPDATE
A more generalized form would be something like this:
BTW, this can go wrong, because this pattern will match any capitalized word. A better option might be to use this syntax when generating your regex pattern.
Where A, B would be the names of the languages. I think this would be a better approach.
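As a rough sketch of generating the pattern from the language names rather than matching any capitalized word, something like the following could work. It is not the code from the answer above. The class and method names (LanguageInputMatcher, toPattern, matchesAny) are made up for illustration, and it assumes the lab's options list languages in the same order and with the same ", " separator as the normalized database strings. Each language becomes a group followed by ", ", and a language marked with ~ becomes an optional group; the lab option gets a trailing ", " appended so the separators line up.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class LanguageInputMatcher {

    // Turns a database value such as "English, ~Spanish" into a pattern
    // like (?:English, )(?:Spanish, )? where ~ languages are optional.
    static Pattern toPattern(String dbValue) {
        StringBuilder regex = new StringBuilder();
        for (String token : dbValue.split(",")) {
            String lang = token.trim();
            boolean minimal = lang.startsWith("~");
            if (minimal) {
                lang = lang.substring(1).trim();
            }
            // Pattern.quote keeps any special characters in a name literal.
            regex.append("(?:").append(Pattern.quote(lang)).append(", )");
            if (minimal) {
                regex.append('?');
            }
        }
        return Pattern.compile(regex.toString());
    }

    // labSpec is the pipe-separated list of accepted inputs, without braces,
    // e.g. "English | English, Spanish | Spanish".
    static boolean matchesAny(String dbValue, String labSpec) {
        Pattern pattern = toPattern(dbValue);
        for (String option : labSpec.split("\\|")) {
            // Append the separator so every language ends with ", ".
            String candidate = option.trim() + ", ";
            if (pattern.matcher(candidate).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String lab = "English | English, Spanish | Spanish";
        System.out.println(matchesAny("English, ~Spanish", lab));                // true
        System.out.println(matchesAny("English, ~Spanish", "English, Spanish")); // true
        System.out.println(matchesAny("English, ~Spanish", "Spanish"));          // false
    }
}
```

With this, "English, ~Spanish" matches both "English" and "English, Spanish", which also covers a lab that accepts only "English, Spanish". If the order of languages is not guaranteed to be the same on both sides, comparing sets of required and optional languages would probably be more robust than a regex.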