I am not a beginner to regular expressions, but their use in Perl seems a bit different than in Java.
Anyway, I basically have a dictionary of shorthand words and their definitions. I want to iterate over the words in the dictionary and replace them with their meanings. What is the best way to do this in Java?
I have seen String.replaceAll(), String.replace(), as well as the Pattern/Matcher classes. I wish to do a case-insensitive replacement along the lines of:
word =~ s/\s?\Q$short_word\E\s?/ \Q$short_def\E /sig
While I am at it, do you think that it is best to extract all the words from the string and then apply my dictionary or just apply the dictionary to the string? I know that I need to be careful, because the shorthand words could match parts of other shorthand meanings.
Hopefully this all makes sense.
Thanks.
Clarification:
Dictionary is something like:
lol:laugh out loud, rofl:rolling on the floor laughing, ll:like lemons
string is:
lol, i am rofl
replaced text:
laugh out loud, i am rolling on the floor laughing
Notice how the ll wasn't added anywhere.
The danger is false positives inside normal words: “fell” must not become “felike lemons”.
One way is to split the string on whitespace (do multiple spaces need to be preserved?) and then loop over the resulting List: if the dictionary contains the word, append its replacement; otherwise append the original word.
My output class would be a StringBuffer.
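A minimal sketch of that loop might look like the following. The class name DictionaryLoop and the method expand are made up for illustration, and it assumes the text has already been split into tokens and that the dictionary keys are stored in lower case, so a lower-cased lookup gives the case-insensitive match.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not from the original post.
class DictionaryLoop {
    // Assumes dictionary keys are lower case; tokens are matched whole,
    // so "fell" never matches the key "ll".
    static String expand(List<String> tokens, Map<String, String> dictionary) {
        StringBuffer out = new StringBuffer();
        for (String token : tokens) {
            String key = token.toLowerCase(Locale.ROOT);
            if (dictionary.containsKey(key)) {
                out.append(dictionary.get(key)); // shorthand: emit its definition
            } else {
                out.append(token);               // anything else: emit unchanged
            }
        }
        return out.toString();
    }
}
```

Because only whole tokens are looked up, a shorthand that happens to appear inside a longer word is left alone. If the buffer is not shared between threads, a StringBuilder would work the same way.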
Make your split method smart enough to return word delimiters also:
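One possible way to do this, sketched under the assumption that a "word" is a run of regex word characters (\w) and everything else counts as a delimiter. The class name DelimiterKeepingSplitter is hypothetical; it uses Matcher.find() rather than String.split() so that the delimiter runs come back as tokens too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original post.
class DelimiterKeepingSplitter {
    // Either a run of word characters or a run of non-word characters.
    private static final Pattern TOKEN = Pattern.compile("\\w+|\\W+");

    // Returns words and delimiter runs in order; joining the tokens
    // reproduces the input exactly, including repeated spaces.
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }
}
```

With this, "lol, i am rofl" comes back as the words lol, i, am and rofl with the ", " and " " runs in between, so the punctuation after "lol" no longer gets in the way of the dictionary lookup.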
Then you don’t have to worry about conserving white space – the loop above will just append anything that isn’t a dictionary word to the StringBuffer.
Here’s a recent SO thread on retaining delimiters when regexing.
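For comparison, the Pattern/Matcher route mentioned in the question can do the whole replacement in one pass without splitting first. This is only a sketch: the class name ShorthandRegexExpander is made up, and it assumes the dictionary keys are lower case and consist of word characters, so that \b word boundaries make sense around them. Pattern.quote and Matcher.quoteReplacement play roughly the role of \Q...\E in the Perl version.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not from the original post.
class ShorthandRegexExpander {
    // Assumes lower-case keys made of word characters only.
    static String expand(String text, Map<String, String> dictionary) {
        if (dictionary.isEmpty()) {
            return text;
        }
        // Build \b(?:key1|key2|...)\b with every key quoted as a literal.
        String alternation = dictionary.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern pattern = Pattern.compile(
                "\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);

        Matcher matcher = pattern.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String definition = dictionary.get(matcher.group().toLowerCase(Locale.ROOT));
            // quoteReplacement keeps '$' and '\' in a definition literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(definition));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Since the input is scanned once and the replacement text is never rescanned, a shorthand that appears inside another definition cannot be expanded a second time, and the word boundaries keep "ll" from matching inside "fell".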