I am relatively new to Java and I need some help to extract multiple substrings from a string. An example of a string is as given below:
String = "How/WRB can/MD I/PRP find/VB a/DT list/NN of/IN celebrities/NNS '/POS real/JJ names/NNS ?/."
Desired result: WRB MD PRP VB DT NN IN NNS POS JJ NNS
I have a text file with possibly thousands of similar POS-tagged lines. I need to extract the POS tags from each line and do some calculations based on them.
I have tried using a tokenizer but didn’t really get the result I wanted. I also tried split() and saving the pieces to arrays, because I need to store the tags and use them later, but that didn’t work either.
Lastly, I tried using Pattern and Matcher, but I am having problems with the regex: it returns each tag together with the leading forward slash.
Regex: [\/](.*?)\s\b
Result: /WRB /MD ....
Can anyone help me figure out what’s wrong with my regex? If there’s a better way to do this, please let me know.
This should work:
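A minimal sketch of one way to do it, assuming the tag is everything after the slash up to the next whitespace and that the words themselves contain no slash. The class name `PosTagExtractor` and the method `extractTags` are made up for illustration. The key point is to read `group(1)`, the capturing group, rather than `group()`, which returns the whole match including the slash:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PosTagExtractor {
    // A slash followed by one or more non-whitespace characters.
    // Group 1 is the tag without the slash.
    // Assumption: the word part of a token never contains '/'.
    private static final Pattern TAG = Pattern.compile("/(\\S+)");

    static List<String> extractTags(String line) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            // group(1) excludes the slash; group() would include it.
            tags.add(m.group(1));
        }
        return tags;
    }

    public static void main(String[] args) {
        String s = "How/WRB can/MD I/PRP find/VB a/DT list/NN of/IN "
                 + "celebrities/NNS '/POS real/JJ names/NNS ?/.";
        System.out.println(String.join(" ", extractTags(s)));
    }
}
```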
Prints:
WRB MD PRP VB DT NN IN NNS POS JJ NNS .
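Since the tags come from a file with thousands of lines and are needed for later calculations, here is a rough sketch of how the per-line extraction could feed a tag count. It uses split() and lastIndexOf() instead of a regex, just to show that approach can work too. The class name `TagCounter`, the method `countTags`, and the file path taken from `args[0]` are all assumptions for illustration, and counting is only one example of a calculation:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TagCounter {
    // Counts how often each tag occurs across all lines.
    static Map<String, Integer> countTags(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                // The tag is whatever follows the last slash in the token.
                int slash = token.lastIndexOf('/');
                if (slash < 0 || slash == token.length() - 1) {
                    continue; // no tag on this token (also skips blank lines)
                }
                counts.merge(token.substring(slash + 1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is a hypothetical path to the POS-tagged text file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countTags(lines).forEach((tag, n) -> System.out.println(tag + "\t" + n));
    }
}
```

Returning a map (or a list per line, if order matters) keeps the tags available for whatever calculation comes next, instead of only printing them.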