I’m trying to use the Scanner class in Java to read data from a configuration file. The file’s elements are delimited by whitespace. However, if an element should be interpreted as a string literal (including any whitespace it contains), double or single quotes are placed around it. This gives files that look like this:
> R 120 Something AWord
> P 160 SomethingElse "A string literal"
By default, the Java Scanner class delimits by whitespace only. It has a useDelimiter() method that takes a regular expression to specify a different delimiter for the text. I’m not good with regular expressions, however, so I’m not sure how I’d do this.
How can I delimit by whitespace, unless there are quotes surrounding something?
You can use the scanner.findInLine(pattern) method to specify that you want to keep string literals from being split. You just need a regular expression that will match a quote-less token or one in quotes. This one might work:
(That regex is extra complicated because it handles escapes inside the string literal.)
Example:
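A minimal sketch of this approach, assuming a token is either a double-quoted literal, a single-quoted literal, or a run of characters with no whitespace or quotes, and that a backslash escapes the next character inside a literal. The class name QuotedTokens, the TOKEN pattern, and the tokenize helper are hypothetical names chosen for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class QuotedTokens {
    // Hypothetical pattern (an assumption, not a standard one):
    //   "..."  double-quoted literal, backslash escapes the next character
    //   '...'  single-quoted literal, same escape rule
    //   a run of characters containing no whitespace and no quotes
    static final Pattern TOKEN = Pattern.compile(
        "\"(?:\\\\.|[^\"\\\\])*\""
        + "|'(?:\\\\.|[^'\\\\])*'"
        + "|[^\\s\"']+");

    // Collects tokens line by line; quoted tokens keep their quotes.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            while (sc.hasNextLine()) {
                String token;
                // findInLine returns null once the current line has no more matches.
                while ((token = sc.findInLine(TOKEN)) != null) {
                    tokens.add(token);
                }
                // Move past the line terminator, if there is one left to consume.
                if (sc.hasNextLine()) {
                    sc.nextLine();
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "R 120 Something AWord\n"
            + "P 160 SomethingElse \"A string literal\"";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```

The quotes are left on the returned tokens so the caller can tell a literal from a bare word; strip them afterwards if only the contents are needed.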
The findInLine method, as the name suggests, only works within the current line. If you want to search the whole input you can use findWithinHorizon instead. You can pass 0 in as the horizon to tell it to use an unlimited horizon:
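A short sketch of the whole-input variant, under the same assumptions about token syntax as before. The class name WholeInputTokens and its TOKEN pattern are hypothetical; the only Scanner call involved is findWithinHorizon(pattern, 0):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class WholeInputTokens {
    // Hypothetical pattern: quoted literal (double or single, with
    // backslash escapes) or a run of non-whitespace, non-quote characters.
    static final Pattern TOKEN = Pattern.compile(
        "\"(?:\\\\.|[^\"\\\\])*\""
        + "|'(?:\\\\.|[^'\\\\])*'"
        + "|[^\\s\"']+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            String token;
            // A horizon of 0 means "no limit": the search runs to the end of
            // the input, so line breaks are skipped like any other whitespace.
            while ((token = sc.findWithinHorizon(TOKEN, 0)) != null) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "R 120 Something AWord\n"
            + "P 160 SomethingElse \"A string literal\"";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```

One consequence of dropping the line boundary: with this pattern a quoted literal may span several lines, which findInLine would not allow.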