I’m quite new to Java and I’m facing a situation I can’t solve. I have some HTML code and I’m trying to run a regular expression and store all matches in an array. Here’s my code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexMatch {
    boolean foundMatch = false;
    public String[] arrayResults;

    public String[] TestRegularExpression(String sourceCode, String pattern) {
        try {
            Pattern regex = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
            Matcher regexMatcher = regex.matcher(sourceCode);
            while (regexMatcher.find()) {
                arrayResults[matches] = regexMatcher.group();
                matches++;
            }
        } catch (PatternSyntaxException ex) {
            // Exception occurred
        }
        return arrayResults;
    }
}
I’m passing a string containing HTML code and the regular expression pattern to extract all meta tags and store them in the array. Here’s how I call the method:
RegexMatch regex = new RegexMatch();
regex.TestRegularExpression(sourceCode, "<meta.*?>");
String[] META_TAGS = regex.arrayResults;
Any hint?
Thanks!
Firstly, parsing HTML with regular expressions is a bad idea. There are alternatives which will convert the HTML into a DOM etc – you should look into those.
Assuming you still want the “match multiple results” idea though, it seems to me that a List<E> of some form would be more useful, so you don’t need to know the size up-front. You can also build that in the method itself, rather than having state.
It’s possible that there’s something similar to this within the Matcher class itself, but I can’t immediately see it…
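A minimal sketch of the list-based approach, under a few assumptions: the method name getAllMatches and the choice to make it static are illustrative rather than anything from the question, while the pattern flags are copied from the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMatch {
    // Hypothetical helper (the name is an assumption): collects every match
    // of `pattern` in `sourceCode` into a list that grows as needed, so the
    // number of matches does not have to be known up-front.
    static List<String> getAllMatches(String sourceCode, String pattern) {
        // Same flags as in the question's code.
        Pattern regex = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);
        Matcher matcher = regex.matcher(sourceCode);

        // The list is local to the method, so no state is kept on the object.
        List<String> results = new ArrayList<>();
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }
}
```

It could then be called as List<String> metaTags = RegexMatch.getAllMatches(sourceCode, "<meta.*?>"); and, if an array is really required, converted with metaTags.toArray(new String[0]). On Java 9 or later, Matcher.results() provides a stream of MatchResult objects that can serve the same purpose.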