This is my Java 1.5 code (complete example):
import org.junit.Test;
import static org.junit.Assert.*;
import java.util.Scanner;
import java.util.regex.Pattern;

public class StrangeTest {
    @Test
    public void testRegExp() {
        Pattern re = Pattern.compile("(;|:)[^:;]*");
        Scanner scanner = new Scanner(":alpha");
        scanner.useDelimiter("");
        assertEquals(":alpha", scanner.next(re)); // failure
    }
}
What is wrong here?
Basically, your regular expression matches any string that starts with a `:`, even if it is only one character: `:` matches the expression, as do `:a`, `:al`, … `:alpha`. Even `:alpha;beta` is a match!

With the question mark you appended to your expression you made it non-greedy, i.e. the shortest possible string is matched, which is `:`.

Remove the question mark to make it greedy:
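The difference between the two quantifiers can be sketched as follows. The patterns `(;|:).*?` and `(;|:).*` are assumptions made for illustration, since the expression containing the question mark is not shown here, and `GreedyDemo` and `firstMatch` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Assumed patterns, for illustration only.
        // Reluctant ".*?" takes as few characters as possible.
        System.out.println(firstMatch("(;|:).*?", ":alpha"));     // prints ":"
        // Greedy ".*" takes as many characters as possible.
        System.out.println(firstMatch("(;|:).*", ":alpha"));      // prints ":alpha"
        // Greedy also runs past the next separator.
        System.out.println(firstMatch("(;|:).*", ":alpha;beta")); // prints ":alpha;beta"
    }
}
```

The last call shows why the greedy form alone is not enough once the input contains a second separator.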
However, then it will match `:alpha;beta`, so you need to indicate that, following the semicolon or colon character, you expect any characters except the semicolon or colon:
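A minimal sketch of that idea, using the negated character class from the pattern in the question, `(;|:)[^:;]*`. The class and method names are hypothetical, and the use of `Scanner.findInLine`, which searches the input while ignoring the scanner's delimiter, is an assumption about what might fit the test rather than something stated above.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegatedClassDemo {
    // Pattern taken from the question: a separator followed by non-separator characters.
    static final Pattern RE = Pattern.compile("(;|:)[^:;]*");

    // Returns the first substring of input matched by RE, or null if there is none.
    static String firstMatch(String input) {
        Matcher m = RE.matcher(input);
        return m.find() ? m.group() : null;
    }

    // Assumed alternative to scanner.next(re): findInLine ignores delimiters.
    static String firstMatchWithScanner(String input) {
        Scanner scanner = new Scanner(input);
        try {
            return scanner.findInLine(RE);
        } finally {
            scanner.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(":alpha;beta"));        // prints ":alpha"
        System.out.println(firstMatchWithScanner(":alpha"));  // prints ":alpha"
    }
}
```

Because `[^:;]*` cannot consume a colon or semicolon, the greedy match stops on its own at the next separator.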