I’m trying to understand regex as much as I can, so I came up with this regex-based solution to codingbat.com repeatEnd:
Given a string and an int N, return a string made of N repetitions of the last N characters of the string. You may assume that N is between 0 and the length of the string, inclusive.
public String repeatEnd(String str, int N) {
    return str.replaceAll(
        ".(?!.{N})(?=.*(?<=(.{N})))|."
            .replace("N", Integer.toString(N)),
        "$1"
    );
}
Explanation of its parts:
- .(?!.{N}): asserts that the matched character is one of the last N characters, by making sure that there aren’t N characters following it.
- (?=.*(?<=(.{N}))): in that case, use a lookahead to first go all the way to the end of the string, then a nested lookbehind to capture the last N characters into \1. Note that this assertion always succeeds.
- |.: if the first assertion failed (i.e. there are at least N characters ahead), match the character anyway; \1 would be empty.
In either case, a character is always matched; replace it with \1.
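To see the two cases in action, here is a small sketch that walks the matches one by one and records what group 1 holds for each. The class and method names (RepeatEndTrace, patternFor, trace) are made up for illustration; the pattern itself is the one from the method above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatEndTrace {
    // Builds the pattern from the question for a concrete N.
    static Pattern patternFor(int n) {
        String template = ".(?!.{N})(?=.*(?<=(.{N})))|.";
        return Pattern.compile(template.replace("N", Integer.toString(n)));
    }

    // Returns one line per match: the matched character and the content of group 1.
    static String trace(String str, int n) {
        StringBuilder out = new StringBuilder();
        Matcher m = patternFor(n).matcher(str);
        while (m.find()) {
            // group(1) is null when the second alternative (the bare dot) matched.
            String captured = m.group(1) == null ? "" : m.group(1);
            out.append(m.group()).append(" -> \"").append(captured).append("\"\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints:
        // H -> ""
        // e -> ""
        // l -> "llo"
        // l -> "llo"
        // o -> "llo"
        System.out.print(trace("Hello", 3));
    }
}
```

Only the last three characters go through the first alternative, so each of them is replaced by "llo" and the rest disappear.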
My questions are:
- Is this technique of nested assertions valid? (i.e. looking behind during a lookahead?)
- Is there a simpler regex-based solution?
Bonus question
Do repeatBegin (as analogously defined).
I’m honestly having trouble with this one!
Nice one! I don’t see a way to significantly improve on that regex, although I would refactor it to avoid the needless use of negative logic:
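A sketch of what such a refactoring could look like, assuming the negative lookahead is swapped for a positive one and the two alternatives trade places. The exact pattern and the class name RepeatEndPositive are illustrative, chosen to fit the description here rather than quoted from anywhere:

```java
class RepeatEndPositive {
    static String repeatEnd(String str, int n) {
        // First alternative: a character with at least N characters after it,
        // i.e. one outside the tail. Group 1 stays unset, so "$1" inserts nothing.
        // Second alternative: the lookahead runs to the end of the string and the
        // nested lookbehind captures the last N characters; then the dot consumes
        // the current character.
        String regex = ".(?=.{N})|(?=.*(?<=(.{N}))).".replace("N", Integer.toString(n));
        return str.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        System.out.println(repeatEnd("Hello", 3)); // llollollo
    }
}
```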
This way the second alternative is never entered until you reach the final N characters, which I think makes the intent a little clearer.
I’ve never seen a reference that says it’s okay to nest lookarounds, but like Bart, I don’t see why it wouldn’t be. I sometimes use lookaheads inside lookbehinds to get around limitations on variable-length lookbehind expressions.
EDIT: I just realized I can simplify the regex quite a bit by putting the alternation inside the lookahead:
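As a sketch of that idea, assuming the question's original approach of substituting N into a template (the pattern and the class name RepeatEndSimplified are illustrative): a single dot is followed by one lookahead whose first branch checks for N more characters and whose second branch captures the tail.

```java
class RepeatEndSimplified {
    static String repeatEnd(String str, int n) {
        // The dot always consumes one character. Inside the lookahead:
        //   .{N}            succeeds for characters outside the tail (group 1 unset)
        //   .*(?<=(.{N}))   otherwise runs to the end and captures the last N characters
        String regex = ".(?=.{N}|.*(?<=(.{N})))".replace("N", Integer.toString(n));
        return str.replaceAll(regex, "$1");
    }

    public static void main(String[] args) {
        System.out.println(repeatEnd("Hello", 2)); // lolo
    }
}
```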
By the way, have you considered using format() to build the regex instead of replace()?
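For instance, a minimal sketch using String.format with the pattern from the question (the names RepeatEndFormat and buildRegex are made up here). The %1$d specifier reuses the first argument, so N only has to be passed once, and a literal N elsewhere in the pattern would no longer be rewritten by accident:

```java
class RepeatEndFormat {
    // Fills N into the question's pattern via format() rather than replace().
    static String buildRegex(int n) {
        return String.format(".(?!.{%1$d})(?=.*(?<=(.{%1$d})))|.", n);
    }

    static String repeatEnd(String str, int n) {
        return str.replaceAll(buildRegex(n), "$1");
    }

    public static void main(String[] args) {
        System.out.println(buildRegex(3));         // .(?!.{3})(?=.*(?<=(.{3})))|.
        System.out.println(repeatEnd("Hello", 3)); // llollollo
    }
}
```

One thing to watch for with this approach: a literal % in the pattern would have to be written as %% in the format string.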