I have a small regular expression that formats an integer with thousands separators, and I was wondering how it works.
Here’s the Perl code:
$intval = 10000;
$intval =~ s/([+-]?\d)(?=(\d{3})+(?!\d))/$1,/go;
print $intval;
(...): Any normal set of parentheses is a capture group that can be referenced after the regex matches, using the special variables $1, $2, etc.
[+-]?: Brackets create a character class, meaning “match any one of these characters”. ? means “match zero or one times”. So this allows for the possibility of a single + or - at the beginning of the match.
\d: Match a single digit.
(?=...): This is a look-ahead. It requires that the pattern it contains matches at this point, but does not include that text in the resulting match. Nor does it move the position in the string forward (this means that matches can overlap when using lookahead).
(\d{3})+: Match one or more groups of three digits.
(?!\d): The stuff that has matched cannot be followed by another digit.
/$1,/: Replace what matched (remember, this does not include the lookahead portion, because that doesn’t count as part of the match) with the first capture group, followed by a comma.
go: These flags are options setting the regex behavior. g means it repeats until it finds and replaces all matches. o is an optimization telling the interpreter to compile the pattern only once, but it is largely obsolete and makes absolutely no difference in this case, since nothing is interpolated into the pattern.

So this regex will replace a single digit, followed by a number of digits that is a multiple of three, with that digit followed by a comma. It runs repeatedly, finding all matches. The effect of this is to insert commas as thousand separators.
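To make the breakdown above easier to follow, here is a minimal sketch of the same substitution written with the /x modifier, so that each piece can carry its own comment. The helper name commify is made up for illustration, and the o flag is dropped because, as noted, it makes no difference here.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the substitution from the question.
sub commify {
    my ($n) = @_;
    $n =~ s{
        ([+-]?\d)      # first capture group: optional sign, then one digit
        (?=            # look-ahead: must match here, but consumes nothing
            (\d{3})+   #   one or more groups of exactly three digits
            (?!\d)     #   and no further digit after those groups
        )
    }{$1,}gx;
    return $n;
}

print commify($_), "\n" for qw(100 1000 10000 1234567 -1234567);
```

Run as-is, this should print 100, 1,000, 10,000, 1,234,567 and -1,234,567, one per line. Note that 100 is left alone: there is no digit in it that is followed by exactly three, six, nine, ... more digits.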
One quibble: the [+-]? part is completely unnecessary. Because the regex has no requirements about what comes before a number, a number with a + or - will work just fine even if this part is removed.
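A quick way to check this claim is to run both variants side by side. The sketch below does that for a few sample inputs; the sub names with_sign and without_sign are invented for this comparison.

```perl
use strict;
use warnings;

# Original pattern, including the optional sign.
sub with_sign {
    my ($n) = @_;
    $n =~ s/([+-]?\d)(?=(\d{3})+(?!\d))/$1,/g;
    return $n;
}

# Same pattern with the [+-]? part removed.
sub without_sign {
    my ($n) = @_;
    $n =~ s/(\d)(?=(\d{3})+(?!\d))/$1,/g;
    return $n;
}

for my $n (qw(42 1000 +1000 -1234567)) {
    my ($with, $without) = (with_sign($n), without_sign($n));
    printf "%-10s %-12s %-12s %s\n", $n, $with, $without,
        ($with eq $without ? "same" : "different");
}
```

For these inputs both variants should agree on every line. With the sign in the pattern, the match is "+1" and the replacement is "+1,"; without it, the match is just "1" and the sign is simply left where it was.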