I have the code below, which checks whether the user has entered a syntactically correct URL. The regex code was taken from "Regular expressions in C: examples?".
printf("Enter the website URL:\n");
fgets(str, 100, stdin);
if (!strcmp(str, "\n")) {
    printf("Empty URL ");
    exit(2);
}
regex_t regex;
int reti;
char msgbuf[100];
/* Compile regular expression */
reti = regcomp(&regex, "[a-zA-Z0-9\\-\\.]+\\.[a-zA-Z]{2,3}(/\\S*)?$", 0);
if (reti) {
    fprintf(stderr, "Could not compile regex\n");
    exit(3);
}
/* Execute regular expression */
reti = regexec(&regex, str, 0, NULL, 0);
if (!reti) {
    puts("Match");
} else if (reti == REG_NOMATCH) { // This else if always executes.
    puts("No match");
    exit(4);
} else {
    regerror(reti, &regex, msgbuf, sizeof (msgbuf));
    fprintf(stderr, "Regex match failed: %s\n", msgbuf);
    exit(5);
}
/* Free compiled regular expression if you want to use the regex_t again */
regfree(&regex);
However, the match always fails, even when the URL entered is correct. I know the regex is correct, but for some reason it fails at the "Execute regular expression" step: even when the user enters a syntactically correct URL, the else if branch always executes.
What could be the reason for the else if always executing?
Your pattern is not valid!
Note that POSIX defines two flavors of regular expressions: basic (BRE) and extended (ERE) (see Wikipedia). Since you want to use the "extended" flavor, pass the REG_EXTENDED flag to regcomp().

Here are (some of?) the problems with your pattern:

[a-zA-Z0-9\\-\\.]+\\.[a-zA-Z]{2,3}(/\\S*)?$

- Inside a bracket expression ([]), you don't need to escape special characters. In fact, you cannot escape them: [a-zA-Z0-9\-\.] will match backslashes, but not the hyphen, since \-\ is interpreted as the range from \ to \. If you want to match the hyphen, place it first or last in the character list: [a-zA-Z0-9.-]
- \S is not supported by POSIX. Use [^[:space:]] instead.
- {} need to be written as \{\} with BRE.
- The + and ? quantifiers are only supported by ERE.

To summarize, replace the call to regcomp() with this one:
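Applying the fixes listed above, a corrected call might look like the sketch below. The helper name is_url_like, its return convention, the 256-byte buffer, and the REG_NOSUB flag are assumptions made for illustration and are not part of the original code. The sketch also strips the trailing newline that fgets() leaves in the buffer, because without REG_NEWLINE the $ anchor matches only at the very end of the string.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns 1 if s matches
 * the URL-like pattern, 0 if it does not, and -1 on a regex error. */
int is_url_like(const char *s)
{
    regex_t regex;
    char buf[256];   /* assumed upper bound for this sketch */
    int reti;

    /* Copy the input up to the first newline. fgets() keeps the '\n',
     * and '$' would otherwise never line up with the end of the URL. */
    size_t len = strcspn(s, "\n");
    if (len >= sizeof buf)
        return 0;    /* too long for this sketch: treat as no match */
    memcpy(buf, s, len);
    buf[len] = '\0';

    /* ERE version of the pattern: no backslash escapes inside [],
     * hyphen placed last, and [^[:space:]] in place of \S. */
    reti = regcomp(&regex,
                   "[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,3}(/[^[:space:]]*)?$",
                   REG_EXTENDED | REG_NOSUB);
    if (reti != 0) {
        char msgbuf[100];
        regerror(reti, &regex, msgbuf, sizeof msgbuf);
        fprintf(stderr, "Could not compile regex: %s\n", msgbuf);
        return -1;
    }

    reti = regexec(&regex, buf, 0, NULL, 0);
    regfree(&regex);  /* always release the compiled pattern */

    if (reti == 0)
        return 1;
    if (reti == REG_NOMATCH)
        return 0;
    return -1;
}
```

In the original program, calling is_url_like(str) right after fgets() would stand in for the inline regcomp()/regexec() sequence. Note that the pattern is not anchored at the start, so it accepts any input that ends in a URL-like substring.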