I would like to understand the difference between the following 3 regular expressions.
I wanted to display all the lines in a file that consist only of lowercase letters.
Here are the 3 regular expressions I tried:
cat filename.txt | grep ^[a-z]*
Regex Description: This will display all the lines starting with 0 or more lowercase letters. So, it will match any of the following:
zapato
113078
OLIVIA
Not exactly what I wanted.
cat filename.txt | grep ^[a-z]*$
Regex Description: This will display all the lines starting with 0 or more lowercase letters till the end of the line. This matches the following:
fubuki
BALLIN
Kristine
This time there were no results with digits in them.
cat filename.txt | grep ^[a-z]*[a-z]$
Regex Description: This one works well for me. It searches for all the lines starting with 0 or more lowercase letters and keeps matching until it finds one final lowercase letter at the end of the line. However, I want to know how this is different from the previous regular expressions.
tonia
ecurby
totonno
Also, since the asterisk (*) in a regular expression means 0 or more, it should include all the results when I write ^[a-z]*
Short explanations of your regular expressions:
^[a-z]* matches a string starting with 0 or more characters from [a-z]. Because 0 characters is allowed and nothing anchors the end, it matches the empty string and, in fact, every line, whatever that line starts with.
^[a-z]*$ matches a string containing nothing but 0 or more characters from [a-z]. It matches the empty string and every string containing only characters of the set [a-z].
^[a-z]*[a-z]$ matches a string starting with 0 or more characters from [a-z] followed by exactly one last character from [a-z]. It matches every non-empty string containing only characters of the set [a-z].
Use this instead of your current third option (with grep, an unescaped + needs the extended syntax, i.e. grep -E):
^[a-z]+$
It is semantically equivalent but simpler.
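The explanations above can be checked with a small sketch. The sample file and its contents are made up for illustration, and LC_ALL=C is an assumption made here so that [a-z] covers exactly the ASCII lowercase letters; in other locales the range may behave differently.

```shell
# Assumption: force the C locale so [a-z] means only ASCII lowercase letters.
export LC_ALL=C

# Hypothetical sample data; the empty line ('') is deliberate.
sample=$(mktemp)
printf '%s\n' zapato 113078 OLIVIA Kristine '' tonia > "$sample"

# Patterns are single-quoted so the shell cannot glob-expand them.
first=$(grep -c '^[a-z]*' "$sample")        # counts every line
second=$(grep -c '^[a-z]*$' "$sample")      # all-lowercase lines plus the empty line
third=$(grep -c '^[a-z]*[a-z]$' "$sample")  # non-empty all-lowercase lines only
plus=$(grep -cE '^[a-z]+$' "$sample")       # same lines as the third; -E enables +

echo "$first $second $third $plus"          # prints: 6 3 2 2
rm -f "$sample"
```

The only difference between the second and third counts is the empty line, which is exactly what "0 or more" versus "at least one" predicts.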
The expression x*x (or xx*) is equivalent to x+ in regular expressions (with x being any expression). The latter is basically just syntactic sugar for either of the former, more verbose expressions.
Or put differently: while * means 0 or more, + means 1 or more.
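As a rough illustration of that equivalence, the sketch below counts matching lines for each form, with x taken to be the single letter a. The input lines are invented for the example, and grep -E is used because in grep's basic syntax an unescaped + is an ordinary character.

```shell
# Assumption: C locale, to keep the matching rules predictable.
export LC_ALL=C

# Hypothetical input: five lines, the third one empty.
words=$(printf '%s\n' a aaa '' b ab)

count() {
    # Count the input lines matched by the extended regex given as $1.
    printf '%s\n' "$words" | grep -cE "$1"
}

star_then_one=$(count '^a*a$')   # x*x
one_then_star=$(count '^aa*$')   # xx*
plus_form=$(count '^a+$')        # x+
star_only=$(count '^a*$')        # x* alone also accepts the empty line

echo "$star_then_one $one_then_star $plus_form $star_only"   # prints: 2 2 2 3
```

The first three forms agree on every line, while the bare * form picks up one extra match, the empty line.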