Having never used awk on Linux before, I am trying to understand how it matches regular expressions. For example, in my past experience the regular expression /2/ would match the 2 in all of the following lines.
- This will match 2
- This will not match 2
Now if I run the command awk '{if(NR~2)print}' sample.txt, where sample.txt has the contents
- 2 will be matched
- This will not match 2
- 2 may be matched
The line that is printed is This will not match 2, which indicates that it is matching line number 2, because if I replace the command with awk '{if(NR~3)print}' sample.txt it prints 2 may be matched. If I also run the command awk '{if(NR~/^2$/)print}' sample.txt, it matches the exact same line, i.e. line 2.
However the source I am referring to at http://www.youtube.com/watch?feature=player_detailpage&v=Htnno4CHVus#t=502s seems to indicate otherwise.
What am I missing, and how is the command awk '{if(NR~2)print}' sample.txt different from awk '{if(NR~/^2$/)print}' sample.txt?
The condition NR~2 is checking whether the record number, NR, matches 2. For a 2 or 3 line input file, the expression is equivalent to NR == 2. Similarly with NR~3, of course. Try awk '{ if ($0 ~ /2/) print }' sample.txt instead. That will print all lines where the text of the line ($0) contains a 2. By default, a regular expression matches against the whole line; you could limit it to a particular field with $3 ~ /3/, for example.
An awk program consists of patterns and actions, where either the pattern or the action may be omitted. Take a three-line program as an example.
The first line, { if ($0 ~ /2/) print }, has no pattern; the action in the { ... } is executed for each input line (but only some input lines will generate output because of the conditional). All lines that contain a 2 will be printed. (If there is no argument to print, it prints $0 followed by a newline.)
The second line has a pattern, such as /2/, but no action; all lines that contain a 2 will be printed again. (The missing action is equivalent to { print }.)
The third line has both a pattern and an action; all lines that both contain a 2 and also contain an ‘a’ followed by a ‘z’ will be remarked upon.
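A three-rule program of the kind described above might look like the following sketch. The input lines, the third rule's exact pattern (/2/ && /a.*z/) and its message text are assumptions made for illustration, not the original listing.

```shell
# Hypothetical three-line input, chosen so that each rule behaves differently.
input='a1z
line 2
a 2 z'

# Rule 1: action only (runs for every line; the if decides what is printed).
# Rule 2: pattern only (the implied action is { print }).
# Rule 3: pattern and action (the message text is an assumption).
three_rules=$(printf '%s\n' "$input" | awk '
{ if ($0 ~ /2/) print }
/2/
/2/ && /a.*z/ { print "also has a...z: " $0 }
')

printf '%s\n' "$three_rules"
```

With this input, a line containing a 2 is printed once by the first rule and once more by the second; only a line that also has an ‘a’ followed later by a ‘z’ triggers the third rule.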
The first command will print line numbers 2, 12, 20..29, 32, 42, … 102, 112, 120..129, … 200..299, …; all lines where the line number contains a 2.
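One way to check this, and to compare it with the anchored form, is to feed awk an input whose lines are simply numbered, so the text of each line equals its record number. This is a minimal sketch; the 30-line input generated with seq and the variable names are assumptions for illustration.

```shell
# Generate 30 lines containing the numbers 1..30, so line N has the text "N".
# Unanchored: NR is treated as a string and matched against the regex 2.
unanchored=$(seq 1 30 | awk '{ if (NR ~ 2) print }' | paste -s -d ' ' -)

# Anchored: only the record number that is exactly "2" matches.
anchored=$(seq 1 30 | awk '{ if (NR ~ /^2$/) print }' | paste -s -d ' ' -)

printf 'NR ~ 2     : %s\n' "$unanchored"   # 2 12 20 21 ... 29
printf 'NR ~ /^2$/ : %s\n' "$anchored"     # 2
```

On a three-line file the two forms cannot be told apart, which is why both commands in the question print the same line.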
The second command will print only line number 2 because the /^2$/ constrains the value to contain start of string, digit 2 and end of string.
Now I’ve looked at the YouTube resource, I think you must have misunderstood what it is trying to teach. When it talks about {if (NR~2) print}, it should be saying it will print any line number which contains a 2; the video cites line numbers 2, 12, 20, 21, 22, etc. It should not be saying any line which contains a 2; I think the video does say that, but the video misspoke (but the text was accurate). The comparison against NR is not actually wrong, but it is unconventional; I’m not sure that I’d include regexes against NR in an introductory video describing awk. So, the video appears to have a glitch in the audio but the text on screen is accurate, I think. I may still have missed something.
That command, given the three-line input, will print all three lines; they all contain the digit 2.
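The difference between matching the line text and matching the line number can be tried directly on the three sample.txt lines from the question. This is a sketch only; the variable names are hypothetical, and the $1 ~ /2/ field test is an added illustration rather than something taken from the question.

```shell
# The three lines of sample.txt as given in the question.
sample='2 will be matched
This will not match 2
2 may be matched'

# Match against the text of each line: every line contains a 2.
by_text=$(printf '%s\n' "$sample" | awk '{ if ($0 ~ /2/) print }')

# Match against the record number: only line number 2 qualifies.
by_number=$(printf '%s\n' "$sample" | awk '{ if (NR ~ 2) print }')

# Match against the first field only (pattern with no action).
by_field=$(printf '%s\n' "$sample" | awk '$1 ~ /2/')

printf 'by text:\n%s\n\nby number:\n%s\n\nby first field:\n%s\n' \
    "$by_text" "$by_number" "$by_field"
```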
No; the pattern was empty (because there was nothing before the open brace), so all lines match it, and the action was the part in braces, { if ($0 ~ /2/) print }. Now, the action contains a conditional, but that’s a separate issue.
Yes.