I’ve got a file called ‘res’ that’s 29374 characters of HTTP data in a one-line string. Inside it, there are several HTTP links, but I only want to display those that end in ‘/idNNNNNNNNN’, where N is a digit. In fact, I’m only interested in the string ‘idNNNNNNNNN’.
I’ve tried with:
cat res | sed -n '0,/.*\(id[0-9]*\).*/s//\1/p'
but I get the whole file.
Do you know a way to do it?
The Perl one-liner broken down under “How it works” below should work. It assumes exactly 9 digits; that’s the {9} in the pattern. You can match 8 or 9 ({8,9}), 8 or more ({8,}), up to 9 ({0,9}), etc.
Example of this working:
That’s with the 0 to 9 variant, of course.
If you’re stuck with a pre-5.10 perl, use -e instead of -E and print "$1\n" instead of say $1.
How it works
First, the two command-line arguments to Perl.
-n tells Perl to read input from standard input or files given on the command line, line by line, setting $_ to each line. $_ is Perl’s default target for a lot of things, including regular expression matches. -E merely tells Perl that the next argument is a Perl one-liner, using the new language features (vs. -e, which does not use the 5.10 extensions).
So, looking at the one-liner:
say means to print out some value, followed by a newline. $1 is the first regular expression capture (captures are made by parentheses in regular expressions). while is a looping construct, which you’re probably familiar with.
m is the match operator, and the ! after it is the regular expression delimiter (normally you see / here, but since the pattern contains /, it’s easier to use something else so you don’t have to escape the / as \/). /id(\d{9}) is the regular expression to match. Keep in mind that the delimiter is !, so the / is not special; it just matches a literal /. The parentheses form a capture group, so $1 will be the number.
The final ! is the closing delimiter, followed by g, which means to match as many times as possible (as opposed to once). This is what makes it pick up all the URLs in the line, not just the first. As long as there is a match, the m operator will return a true value, so the loop will continue (and run that say $1, printing out the match).
Two-sed solution
I think this is one way to do this with only sed. Much more complicated!
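As a rough sketch of what a sed-only pipeline could look like (not necessarily the exact commands meant above), two sed calls can be chained. This assumes GNU sed, which accepts \n in the replacement part of s; the extract_ids_sed name and the sample line are made up.

```shell
# Made-up sample line; the real input would be the one-line file 'res'.
sample='<a href="http://example.com/app/id123456789">a</a> <a href="http://example.com/about">b</a> <a href="http://example.com/app/id987654321?mt=8">c</a>'

# Hypothetical helper. Assumes GNU sed (for \n in the replacement).
# First sed: start a new line in front of every "/id".
# Second sed: on lines that begin with "/id" plus 9 digits, keep only
# the idNNNNNNNNN part and print it; every other line is dropped by -n.
extract_ids_sed() {
  sed 's!/id!\n/id!g' "$@" | sed -n 's!^/\(id[0-9]\{9\}\).*!\1!p'
}

# Prints one idNNNNNNNNN string per matching link.
printf '%s\n' "$sample" | extract_ids_sed
```

On the real file that would be extract_ids_sed res. The ! delimiter is used for the same reason as in the Perl version: the pattern contains /, so a different delimiter saves escaping it.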