I’ve got a file called ‘res’ that’s 29374 characters of http data in a


Asked: May 20, 2026

I’ve got a file called ‘res’ that’s 29374 characters of http data in a one-line string. Inside it, there are several http links, but I only want to display those that end in ‘/idNNNNNNNNN’ where N is a digit. In fact I’m only interested in the string ‘idNNNNNNNNN’.
I’ve tried with:

cat res | sed -n '0,/.*\(id[0-9]*\).*/s//\1/p'

but I get the whole file.
Do you know a way to do it?


1 Answer

Editorial Team · Answer 1 · 2026-05-20T07:32:57+00:00

perl -n -E 'say $1 while m!/id(\d{9})!g' input-file

should work. That assumes exactly 9 digits; that’s the {9} in the above. You can match 8 or 9 ({8,9}), 8 or more ({8,}), up to 9 ({0,9}), etc.

Example of this working:

$ echo -n 'junk jumk http://foo/id231313 junk lalala http://bar/id23123 asda' | perl -n -E 'say $1 while m!id(\d{0,9})!g'
231313
23123

That’s with the 0 to 9 variant, of course.

If you’re stuck with a pre-5.10 perl, use -e instead of -E and print "$1\n" instead of say $1.

How it works

First, the two command-line arguments to Perl. -n tells Perl to read input from standard input or from the files given on the command line, line by line, setting $_ to each line. $_ is Perl’s default target for a lot of things, including regular expression matches. -E tells Perl that the next argument is a Perl one-liner and enables the newer language features such as say (-e does the same but without the 5.10 extensions).

So, looking at the one-liner: say prints a value followed by a newline. $1 is the first regular expression capture (captures are made by parentheses in the pattern). while is a looping construct, which you’re probably familiar with. m is the match operator, and the ! after it is the regular expression delimiter (normally you see / here, but since the pattern contains a /, a different delimiter saves you from escaping it as \/). /id(\d{9}) is the regular expression itself. Because the delimiter is !, the / is not special; it just matches a literal /. The parentheses form a capture group, so $1 will be the number. The second ! closes the pattern, and the g after it tells the match to resume where the previous one left off instead of starting over from the beginning of the line. That is what makes the loop pick up all the URLs in the line, not just the first. As long as there is a match, the m operator returns a true value, so the loop continues (and runs say $1, printing the match).

Two-sed solution

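I think this is one way to do it with only sed, using two sed invocations. Much more complicated! It relies on GNU sed, which accepts \n in the replacement text. The -n option and the p flag on the second sed print only the fragments where the substitution succeeded; without them, the leading ‘junk jumk ’ fragment would be echoed unchanged.

echo 'junk jumk http://foo/id231313 junk lalala http://bar/id23123 asda' | \
    sed 's!http://!\nhttp://!g' | \
    sed -n 's!^.*/id\([0-9]*\).*$!\1!p'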
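On a sed that does not understand \n in the replacement (it is a GNU extension), the same split can be sketched with a backslash followed by a real newline, which POSIX sed accepts. The function name extract_ids_sed is made up, and the second stage uses -n with the p flag so that fragments without an id are not printed.

```shell
# Hypothetical helper: two sed stages, written without the GNU-only \n escape
extract_ids_sed() {
    # Stage 1: start a new line before every http:// (backslash + literal newline).
    # The continuation line is deliberately not indented, so no spaces are inserted.
    sed 's!http://!\
http://!g' |
        # Stage 2: keep only the digits after /id, printing only lines that matched
        sed -n 's!^.*/id\([0-9]*\).*$!\1!p'
}

printf '%s\n' 'junk jumk http://foo/id231313 junk lalala http://bar/id23123 asda' |
    extract_ids_sed
# prints 231313 and 23123, one per line
```

Like the original, this only splits on http:// links, so https:// links on the same line would need the first pattern adjusted.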
