I am using Linux and bash. I have a text file whose content is generated at run time by another program. The length, number of lines and content of the file change from time to time, but some patterns in the text stay the same. One typical example is:
123098230984LD#2e3
123098230984LD#23234
XER_3424324_23424
33: 34: 35: node:9-72-1408 &82 &34
$1231313
*3435322
link to port:323
3424242424LD#2234
332424LD#23424234
Here, I want to extract the patterns “node:NUMBER-NUMBER-NUMBER” and “port:NUMBER”, but where they occur in the text also varies from time to time. At the moment I extract the information manually. I am wondering if there is any way to extract it automatically. What makes it really difficult is that the content changes every time the file is generated.
You can use sed to extract the desired fields by getting rid of the undesired bits. The .* bits simply represent any junk and the parentheses are used to “capture” the matching text so it can be used in the replacement (as \1 and \2).
Sidebar:
If your version of sed doesn’t support -E for extended regexes, it may support -r, as with certain versions of GNU sed. Otherwise, you’ll need to escape the parentheses and + characters (that is, write \( \) and \+).
The source code for GNU sed contains a little snippet that handles -E alongside -r, but this appears to have been introduced in 4.2 (i.e., it’s in 4.2 but not in 4.1.5, the last of the 4.1 series). See here for details.
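As a rough sketch of the substitution described above (not necessarily the exact command this answer originally showed), the following runs both the -E form and the escaped form against the sample from the question. The temporary file and the variable names sample, ere_out and bre_out are assumptions made for illustration, and each expression here uses a single capture group (\1) per pattern.

```shell
#!/usr/bin/env bash
# Sample input copied from the question, written to a temporary file
# (the file and all variable names are hypothetical, for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
123098230984LD#2e3
123098230984LD#23234
XER_3424324_23424
33: 34: 35: node:9-72-1408 &82 &34
$1231313
*3435322
link to port:323
3424242424LD#2234
332424LD#23424234
EOF

# Extended-regex form (-E, or -r on older GNU sed).
# -n suppresses the default output; the trailing "p" prints a line only
# when the substitution on it succeeded, so unrelated lines are dropped.
ere_out=$(sed -n -E \
    -e 's/.*(node:[0-9]+-[0-9]+-[0-9]+).*/\1/p' \
    -e 's/.*(port:[0-9]+).*/\1/p' "$sample")

# Basic-regex form: no -E/-r, so the parentheses and + are escaped.
# Note: \+ is a GNU extension; [0-9][0-9]* is the strictly POSIX spelling.
bre_out=$(sed -n \
    -e 's/.*\(node:[0-9]\+-[0-9]\+-[0-9]\+\).*/\1/p' \
    -e 's/.*\(port:[0-9]\+\).*/\1/p' "$sample")

printf '%s\n' "$ere_out"
rm -f "$sample"
```

Because the leading and trailing .* swallow everything around the captured text, the position of the node and port fields within the file does not matter.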
And, if you need the actual values in variables, you can wrap the sed command in command substitution (taking into account the earlier comments about using -r or adding extra escaping for “lesser” sed implementations).
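One way this might look, as a minimal sketch assuming bash and a sed that accepts -E: the capture group is moved inside the node: and port: prefixes so that only the numbers end up in the variables. The names input, node, port, n1, n2 and n3 are hypothetical.

```shell
#!/usr/bin/env bash
# The two relevant lines from the question's sample (hypothetical variable name).
input='33: 34: 35: node:9-72-1408 &82 &34
link to port:323'

# Command substitution stores whatever sed prints in a shell variable.
# Use -r instead of -E, or escaped \( \) and \+, on older sed versions.
node=$(printf '%s\n' "$input" | sed -n -E 's/.*node:([0-9]+-[0-9]+-[0-9]+).*/\1/p')
port=$(printf '%s\n' "$input" | sed -n -E 's/.*port:([0-9]+).*/\1/p')

# Optionally split the three node numbers on "-" (bash here-string).
IFS=- read -r n1 n2 n3 <<< "$node"

echo "node=$node port=$port parts=$n1,$n2,$n3"
```

In real use the printf pipeline would be replaced by the generated file's name as the last argument to sed.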