(I use BSD Sed.)
This bash script:
sed -E -f parsefile < parsewords.d
With this command file:
# Delete everything before BEGIN RTL and after END RTL
\?/\* BEGIN RTL \*/?,\?/\* END RTL \*/?!d
# Delete comments unless they begin with /*!
s?/\*[^!].*\*/??g
# Delete blank lines
/^[ ]*$/d
# Break line into words
s/[^A-Za-z0-9_]+/ /g
# Remove leading and trailing spaces and tabs
s/^[[:blank:]]+//
s/[[:blank:]]+$//
With this input file:
any stuff
/* BEGIN RTL */
/*! INPUTS: a b c d ph1 */ /* Comment */
x = a && b || c && d;
y = x ? a : b; /* hello */
z = ph1 ? x : z;
w = c || x || (z || d);
/* END RTL */
Produces this result:
INPUTS a b c d ph1
x a b c d
y x a b
z ph1 x z
w c x z d
That’s fine so far but what I’d really like to have is something like this:
x = a && b || c && d; x a b c d
y = x ? a : b; y x a b
z = ph1 ? x : z; z ph1 x z
w = c || x || (z || d); w c x z d
so that the original line is retained along with the mods that the script is making.
Is this possible with sed, or should I use something else? (Any other comments are welcome too.)
EDIT: This is not a parsing question. It is about retaining the original input line along with sed modifications.
A solution using ‘sed’.
Input file (infile):
‘Sed’ program (script.sed):
Execution:
Output (I don’t understand the purpose of the line containing the word ‘INPUTS’, so adapt the script if that line needs different handling):
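As a minimal sketch of how sed can keep the original line next to its modified form, the hold space can store a copy before the word-splitting substitutions run. This is not necessarily the script this answer used. The file names `keeporig.sed` and `input.d` are hypothetical, the four-space separator is an arbitrary choice, and the trailing-space trim before `h` is an addition so that a removed comment leaves no stray blank.

```shell
#!/bin/sh
# Sketch: print each kept line followed by its word list, using the hold space.
tmp=$(mktemp -d) || exit 1
trap 'rm -rf "$tmp"' EXIT

# Hypothetical command file, based on the one in the question.
cat > "$tmp/keeporig.sed" <<'EOF'
# Delete everything before BEGIN RTL and after END RTL
\?/\* BEGIN RTL \*/?,\?/\* END RTL \*/?!d
# Delete comments unless they begin with /*!
s?/\*[^!].*\*/??g
# Trim trailing spaces left behind by a removed comment
s/[ ]+$//
# Delete blank lines
/^[ ]*$/d
# Copy the line as it stands into the hold space
h
# Break the line into words and trim both ends
s/[^A-Za-z0-9_]+/ /g
s/^[ ]+//
s/[ ]+$//
# Append the word list to the saved copy, then swap it into the pattern space
H
x
# Join the two halves (saved line, newline, word list) with a separator
s/\n/    /
EOF

# Sample input from the question.
cat > "$tmp/input.d" <<'EOF'
any stuff
/* BEGIN RTL */
/*! INPUTS: a b c d ph1 */ /* Comment */
x = a && b || c && d;
y = x ? a : b; /* hello */
z = ph1 ? x : z;
w = c || x || (z || d);
/* END RTL */
EOF

result=$(sed -E -f "$tmp/keeporig.sed" < "$tmp/input.d")
printf '%s\n' "$result"
```

With this sketch the `/*! INPUTS: ... */` line is also printed alongside its word list. The `h` command could be moved earlier in the script if the untouched input line, comments included, should be kept instead.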