I need to print out only one of various consecutive lines with same first

Asked: May 30, 2026

I need to print only one line from each run of consecutive lines that share the same first field: the one with the most elements in its last field. The last field is a list of words, and I want the line whose list is longest. If several lines tie for the maximum, any of them is fine.

Example input:

("aborrecimento",[Noun],[Masc],[Reg:Sing],[Bulk])
("aborrecimento",[Noun],[Masc],[Reg:Sing],[Device,Concrete,Count])
("aborrecimento",[Noun],[Masc],[Reg:Sing],[])
("adiamento",[Noun],[Masc],[Reg:Sing],[])
("adiamento",[Noun],[Masc],[Reg:Sing],[Count])
("adiamento",[Noun],[Masc],[Reg:Sing],[VerbNom])

Example output:

("aborrecimento",[Noun],[Masc],[Reg:Sing],[Device,Concrete,Count])
("adiamento",[Noun],[Masc],[Reg:Sing],[VerbNom])

A solution in awk would be nice, but it does not need to be a one-liner.

1 Answer

Editorial Team · Answer 1 · 2026-05-30T00:15:57+00:00

Generate an index file (gensub is a GNU awk extension, so this step needs gawk):

$ sed 's/,\[/|[/g' input.txt |
awk -F'|' '
# n = number of comma-separated elements in the last field (0 for "[]")
{if(!gensub(/[[\])]/, "", "g", $NF))n=0;else n=split($NF, a, /,/); print NR,$1,n}
' |
sort -k2,2 -k3,3nr |
awk '$2!=x{x=$2;print $1}' >idx.txt

Content of the index file:

$ cat idx.txt
2
5

Select the lines:

$ awk 'NR==FNR{idx[$0]; next}; (FNR in idx)' idx.txt input.txt
("aborrecimento",[Noun],[Masc],[Reg:Sing],[Device,Concrete,Count])
("adiamento",[Noun],[Masc],[Reg:Sing],[Count])

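Note: this assumes input.txt contains no spaces, because the intermediate records passed to sort and the final awk are split on whitespace.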
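For comparison, the same selection could probably be done in a single pass without gensub, sort, or an index file. The following is only a sketch, not the method above: the function name pick_fullest is made up for illustration, and it assumes the first field ends at the first ",[" and the last field is the final [...] on the line, as in the example input.

```shell
# Hypothetical helper (the name is an assumption): reads the files given as
# arguments, or stdin, and prints one line per run of consecutive lines
# that share the same first field.
pick_fullest() {
  awk '
  {
    # First field: everything before the first ",["
    key = $0
    sub(/,\[.*/, "", key)

    # Last field: the text inside the final [...]
    last = $0
    sub(/.*\[/, "", last)   # greedy match drops everything through the last "["
    sub(/\].*/, "", last)   # drop the closing "]" and whatever follows it

    # Number of comma-separated elements; an empty list counts as 0
    n = (last == "") ? 0 : split(last, parts, ",")

    if (NR == 1 || key != prev) {
      # A new run starts: emit the winner of the previous run, if any
      if (NR > 1) print best
      prev = key; best = $0; max = n
    } else if (n > max) {
      # Strictly greater, so the first line wins a tie
      best = $0; max = n
    }
  }
  END { if (NR > 0) print best }
  ' "$@"
}
```

Called as pick_fullest input.txt on the example input, this selects the [Device,Concrete,Count] line and the [Count] line, the same lines as the index-file approach. Because it never splits on whitespace, spaces in the input should not matter, and because it compares each line only with the previous one, a first field that reappears later in the file is treated as a separate run.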
