I have a file dict containing one integer on each row 123 456 I

Asked: June 10, 2026 (2026-06-10T00:59:37+00:00)

I have a file named dict containing one integer per line:

123
456

I want to find the lines in another file, named file, that contain exactly the integers in dict.

If I use

$ grep -w -f dict file

I get false matches such as

12345  foo
23456  bar

These are false matches because 12345 != 123 and 23456 != 456. The problem is that the -w option treats digits as word characters too. The -x option will not work either, as the lines in file can contain other text.

What's the best way to do this? It would be great if the solution offered progress monitoring and good performance when dict and file are large.

1 Answer

Editorial Team · Answer 1 · 2026-06-10T00:59:39+00:00

A fairly general method using awk:

awk 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) { print $0; next } }' dict file

Explanation:

FNR==NR { }  ## FNR is the number of records read so far from the current input file.
             ## NR is the total number of records read so far.
             ## The two are equal only while reading the first file, so this
             ## block simply means `while we're reading the 1st file
             ## called dict; do ...`

array[$1]++; ## Use the first column ($1) as a key in an array called `array`.
             ## I could use $0 (the whole line) here, but since you have said
             ## that there will only be one integer per line, I decided to use
             ## $1 (it ignores leading and trailing whitespace, if any)

next         ## skip the remaining rules and read the next line of `dict`

for (i=1; i<=NF; i++)  ## loop through each column of the current line in `file`

if ($i in array)       ## if this column is a key in the array

{ print $0; next }     ## print the whole line once, then move on to the next line,
                       ## so a line with several matching columns is not repeated
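To check the behaviour on the sample data from the question, here is a small self-contained sketch. The temporary directory, the function name `match_dict` and the two extra input lines (`456  baz 123` and `qux 123`) are assumptions made for illustration. The awk program is the one from this answer, written so that it stops at the first matching column and each line is printed at most once.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of "$2" that have a whitespace-separated
# column equal to one of the integers listed in "$1".
match_dict() {
    awk 'FNR==NR { array[$1]++; next }
         { for (i=1; i<=NF; i++) if ($i in array) { print $0; next } }' "$1" "$2"
}

tmp=$(mktemp -d)                       # scratch directory for the demo
printf '123\n456\n' > "$tmp/dict"      # the two integers from the question
printf '12345  foo\n23456  bar\n456  baz 123\nqux 123\n' > "$tmp/file"

result=$(match_dict "$tmp/dict" "$tmp/file")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With this input, the lines starting with 12345 and 23456 are not printed, because the comparison is made against whole columns rather than substrings. Note that a column such as `123,` (with attached punctuation) would not match either, since awk splits on whitespace by default.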

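To process multiple files using a bash loop:

## This will process files; like file, file1, file2, file3 ...
## And create output files like, file.out, file1.out, file2.out, file3.out ...
## Note: file* also matches any *.out files left over from an earlier run,
## so remove them (or write the output elsewhere) before running it again.

for j in file*; do awk -v FILE="$j.out" 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) { print $0 > FILE; next } }' dict "$j"; done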

If you are interested in using tee on multiple files, you may like to try something like this:

for j in file*; do awk -v FILE="$j.out" 'FNR==NR { array[$1]++; next } { for (i=1; i<=NF; i++) if ($i in array) { print $0 > FILE; print FILENAME, $0; next } }' dict "$j"; done 2>&1 | tee output

This will show you the name of the file being processed and each matching record found, and write a 'log' to a file called output.
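The question also asks for progress monitoring on large inputs. One possible sketch, which is not part of the original answer, prints a progress line to standard error every `step` lines of the data file. The function name `match_with_progress`, the `step` variable and the demo data are assumptions for illustration. Writing to "/dev/stderr" from awk works in gawk, mawk and most other implementations, but it is not guaranteed everywhere.

```shell
#!/bin/sh
# Hypothetical wrapper: $1 = dict, $2 = data file, $3 = progress interval in lines.
# Matching lines go to stdout, progress messages go to stderr.
match_with_progress() {
    awk -v step="$3" '
        FNR==NR { array[$1]++; next }          # load dict into the array
        FNR % step == 0 {                      # only reached for the data file
            printf "%s: %d lines read\n", FILENAME, FNR > "/dev/stderr"
        }
        { for (i=1; i<=NF; i++) if ($i in array) { print $0; next } }
    ' "$1" "$2"
}

tmp=$(mktemp -d)                               # scratch directory for the demo
printf '123\n456\n' > "$tmp/dict"
printf '12345  foo\n23456  bar\n456  baz 123\nqux 123\n' > "$tmp/file"

# Keep matches and progress apart: stdout is captured, stderr goes to a log.
matches=$(match_with_progress "$tmp/dict" "$tmp/file" 2 2>"$tmp/progress.log")
progress_count=$(grep -c 'lines read' "$tmp/progress.log")
printf '%s\n' "$matches"
rm -rf "$tmp"
```

On performance: each column is checked with a single array lookup, so the running time grows roughly linearly with the size of file, while memory use grows with the number of entries in dict. For real data a much larger interval, such as 1000000, keeps the progress output readable.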
