$ cat weirdo
Lunch now?
$ cat weirdo | grep Lunch
$ vi weirdo
^@L^@u^@n^@c^@h^@ ^@n^@o^@w^@?^@
I have some files that contain text with some non-printing characters like ^@ which cause my greps to fail (as above).
How can I get my grep to work? Is there some way that does not require altering the files?
It looks like your file is encoded in UTF-16 rather than an 8-bit character set. The ‘^@’ is a notation for ASCII NUL ‘\0’, which usually spoils string matching.
One technique for lossless handling of this would be to use a filter to convert UTF-16 to UTF-8, and then use grep on the output. Hypothetically, if the command were ‘utf16-utf8’, you’d write:
$ utf16-utf8 weirdo | grep Lunch
As an appallingly crude approximation to ‘utf16-utf8’, you could consider:
$ tr -d '\000' < weirdo | grep Lunch
This deletes ASCII NUL characters from the input stream (the file itself is left untouched) and lets grep operate on the ‘cleaned up’ output. In theory, it might give you false positives; in practice, it probably won’t.
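If you want to try both approaches without touching your real files, here is a minimal sketch. It assumes that `iconv` is installed and can stand in for the hypothetical ‘utf16-utf8’ filter, and that the data is big-endian UTF-16 with no byte-order mark, which is what the ‘^@’ before each letter in the vi output suggests. The temporary sample file is made up for the demonstration; check the byte order of your own files before relying on `UTF-16BE`.

```shell
# Hypothetical sample file, built to mimic the bytes shown in the question
# (big-endian UTF-16, no byte-order mark).
sample=$(mktemp)
printf '\000L\000u\000n\000c\000h\000 \000n\000o\000w\000?\000\n' > "$sample"

# Lossless route: convert to UTF-8 on the fly, then grep the converted text.
# Assumption: iconv is available and the input really is UTF-16BE.
converted=$(iconv -f UTF-16BE -t UTF-8 "$sample" | grep Lunch)

# Crude route: strip the NUL bytes, then grep whatever is left.
stripped=$(tr -d '\000' < "$sample" | grep Lunch)

printf '%s\n' "$converted" "$stripped"
rm -f "$sample"
```

Both pipelines only read the file, so nothing on disk changes. The `iconv` route is the safer of the two when the text contains non-ASCII characters, since deleting NUL bytes from those leaves bytes that are not valid text in any 8-bit-friendly encoding.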