I’ve got a file (possibly binary) that contains mostly non-printable ASCII characters as the

Question

Asked: May 31, 2026 (2026-05-31T20:02:29+00:00)

I’ve got a file (possibly binary) that contains mostly non-printable ASCII characters as the output of the octal dump utility, below, shows.

od  -a MyFile.log 
0000000  cr  nl esc   a soh nul esc   * soh   L soh nul nul nul nul nul
0000020 nul soh etx etx etx soh nul nul nul nul nul nul nul nul nul nul
0000040 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
*
0000100 nul nul nul nul nul soh etx etx etx nul nul nul nul nul nul nul
0000120 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
0000140 nul nul nul nul nul nul nul nul soh etx etx etx soh nul nul nul
0000160 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
0000200 nul nul nul nul nul nul nul nul nul nul nul soh etx etx etx etx
0000220 etx soh etx etx etx etx etx etx etx soh etx etx etx etx etx etx
0000240 etx soh etx etx etx etx etx soh soh soh soh soh nul nul nul nul
0000260 nul nul nul nul nul nul nul nul nul nul nul nul nul nul etx etx
0000300 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul

I’d like to do the following:

Break the file into paragraph-like sections, each starting with one of the characters esc, fs, gs or us (ASCII codes 27, 28, 29 and 31).
Have the output file show the bytes as human-readable names, the way octal dump does.
Store the result in a file.

What would be the best way of doing this? I’d prefer to use UNIX/Linux shell utilities, e.g. grep, to perform this task instead of a C program.

Thanks.

Edit: I’ve used the command od -A n -a -v MyFile.log to remove the offsets from the output, as follows:

  cr  nl esc   a soh nul esc   * soh   L soh nul nul nul nul nul
 nul soh etx etx etx soh nul nul nul nul nul nul nul nul nul nul
 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 nul nul nul nul nul soh etx etx etx nul nul nul nul nul nul nul
 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 nul nul nul nul nul nul nul nul soh etx etx etx soh nul nul nul
 nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul nul
 nul nul nul nul nul nul nul nul nul nul nul soh etx etx etx etx
 etx soh etx etx etx etx etx etx etx soh etx etx etx etx etx etx
 etx soh etx etx etx etx etx soh soh soh soh soh nul nul nul nul
 nul nul nul nul nul nul nul nul nul nul nul nul nul nul etx etx

I’d like to continue from this output, perhaps by piping it to some other utility, e.g. awk.

1 Answer

Editorial Team · Answer 1 · 2026-05-31T20:02:30+00:00

od -a -An -v file | perl -0777ne 's/\n//g,print "$_\n " for /(?:esc| fs| gs| us)?(?:(?!esc| fs| gs| us).)*/gs'

od -a -An -v file → octal dump of file with named characters (-a), no addresses (-An), and no suppression of duplicate lines (-v).
-0777 → slurp the whole input as a single record (0777 is not a valid character code, so no record separator ever matches).
-n → use an implicit loop to read the input (here a single iteration, since the whole input is one record).
for /(?:esc| fs| gs| us)?(?:(?!esc| fs| gs| us).)*/gs → for every (/g) section that optionally begins with esc, fs, gs or us and continues with the longest possible run of characters (including newlines: /s) that contains none of esc, fs, gs or us.
s/\n//g → remove the newlines that od inserted.
print "$_\n " → print the section followed by a newline (and a space, to match od’s column formatting).
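Since the question mentions awk, the same idea can also be sketched without Perl. This is only an illustrative alternative, not part of the original answer: the function name split_sections is hypothetical, and it assumes that od -a prints the four separators as the tokens esc, fs, gs and us, as in the dumps above. awk splits od's output on whitespace, so each named character becomes one field; a newline is emitted before every separator token, and each token is padded to four columns to mimic od's layout.

```shell
# split_sections FILE
# Hypothetical helper: print FILE as od-style named characters,
# one section per line, each section starting at esc, fs, gs or us.
split_sections() {
    od -An -a -v "$1" | awk '
        {
            for (i = 1; i <= NF; i++) {
                # A separator token opens a new section, unless it is
                # the very first token of the dump.
                if ($i ~ /^(esc|fs|gs|us)$/ && started) printf "\n"
                printf "%4s", $i    # pad to 4 columns like od does
                started = 1
            }
        }
        # Terminate the last section if anything was printed.
        END { if (started) printf "\n" }
    '
}
```

To store the result in a file, redirect the output, for example: split_sections MyFile.log > MyFile.sections.txt (the output file name is just a placeholder). Any bytes before the first separator, such as the leading cr and nl in the sample, end up on a first line of their own.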
