I have a log file whose column delimiter is the single character \x01 (a non-printing control character, so it should never show up in the data itself). When I run the following on the line “this is \x01” on a CentOS box, I get:
cat ~/temp1 | sed 's/\x01/meh/'
this is meh
On a Mac, I get:
cat ~/temp1 | sed 's/\x01/meh/'
this is
That is identical to what I get when I simply cat the original file.
Alternatively, running a Perl one-liner on the same file on a Mac:
cat ~/temp1 | perl -e 'while ( my $line = <>) {$line =~ s/\x01/meh/g; print $line;}'
gets me:
this is meh
So my conclusion so far is that sed on a Mac hates Unicode for some reason. Does anyone know why, or how to fix it?
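The sed that ships with the Mac is the BSD version, and it does not interpret \x01 as a byte escape; \xHH escapes in sed are a GNU extension, which is why the same command works on CentOS. Use GNU sed instead, which MacPorts provides as the package gsed.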
Edit: Documentation of GNU sed is here.
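If installing GNU sed is not an option, one possible workaround (a sketch, not something the answer above proposes) is to let printf produce the byte from an octal escape and hand it to sed as a literal character, so sed never has to understand \x01 itself. The temporary file and the variable names below are made up for illustration.

```shell
# Create a one-line sample file containing the \x01 byte.
# printf's octal escape \001 is part of POSIX, unlike sed's \x01.
tmpfile=$(mktemp)
printf 'this is \001\n' > "$tmpfile"

# Let the shell produce the literal byte, then splice it into the sed script.
soh=$(printf '\001')
result=$(sed "s/${soh}/meh/" "$tmpfile")

echo "$result"   # prints: this is meh
rm -f "$tmpfile"
```

Because the pattern reaches sed as a raw byte rather than an escape sequence, this should behave the same with both BSD sed and GNU sed.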