
The Archive Base


Editorial Team
Asked: May 16, 2026


A recent question noted that on OS X, running sed on a non-ASCII file gives strange results. For instance, if you run the following (/usr/bin/cal is just an arbitrary binary file):

sed 's/[^A-Z]//g' /usr/bin/cal

sed removes all of the printable characters other than A-Z, but many nonprintable characters remain. If, however, you run

LANG='' sed 's/[^A-Z]//g' /usr/bin/cal

only A-Z (and newlines) are output. Why?

Normally LANG=en_US.UTF-8. What is going on? I cannot see any way that the output of sed could be considered correct in UTF-8. Is it broken, or is there some notion of "working" that I do not understand?

I know that the OS X sed conforms to POSIX and therefore differs from the beloved GNU sed.


1 Answer

  1. Editorial Team
    Added an answer on May 16, 2026 at 3:59 am

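    Binary data, such as the contents of /usr/bin/cal, is not UTF-8, so it will confuse any code that reads it as if it were. In a UTF-8 locale, any byte with the high bit set (i.e., >= 128) is interpreted as part of a multi-byte sequence representing a single character, so a run of such bytes that happens to form a valid sequence is matched, and removed, as one character. Not all sequences of bytes with the high bit set are valid UTF-8, however, and bytes that do not decode to a character are probably not matched by [^A-Z] at all, which would explain why some non-printable characters remain but (possibly) not others. With LANG='' sed typically falls back to the C locale (assuming LC_ALL and LC_CTYPE are not set), where every byte is its own character, so [^A-Z] matches every byte outside A-Z and only A-Z and newlines survive.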
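The byte-per-character case can be reproduced on a small, controlled input instead of a whole binary. The sketch below is an illustration rather than part of the original answer: the function name strip_non_upper and the sample bytes are made up, and it pins LC_ALL=C (which takes precedence over LANG) so that every byte counts as one character. What the same command does under a UTF-8 locale differs between sed implementations, so that case is left for you to try on your own system.

```shell
#!/bin/sh
# Hypothetical helper: delete everything except A-Z, treating input as bytes.
# LC_ALL=C makes each byte a single character, so [^A-Z] also matches
# bytes >= 128 that would not be valid UTF-8 on their own.
strip_non_upper() {
    LC_ALL=C sed 's/[^A-Z]//g'
}

# Sample input: 'A', a lone 0x80 byte (octal 200), 'b', 'C', newline.
printf 'A\200bC\n' | strip_non_upper   # prints AC
```

Dropping the LC_ALL=C prefix and rerunning the last line in a UTF-8 terminal session shows whether your sed leaves the 0x80 byte in place, rejects it, or removes it.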

    In short: if you want to use text-oriented tools on binary data, don’t.
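If the goal is only to pull the capital letters out of a binary file, one option is a filter that works on bytes and never has to decide what a character is. This is a minimal sketch, assuming tr run under LC_ALL=C; the function name keep_upper_and_newlines is hypothetical and not something the original answer proposes.

```shell
#!/bin/sh
# Hypothetical helper: keep only the bytes A-Z and newline, delete the rest.
# -c complements the set and -d deletes, so every byte not listed is dropped.
# LC_ALL=C keeps the A-Z range and the input interpretation byte-based.
keep_upper_and_newlines() {
    LC_ALL=C tr -cd 'A-Z\n'
}

# Sample input mixes a high byte (octal 200) and a control byte (octal 001).
printf 'A\200b\001C\nxD\n' | keep_upper_and_newlines
```

For the file from the question, the equivalent call would be keep_upper_and_newlines < /usr/bin/cal.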
