I have a script that records files with UTF8 encoded names. However the script’s

Asked: May 12, 2026 (2026-05-12T12:52:53+00:00)

I have a script that records files with UTF-8 encoded names. However, the script’s encoding/environment wasn’t set up right, and it just recorded the raw bytes. I now have lots of lines in the file like this:

.../My\ Folders/My\ r\303\266m/...

So spaces in the filenames are escaped as "\ ", and non-ASCII characters appear as octal escapes of their UTF-8 bytes, like \303\266 (which is ö). I want to reverse this encoding. Is there an easy set of bash command-line tools I can chain together to undo it?

I could write a huge number of sed commands, but it would take ages to list all the non-ASCII characters we have. Or I could start parsing it in Python. But I’m hoping there’s some trick I can use.

1 Answer

Editorial Team · Answer 1 · 2026-05-12T12:52:53+00:00

In the end I used something like this:

cat file | sed 's/%/%%/g' | while read -r line ; do printf "${line}\n" ; done | sed 's/\\ / /g'

Some of the filenames had % in them, which is a special character in a printf format string, so I had to double it up (%%) so that it would be passed straight through. The -r option stops read from interpreting the backslashes, so the octal escapes reach printf intact and printf expands them. Neither read -r nor printf turns "\ " into " ", however, so I needed the final sed.
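For reference, here is a slightly hardened sketch of the same idea, wrapped in a function. The name decode_escaped_names is made up for illustration and is not from the original command. It assumes the input contains only octal escapes and "\ ", as in the sample line above. Compared with the one-liner, it sets IFS= so leading and trailing whitespace survive, passes -- so a line starting with a dash is not taken as a printf option, and does both substitutions in a single sed before the loop.

```shell
# decode_escaped_names: hypothetical helper name, chosen for this sketch.
# Reads escaped lines on stdin and writes the decoded lines to stdout.
decode_escaped_names() {
    # Turn "\ " into a plain space and double every % so that printf
    # treats it as a literal character rather than a format specifier.
    sed -e 's/\\ / /g' -e 's/%/%%/g' |
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes intact.
    # The "|| [ -n ... ]" part handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # The line is used as the printf format string, so printf expands
        # the \NNN octal escapes into raw bytes. "--" ends option parsing.
        printf -- "${line}\n"
    done
}
```

It could then be run as decode_escaped_names < file > decoded.txt. Because the line is used as a format string, any other backslash sequence printf understands (such as \n or \t) would be expanded too, so this is only safe if such sequences cannot occur in the recorded names.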
