I have this script, but it only handles the parent folder; it does not rename subfolders or the files inside them. I’m hoping to get it to replace every non-alphanumeric character with a _.
#!/bin/bash
for f in *
do
    new="${f// /_}"
    if [ "$new" != "$f" ]
    then
        if [ -e "$new" ]
        then
            echo not renaming \""$f"\" because \""$new"\" already exists
        else
            echo moving "$f" to "$new"
            mv "$f" "$new"
        fi
    fi
done
Getting the file and directory list:
To operate on files recursively, using find is a better solution than globs. I would recommend populating a bash array with the file names before you start operating on them. I also think it would be prudent to do this one step at a time: directories first, then files. You don’t want to rename a directory and then, while renaming a file later, discover that the file’s old path no longer exists. For the same reason, it is important that the script works through progressively deeper levels of the filesystem hierarchy (hence the use of sort below).
Operating on the lists:
Once you have the lists, you can call a common shell function to normalise and rename each file or directory name. Please note the importance of quoting the names properly to get what you want. This is extremely important because bash (like any other shell) treats spaces as word boundaries when parsing the command line.
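The two steps above could be sketched roughly as follows. This is only an illustration, not the script from this answer: the function names normalise_and_rename, collect and rename_tree are made up here, only spaces are replaced (as in the question's loop), and it assumes a find that supports -mindepth, -maxdepth and -print0 plus a sort that supports -z (the GNU versions do). Rather than building one list up front, it rebuilds the array at each depth, so the paths it sees already reflect any renamed parents.

```bash
#!/bin/bash

# Rename one path so that spaces in its last component become "_".
# Only the last component is touched; the parent part is left as is.
normalise_and_rename() {
    local path=$1
    local dir=${path%/*}
    local base=${path##*/}
    local new=${base// /_}

    if [ "$new" = "$base" ]; then
        return 0                      # nothing to rename, stay quiet
    fi
    if [ -e "$dir/$new" ]; then
        echo "not renaming \"$path\" because \"$dir/$new\" already exists"
    else
        echo "moving \"$path\" to \"$dir/$new\""
        mv -- "$path" "$dir/$new"
    fi
}

# Fill the caller's array "list" with the sorted entries of one type
# ("d" or "f") found exactly "depth" levels below "root".
collect() {
    local root=$1 depth=$2 type=$3 entry
    list=()
    while IFS= read -r -d '' entry; do
        list+=("$entry")
    done < <(find "$root" -mindepth "$depth" -maxdepth "$depth" \
                  -type "$type" -print0 | sort -z)
}

# Walk the tree level by level: directories first, then files.
# Other file types (symlinks and so on) are left alone.
rename_tree() {
    local root=${1:-.} depth=1 ndirs entry
    local -a list
    while :; do
        collect "$root" "$depth" d
        ndirs=${#list[@]}
        for entry in "${list[@]}"; do normalise_and_rename "$entry"; done

        collect "$root" "$depth" f
        for entry in "${list[@]}"; do normalise_and_rename "$entry"; done

        # No directories at this level means nothing can exist below it.
        if [ "$ndirs" -eq 0 ]; then break; fi
        depth=$((depth + 1))
    done
}
```

Calling rename_tree . from the top of the tree would process everything below the current directory. Every name is passed around quoted, which is what keeps a path such as "a b/c d" in one piece. It is probably wise to try it on a copy of the tree first.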
The script:
The following script (named ./rename_spaces.bash in the example output below) should do what you want. To handle additional weird characters, add them to the weirdchars variable. Note that you need to escape the characters as appropriate (e.g. the single quote has been escaped). The script skips a rename, with a message, if the new file name already exists. This also means it will print the message for trivial renames (names that had no weird characters to begin with). This can be annoying to some people (e.g. me :-p)
Here is a sample output for a directory tree containing directories and files with spaces in their names, before and after running the above script.
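As a rough illustration of the weirdchars idea (a sketch, not the script itself), the substitution could look like the following. The particular character set and the function name normalise are assumptions made for this example.

```bash
#!/bin/bash

# Characters to replace with "_"; this particular set is only an example.
weirdchars=" &'"

# Print the given name with every character listed in weirdchars
# replaced by "_". The name itself is not renamed on disk here.
normalise() {
    printf '%s\n' "${1//[$weirdchars]/_}"
}

normalise "Tom & Jerry's file"    # prints Tom___Jerry_s_file
```

Because the characters end up inside a bracket expression, adding ones such as ], ^, - or a backslash needs extra care, since they have a special meaning there.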
Note: Implementing the script to “do the right thing” for anything non-alphanumeric seems to be non-trivial. For example, I am not sure how to deal with dots or pre-existing underscores or other “regular” allowed characters in the file or directory names.
Identifying undesirable/special characters in a generic way is also a problem. It is even more complicated in international language environments. I do not know any easy way of saying “allow only numerals or characters from the English alphabet”. If anyone has ideas, please go ahead and post an answer.
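One possible idea, offered as a sketch rather than a tested solution: bash pattern substitution accepts a negated bracket expression, and forcing the C locale inside the function should make the ranges A-Z, a-z and 0-9 mean plain ASCII only. The function name sanitize_name is made up here, and keeping ".", "_" and "-" is an arbitrary choice that sidesteps the question of dots and existing underscores rather than answering it.

```bash
#!/bin/bash

# Replace everything that is not an ASCII letter, a digit, ".", "_"
# or "-" with "_". Keeping ".", "_" and "-" is an arbitrary choice.
sanitize_name() {
    local LC_ALL=C    # make A-Z, a-z and 0-9 mean plain ASCII ranges
    printf '%s\n' "${1//[^A-Za-z0-9._-]/_}"
}

sanitize_name 'my file (1).txt'    # prints my_file__1_.txt
```

In the C locale the substitution works on bytes, so a multi-byte character (an accented letter in UTF-8, say) is likely to turn into more than one underscore.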