This script correctly finds the files I need and replaces consecutive spaces with a single space:
find . -type f -iname '*-[0-9][0-9][0-9][0-9][0-9]-*' ! -iname '*.gz' ! -iname '*_processed' -print0 | xargs -0 sed -i 's/ \+ / /g'
What I need now is to append _parsed to the end of each file’s filename so that the files are ignored the next time this script is run.
What is a good way to do this? Note: the files have no extension. The filenames look like this:
./1923/338810-99999-1923
./1921/999999-41406-1921
./1953/320590-99999-1953
./1911/241360-99999-1911
./1923/307330-99999-1923
./1983/802220-99999-1983
Edit: I am using CentOS 6. Python-based solutions would work as well.
If you’re looking for a way to combine your current script with the ability to append the string, you can put the results of your find into a while loop and do both at the same time (while instead of for, to support files with spaces, if you ever need this condition – thanks to @TimPote for the tip!).
An alternative, just to rename, would be to use find’s -exec option. That approach iterates through the same list of files that your original find+replace command finds, but this time just renames them as you wanted.
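A minimal sketch of both approaches, assuming GNU find and GNU sed as shipped with CentOS 6. The function names, the directory argument, and the suffix variable are illustrative and not from the original posts. The sketch also assumes the exclusion pattern should match the appended suffix, whereas the command in the question excludes *_processed.

```shell
#!/bin/sh
# Sketch only: names below are hypothetical. Assumes GNU find and GNU sed.
suffix=_parsed

# List target files under directory $1, skipping .gz files and
# anything that already ends in the suffix.
list_targets() {
    find "$1" -type f -iname '*-[0-9][0-9][0-9][0-9][0-9]-*' \
        ! -iname '*.gz' ! -iname "*$suffix"
}

# Approach 1: squeeze runs of spaces, then rename, in one while loop.
# Reading line by line copes with spaces in names, but not with newlines.
squeeze_and_rename() {
    list_targets "$1" | while IFS= read -r file; do
        sed -i 's/ \+ / /g' "$file" && mv -- "$file" "$file$suffix"
    done
}

# Approach 2: rename only, using find's -exec.
# A small sh -c wrapper builds the new name from the path ($1) and suffix ($2).
rename_only() {
    find "$1" -type f -iname '*-[0-9][0-9][0-9][0-9][0-9]-*' \
        ! -iname '*.gz' ! -iname "*$suffix" \
        -exec sh -c 'mv -- "$1" "$1$2"' sh {} "$suffix" \;
}
```

Running squeeze_and_rename . from the directory that holds the year folders would process and rename each matching file once. The rename happens only if sed succeeds, so a file that fails to process is picked up again on the next run.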