I spent a whole day trying to process some files with backslashes and spaces inside their names. No matter what I do awk (gawk) refuses to print backslashes:
echo "this/pathname/contains/spa ces/and/back\\slashes" | xargs -d'\n' -n1 -I{} bash -c 'echo "{}"; echo whatever | gawk "{printf {}}"'
this/pathname/contains/spa ces/and/back\slashes
gawk: {printf this/pathname/contains/spa ces/and/back\slashes}
gawk: ^ syntax error
gawk: {printf this/pathname/contains/spa ces/and/back\slashes}
gawk: ^ backslash not last character on line
This didn’t work since the backspace gets directly into awk code.
echo "this/pathname/contains/spa ces/and/back\\slashes" | xargs -d'\n' -n1 -I{} bash -c 'echo "{}"; echo whatever | gawk "{printf \"{}\"}"'
this/pathname/contains/spa ces/and/back\slashes
gawk: warning: escape sequence `\s' treated as plain `s'
this/pathname/contains/spa ces/and/backslashes
This worked, but awk eats the backslash. As you can see above, echo prints it but awk doesn’t.
echo "this/pathname/contains/spa ces/and/back\\slashes" | ./escape.sh | xargs -d'\n' -n1 -I{} bash -c 'echo "{}"; echo whatever | gawk "{printf \"{}\"}"'
this/pathname/contains/spa\ ces/and/back\slashes
gawk: warning: escape sequence `\ ' treated as plain ` '
gawk: warning: escape sequence `\s' treated as plain `s'
Next I tried escaping the filenames using escape.sh
#!/bin/bash
xargs -d'\n' -n1 -I{} bash -c 'echo $(printf "%q" "{}")'
Now there’s a double backslash in there but awk still complains.
echo "this/pathname/contains/spa ces/and/back\\slashes" | ./escape.sh | xargs -d'\n' -n1 -I{} bash -c 'echo "{}"; echo whatever | gawk -v VAR=$(printf "%q" "{}") "{printf VAR}"'
this/pathname/contains/spa\ ces/and/back\slashes
gawk: ces/and/back\\slashes
gawk: ^ syntax error
gawk: ces/and/back\\slashes
gawk: ^ unterminated regexp
Now awk said some nonsense about some unterminated regexp.
Any ideas? Thanks!
The fix is just to double every backslash that is fed into mawk, either in the input or via variables.
Like this:
Therefore, if a filename containing backslashes is passed inside $1, this is how you obtain the expected result.