I’ve written a bash script to run a series of commands, culminating in a file called DataAudit.txt. It works great… if the file I am working with happens to be called file.csv.
I’m very new to all of this and not sure how to write the script so it can work on whichever file I want to audit.
The script, called audit.sh, lives in a folder called PurgatoryCSV and the idea is that I would drop a file in there, run the script, and move the file to the next step in my workflow.
I would be grateful for any help I could get with this roadblock.
Here is the script:
#!/bin/bash
echo -n "DATA AUDIT
------------
COLUMN NAMES
------------
" > DataAudit.txt
csvcut -n file.csv >> DataAudit.txt
echo -n "
---------------------------------------
FIRST TEN ROWS OF FIRST FIVE COLUMNS
---------------------------------------
" >> DataAudit.txt
csvcut -c 1,2,3,4,5 file.csv | head -n 10 >> DataAudit.txt
echo -n "
------------
COLUMN STATS
------------
" >> DataAudit.txt
csvcut file.csv | csvstat >> DataAudit.txt
echo -n "
---END AUDIT" >> DataAudit.txt
You can use variables that are passed in from the command line: $1 for the first, $2 for the second, etc. It looks like you have two variables here, file.csv and DataAudit.txt.
If you replace file.csv with $1 and DataAudit.txt with $2, you can execute your script by passing both names as arguments, for example: bash audit.sh file.csv DataAudit.txt
Alternatively, for more readability, it is common to assign these to named variables at the top of your script, for example INPUTFILE=$1 and OUTPUTFILE=$2.
Then, in your code, you can reference these with $INPUTFILE and $OUTPUTFILE.
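Putting that together, here is a rough sketch of what the parameterized script could look like. It keeps the same csvcut and csvstat calls as the original. The run_audit function name, the usage message, and the fallback to DataAudit.txt when no second argument is given are my own assumptions, not something the original script had, so adjust them to taste.

```shell
#!/bin/bash
# Sketch of audit.sh taking the file names from the command line.
# run_audit is a hypothetical helper name; the default output name is an assumption.

run_audit() {
    local INPUTFILE="$1"                      # first argument: the CSV to audit
    local OUTPUTFILE="${2:-DataAudit.txt}"    # second argument, optional

    if [ -z "$INPUTFILE" ]; then
        echo "usage: audit.sh INPUT.csv [OUTPUT.txt]" >&2
        return 1
    fi

    # Everything inside the braces is redirected to the output file at once,
    # so there is no need to repeat ">> DataAudit.txt" on every command.
    {
        echo "DATA AUDIT"
        echo "------------"
        echo "COLUMN NAMES"
        echo "------------"
        csvcut -n "$INPUTFILE"
        echo
        echo "---------------------------------------"
        echo "FIRST TEN ROWS OF FIRST FIVE COLUMNS"
        echo "---------------------------------------"
        csvcut -c 1,2,3,4,5 "$INPUTFILE" | head -n 10
        echo
        echo "------------"
        echo "COLUMN STATS"
        echo "------------"
        csvcut "$INPUTFILE" | csvstat
        echo
        echo "---END AUDIT"
    } > "$OUTPUTFILE"
}

# Run only when at least one argument was given on the command line.
if [ "$#" -gt 0 ]; then
    run_audit "$@"
fi
```

With this version you would drop a file into PurgatoryCSV and run something like bash audit.sh myfile.csv (myfile.csv being a placeholder name). The double quotes around "$INPUTFILE" and "$OUTPUTFILE" are there so that file names containing spaces are still treated as a single argument.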