I'm working in bash under Linux (Ubuntu 10).
I have a Bash script that reads lines from a gedit-created .txt file and pushes them into an array. It works as expected.
However, when my input is a .txt file generated from Excel, it throws the error
")syntax error: invalid arithmetic operator (error token is "
echo -n $elem | od -x yields
0000000 3533 0d32
0000004
I can’t help feeling I’m almost at the solution, but it is eluding me, much to my frustration. I’d appreciate some help
Thanks
@MarcB
file from excel: (cut’n’pasted from gedit; contrary to appearance there are no blank lines in this file; rather lines alternate int, str, int, str …)
0
A?GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATC
1
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGAT[AC]T
2
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTAT[AC]T
3
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCAT[AC]T
4
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAAT[AC]T
5
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGAT[AC]T
6
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATAT[AC]T
7
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCAT[AC]T
8
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAAT[AC]T
9
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGAT[AC]T
10
A?GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTAT[AC]T
rtstxt='readthrusequences.txt'
# establish readthrusequence array ---------------------------------------------
# push into sparse array the readthru adapter sequence for each TruSeq index
# use the TruSeq Index number as key to the sequence
forts=${pathsir}${rtstxt} # FileOf ReadThruSequences
rts=( $(cat ${forts}) )
idx=""
elem=""
isIdx=1
for elem in ${rts[@]}; do
    echo '$elem:'${elem}
    # echo 'elem:' ${elem} 'before IF - isIdx:' $isIdx '- idx:' $idx
    if [[ $isIdx = 1 ]]; then
        echo ' 1_block - $isIdx:'$isIdx' - elem:'$elem' - idx:'$idx;
        indexseq[$elem]=0;
        #echo " indexseq[elem] set to ${indexseq[$elem]}";
        idx=$elem;
        #echo " idx set to elem (i.e. $idx)";
        isIdx=0;
        #echo " isIdx reset to $isIdx";
        #echo " " ;
    else
        #echo " 2_block - isIdx:$isIdx - elem:$elem - idx:$idx";
        indexseq[$idx]=$elem;
        #echo " indexseq[idx] set to ${indexseq[$idx]}";
        isIdx="1"; idx="0";
        #echo " isIdx reset to $isIdx - idx reset to $idx";
        #echo "";
    fi
    # echo "keys (TruSeq index): ${!indexseq[*]}"
    # echo "vals (indexed adapter seq): ${indexseq[*]}"
done
This code pushes the file contents into an array, using the ints as indices and the strs as values.
The commented-out lines were debug statements. If the first is uncommented, the console yields
before IF - isIdx: 1 - idx:
- idx:k - $isIdx:1 - elem:0
")syntax error: invalid arithmetic operator (error token is "
pointing definitely to a line-ending issue; but I’ve banged my head on this wall for too long & haven’t found a solution. I know there’s a simple one …
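An Excel-generated file almost certainly uses the \r\n character pair to terminate each line. (There may also be a Ctrl-Z character at the end of the file.) Unix-based systems expect only the \n character to terminate a line of input (and Ctrl-D, though not usually, as the end-of-file marker). Your od output shows the 0d (\r) byte at the end of the element, so the \r is being kept as part of each array element. Bash evaluates an array subscript as an arithmetic expression, and the trailing \r is not a valid operator, hence the error. The \r also sends the cursor back to the start of the line, which is why the closing ") is printed over the beginning of the message.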
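To confirm that carriage returns are the culprit before changing anything, a check along these lines may help. This is only a sketch: the sample file is a temporary stand-in built for illustration (not the asker's readthrusequences.txt), and the variable names are made up.

```bash
#!/usr/bin/env bash
# Hypothetical sample file with Windows-style (CRLF) line endings.
sample=$(mktemp)
printf '0\r\nACGT\r\n1\r\nTTGA\r\n' > "$sample"

# Hold a literal carriage return in a variable.
cr=$(printf '\r')

# Count the lines whose last character is a carriage return.
crlf_lines=$(grep -c "${cr}\$" "$sample")
echo "lines ending in CR: $crlf_lines"

# Dump the first line character by character; od -c prints the CR as \r.
head -n 1 "$sample" | od -c

rm -f "$sample"
```

A non-zero count suggests the file came from a Windows tool and needs its line endings converted before the script reads it.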
Solution: either edit the file to remove the ^M chars (\r) at the end of each line (also check for ^Z at the end of the file and remove that too), OR the standard is
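For the first option, removing the \r characters, a minimal sketch might look like the following. It assumes the file strictly alternates index and sequence lines as in the question; the sample data, the temporary files, and the names input, clean, idx and seq are illustrative only. It also reads the lines in pairs rather than reusing the question's flag-based loop.

```bash
#!/usr/bin/env bash
# Hypothetical sample input with CRLF endings, alternating index / sequence.
input=$(mktemp)
clean=$(mktemp)
printf '0\r\nACGT\r\n1\r\nTTGA[AC]T\r\n' > "$input"

# tr -d '\r' deletes every carriage return, leaving plain \n line endings.
tr -d '\r' < "$input" > "$clean"

# Read two lines at a time: the first is the index, the second the sequence.
indexseq=()
while IFS= read -r idx && IFS= read -r seq; do
    indexseq[$idx]=$seq
done < "$clean"

# Print the resulting key/value pairs.
for key in "${!indexseq[@]}"; do
    printf '%s -> %s\n' "$key" "${indexseq[$key]}"
done

rm -f "$input" "$clean"
```

Where the dos2unix utility is installed, running it on the file performs the same conversion in place. Reading with read -r also avoids the word splitting and glob expansion that an unquoted $(cat ...) applies to patterns such as [AC].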
I hope this helps.