I needed to convert a uTorrent-style ipfilter.dat into a bluetack-style ipfilter file, and wrote this shell script to do it:
#!/bin/bash
# read ipfilter.dat-formatted file line by line
# (example: 000.000.000.000-008.008.003.255,000,Badnet
# - ***here, input file's lines/fields are always the same length***)
# and convert into a bluetack.co.uk-formatted output
# (example: Badnet:0.0.0.0-8.8.3.255
# - fields moved around, leading zeros removed)
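while read -r record
do
    # characters 0-14 hold the start address, 16-30 the end address;
    # awk's "+0" turns each zero-padded octet into a plain number
    start=$(echo "${record:0:15}" | awk -F '.' '{for(i=1;i<=NF;i++)$i=$i+0;}1' OFS='.')
    end=$(echo "${record:16:15}" | awk -F '.' '{for(i=1;i<=NF;i++)$i=$i+0;}1' OFS='.')
    # the list name starts at character 36
    echo "${record:36:7}:${start}-${end}"
done < "$1"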
However, on a 2000-line input file this script takes on average 10(!) seconds to complete – a mere 200 lines/sec.
I’m sure the same result can be achieved with sed, and a sed version is likely to be much faster.
Is there a sed guru around who could suggest a solution for this kind of fixed-position replacement?
Feel free to suggest a solution in other languages as well – I would enjoy testing a Python or a C version, for example. A more efficient shell/bash version would be welcome as well.
You could try this.
I didn’t test the performance, but I guess it could be faster than 200 lines/sec.
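For illustration, a single-pass sed version could look something like the sketch below. It is not necessarily the code this answer had in mind. It assumes every line has exactly the layout shown in the question (start-end,code,name with no spaces around the separators), and the function name convert_ipfilter and the helper variable o are made up for this example.

```shell
#!/bin/bash
# Sketch: convert ipfilter.dat lines to bluetack format with one sed process.
# Assumed input layout (taken from the example in the question):
#   000.000.000.000-008.008.003.255,000,Badnet

# convert_ipfilter is a hypothetical helper name; it reads stdin, writes stdout.
convert_ipfilter() {
    # One octet: skip up to two leading zeros, capture the remaining digits.
    local o='0{0,2}([0-9]{1,3})'
    # Groups 1-4 are the start address, 5-8 the end address, 9 the list name.
    sed -E "s/^${o}\.${o}\.${o}\.${o}-${o}\.${o}\.${o}\.${o},[0-9]*,(.*)\$/\9:\1.\2.\3.\4-\5.\6.\7.\8/"
}

# Usage: ./convert.sh ipfilter.dat > bluetack.txt
if [ "$#" -gt 0 ]; then
    convert_ipfilter < "$1"
fi
```

Since sed is started once for the whole file instead of awk being started twice per line, most of the per-line process overhead of the original loop should disappear, though I have not benchmarked it. Lines that do not match the expected layout are passed through unchanged, so malformed records are easy to spot in the output.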