I have a shell script with the following line to remove double quotes from

Asked: June 10, 2026 (2026-06-10T00:52:01+00:00)

I have a shell script with the following line to remove double quotes (") from a text file.

sed 's/\"//g' old_file.txt > new_file.txt

There is one more awk statement that selects only specific columns from a ^-separated text file.

Both statements work as expected, but the server hangs when the input file is more than a few GB in size. I would like to know if Python can do the same more efficiently.

Update:

The script does not actually stop the server, but mysql hosted on the same server becomes slow while the shell script is running.

1 Answer

Editorial Team · Answer 1 · 2026-06-10T00:52:02+00:00

It’s unlikely that Python could do this any faster. With a bit of work, it could do the same thing with roughly the same efficiency. If you do it wrong, it will be slower.

Both sed and awk process their input line by line. They are quite well optimized for I/O, and I don’t think you could improve on that. A Python script might be faster at the actual processing, but in this case that is very unlikely to matter.

Just pipe them like @paxdiablo suggests:

sed 's/"//g' old_file.txt | awk '...' > new_file.txt

Or, if the column format is simple enough, you can replace awk with the simpler cut, which would be faster:

sed 's/"//g' old_file.txt | cut -d' ' -f1-2,4 > new_file.txt

(This example selects columns 1, 2 and 4 from space-separated input. For the ^-separated file in the question, use -d'^' instead.)

And if you need the intermediate output, you can put tee in the pipeline to write it along the way:

sed 's/"//g' old_file.txt | tee inter_file.txt | cut -d' ' -f1-2,4 > new_file.txt

But this may actually be less efficient, since both inter_file.txt and new_file.txt will be written at the same time.

OK, now I think I understand what the problem is. Your problem is not that the script is too slow, because it is already about as fast as it can get. It is your hard drive that hits its throughput limit, so other applications using it get delayed. You could say that the script is simply too fast for your hard drive.

One solution is to try using ionice to give the script a lower I/O priority. It may help, or it may make no difference at all.

ionice -c3 -p$$

gives the lowest (idle) I/O priority to the current shell or script. Similarly, you can start your script with a given priority using:

ionice -c3 ./yourscript.sh

The results may vary depending on the I/O scheduler in use. Some schedulers will ignore this setting, and some may actually make the script slower (whenever mysql requests I/O).

Alternatively, you could use an additional program that limits the throughput going into sed, effectively slowing it down and leaving some room for mysql. You will, however, need to measure what throughput works best for you.
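
As a rough illustration of such a rate-limiting program, here is a minimal Python sketch of a filter that copies standard input to standard output at a capped rate. The script name throttle.py, the function name throttle and the default chunk size are assumptions made for this example, not something from the question. Existing tools such as pv (with its -L rate-limit option) do the same job and are probably the better choice where available.

```python
import sys
import time


def throttle(src, dst, bytes_per_sec, chunk_size=64 * 1024):
    """Copy src to dst, sleeping as needed to stay under bytes_per_sec."""
    start = time.monotonic()
    sent = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        sent += len(chunk)
        # How long the copy should have taken so far at the target rate.
        expected = sent / bytes_per_sec
        elapsed = time.monotonic() - start
        if expected > elapsed:
            time.sleep(expected - elapsed)
    dst.flush()
    return sent


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage (the limit is in bytes per second and must be measured):
    #   python3 throttle.py 10000000 < old_file.txt | sed 's/"//g' | awk '...' > new_file.txt
    throttle(sys.stdin.buffer, sys.stdout.buffer, int(sys.argv[1]))
```

The limit is enforced as an average over the whole run, so short bursts above the target rate are still possible right after a pause.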

And finally, if none of the above is an option, you could switch to Python and add a time.sleep() call every few hundred or thousand lines to pause the script for a while and let mysql do its job.
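
A minimal sketch of that last idea, assuming the ^-separated input from the question. The function name filter_file, the 0-based column indexes and the pause settings are illustrative choices, and the right values for pause_every and pause_seconds have to be found by measuring. Files are read and written as bytes so that the encoding of the input does not matter.

```python
import time


def filter_file(src_path, dst_path, columns, sep=b"^",
                pause_every=1000, pause_seconds=0.05):
    """Strip double quotes, keep the selected columns, and pause periodically."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line_no, line in enumerate(src, start=1):
            # Same effect as sed 's/"//g' followed by a column selection.
            fields = line.rstrip(b"\r\n").replace(b'"', b"").split(sep)
            # columns holds 0-based indexes; indexes past the end are skipped.
            kept = [fields[i] for i in columns if i < len(fields)]
            dst.write(sep.join(kept) + b"\n")
            # Yield the disk to other processes every pause_every lines.
            if line_no % pause_every == 0:
                time.sleep(pause_seconds)
```

Calling filter_file("old_file.txt", "new_file.txt", columns=[0, 1, 3]) would keep the first, second and fourth columns. Expect this to take longer than the sed pipeline, since the pauses trade total run time for lower disk pressure.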
