I want to split large, compressed CSV files into multiple smaller gzip files, split on line boundaries

Question

Asked: June 15, 2026

I want to split large, compressed CSV files into multiple smaller gzip files, split on line boundaries.

I’m trying to pipe gunzip to a bash script with a while read LINE. That script writes to a named pipe where a background gzip process is recompressing it. Every X characters read I close the FD and restart a new gzip process for the next split.

But in this scenario the script with the while read LINE loop consumes 90% of the CPU, because read is so inefficient here (I understand that it makes a system call to read one character at a time).

Any thoughts on doing this efficiently? I would expect gzip to consume the majority of the CPU.

1 Answer

Editorial Team · Answer 1 · 2026-06-15T15:04:39+00:00

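Use split with the -l option to specify how many lines each piece should contain, and the --filter option (GNU split) to pipe each piece through a command instead of writing it straight to a file. Inside the filter, $FILE is the name split would have used for the output file; the filter has to be quoted with single quotes to prevent the calling shell from expanding $FILE too early: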

zcat doc.gz | split -l 1000 --filter='gzip > $FILE.gz'
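
Since the question mentions splitting every X characters rather than every N lines, a size-based variant may be closer to what was asked. The following is only a sketch, assuming GNU split: -C (--line-bytes) caps each piece at a given number of bytes of complete lines, so pieces still end on line boundaries. The sample input, the 4096-byte limit and the part_ prefix are placeholders chosen for illustration.

```shell
# Build a small sample input so the sketch runs on its own (placeholder data).
workdir=$(mktemp -d)
seq 1 2500 | gzip > "$workdir/doc.gz"

# -C 4096 : put at most 4096 bytes of whole lines into each piece
# -d      : use numeric suffixes (part_00, part_01, ...)
# -       : read the decompressed data from standard input
# The filter is single-quoted so that split's shell, not ours, expands $FILE.
zcat "$workdir/doc.gz" | split -C 4096 -d --filter='gzip > "$FILE.gz"' - "$workdir/part_"

ls "$workdir"
```

Note that the limit applies to the uncompressed data, so the resulting .gz files will be smaller than the limit and will vary in size.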

If you need any additional processing, just write a script that accepts the filename as an argument and processes standard input accordingly, and use it in place of plain gzip.
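
A minimal sketch of such a script follows. The name compress_part.sh, the HELPER variable and the .csv.gz suffix are hypothetical, chosen only for illustration; the helper receives the output name as its first argument and reads the piece from standard input.

```shell
# Build a small sample input so the sketch runs on its own (placeholder data).
workdir=$(mktemp -d)
seq 1 2500 | gzip > "$workdir/doc.gz"

# Hypothetical helper script: $1 is the name split chose, stdin is the piece.
cat > "$workdir/compress_part.sh" <<'EOF'
#!/bin/sh
# Any extra per-piece processing would go here, before or after gzip.
gzip > "$1.csv.gz"
EOF
chmod +x "$workdir/compress_part.sh"

# Export the helper's path so the shell that split starts for the filter can see it.
export HELPER="$workdir/compress_part.sh"

# Single quotes again: $HELPER and $FILE are expanded by split's shell.
zcat "$workdir/doc.gz" | split -l 1000 --filter='"$HELPER" "$FILE"' - "$workdir/part_"

ls "$workdir"
```

Passing the name as an argument keeps the helper independent of split, so it can also be run by hand on a single piece.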
