Normally, I do something like IFS=',' columns=( $LINE ) where $LINE is a line

Question


Asked: June 16, 2026 (2026-06-16T07:34:17+00:00)


Normally, I do something like

IFS=','
columns=( $LINE )

where $LINE is a line from a CSV file I’m reading.

However, how do I handle a CSV file with embedded commas? I have to process several hundred gigabytes of files, so everything needs to be fast: no reading a line more than once, and definitely no per-field loops (the last time I tried that, it slowed things down by several factors).

The general structure of the code is as follows:

FILENAME=$1
cat "$FILENAME" | while IFS= read -r LINE
do
    IFS=","
    columns=( $LINE )
    # modify columns here
    newline="${columns[*]}"
    echo "$newline"
done

Ideally, I need something like this:

FILENAME=$1
cat "$FILENAME" | while IFS= read -r LINE
do
    IFS=","
    # code to tell bash to ignore IFS when it appears inside an open quote
    columns=( $LINE )
    # modify columns here
    newline="${columns[*]}"
    echo "$newline"
done

Any tips would be appreciated. Otherwise, I’ll probably switch to using another language to handle this stuff.


1 Answer

Editorial Team · Answer 1 · 2026-06-16T07:34:19+00:00

Embedded commas are probably just the first obvious problem you have run into while parsing those CSV files.

Other problems that might pop up later are:

embedded newline characters inside fields
embedded UTF-8 characters
special treatment of whitespace, empty fields, spaces around commas, and undef values

I generally follow this philosophy: if there is a (reputable) module that parses the
format you have to parse, use it instead of writing a homebrew parser.

I don’t think there is such a thing for bash, but there are some for Perl. I’d go for Text::CSV_XS. Since it is written in C, I expect it to be very fast.
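
To give a sense of what a homebrew parser has to do, here is a minimal bash sketch of a quote-aware split. It is only an illustration under assumptions: the function name split_csv_line is made up for this example, fields are assumed to be quoted with double quotes, and a doubled quote inside a quoted field is assumed to mean a literal quote. It walks the line one character at a time, so it is unlikely to be fast enough for hundreds of gigabytes, and it does not handle embedded newlines at all.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): split one CSV line into
# the global array `columns`, treating commas inside double quotes as data.
split_csv_line() {
    local line=$1 field="" ch
    local in_quotes=0 i
    local len=${#line}
    columns=()
    for (( i = 0; i < len; i++ )); do
        ch=${line:i:1}
        if (( in_quotes )); then
            if [[ $ch == '"' ]]; then
                if [[ ${line:i+1:1} == '"' ]]; then
                    field+='"'      # doubled quote inside quotes: literal quote
                    (( i++ ))       # skip the second quote of the pair
                else
                    in_quotes=0     # closing quote
                fi
            else
                field+=$ch          # commas land here too while quoted
            fi
        elif [[ $ch == '"' ]]; then
            in_quotes=1             # opening quote
        elif [[ $ch == ',' ]]; then
            columns+=("$field")     # an unquoted comma ends the field
            field=""
        else
            field+=$ch
        fi
    done
    columns+=("$field")             # the last field has no trailing comma
}

# Example call: the quoted comma stays inside the second field.
split_csv_line 'a,"b,c",d'
printf '[%s]\n' "${columns[@]}"
```

Even this much code covers only two of the cases listed above, which is the main reason to reach for an existing parser instead.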
