Editorial Team
Asked: May 29, 2026


Background – I want to extract specific columns from a csv file. The csv file is comma delimited, uses double quotes as the text-qualifier (optional, but when a field contains special characters, the qualifier will be there – see example), and uses backslashes as the escape character. It is also possible for some fields to be blank.


Example Input and Desired Output – For example, I only want columns 1, 3, and 4 to be in the output file. The extracted columns should keep exactly the format they have in the original file: no escape characters removed and no extra quotes added.

Input

"John \"Super\" Doe",25,"123 ABC Street",123-456-7890,"M",A
"Jane, Mary","",132 CBS Street,333-111-5332,"F",B
"Smith \"Jr.\", Jane",35,,555-876-1233,"F",
"Lee, Jack",22,123 Sesame St,"","M",D

Desired Output

"John \"Super\" Doe","123 ABC Street",123-456-7890
"Jane, Mary",132 CBS Street,333-111-5332
"Smith \"Jr.\", Jane",,555-876-1233
"Lee, Jack",123 Sesame St,""

Preliminary Script (awk) – The following is a preliminary script I found. It works for the most part, but it fails in one particular case that I noticed, and possibly in others that I have not seen or thought of yet.

#!/usr/xpg4/bin/awk -f

BEGIN{  OFS = FS = ","  }

/"/{
    for(i=1;i<=NF;i++){
        if($i ~ /^"[^"]+$/){
            for(x=i+1;x<=NF;x++){
                $i=$i","$x
                if($i ~ /"+$/){
                    z = x - (i + 1) + 1
                    for(y=i+1;y<=NF;y++)
                        $y = $(y + z)
                    break
                }
            }
            NF = NF - z
            i=x
        }
    }
print $1,$3,$4
}

The above seems to work well until it comes across a field that contains both escaped double quotes as well as a comma. In that case, the parsing will be off and the output will be incorrect.


Question/Comments – I have read that awk is not the best option for parsing through csv files, and perl is suggested. However, I do not know perl at all. I have found some examples of perl scripts, but they do not give the desired output I am looking for and I do not know how to edit the scripts easily for what I want.

As for awk, I am familiar with it and use its basic functionality occasionally, but I do not know much of the advanced functionality, such as some of the constructs used in the script above. Is my desired output possible using awk alone? If so, would it be possible to edit the script above to fix the issue I am having with it? Could someone explain line by line what exactly the script is doing?

Any help would be appreciated, thanks!

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Added an answer on May 29, 2026 at 3:36 pm

    I’m not going to reinvent the wheel.

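    use strict;
    use warnings;
    use Text::CSV_XS;
    
    # Backslash is the escape character; eol makes print() end each row with a newline.
    my $csv = Text::CSV_XS->new({
       binary      => 1,
       escape_char => '\\',
       eol         => "\n",
    });
    
    my $fh_in  = \*STDIN;
    my $fh_out = \*STDOUT;
    
    # Keep columns 1, 3, and 4 (indexes 0, 2, 3) of every row.
    while (my $row = $csv->getline($fh_in)) {
       $csv->print($fh_out, [ @{$row}[0,2,3] ])
          or die("".$csv->error_diag());
    }
    
    # getline() also returns false on a parse error, so confirm we stopped at end of input.
    $csv->eof()
       or die("".$csv->error_diag());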
    

    Output:

    "John \"Super\" Doe","123 ABC Street",123-456-7890
    "Jane, Mary","132 CBS Street",333-111-5332
    "Smith \"Jr.\", Jane",,555-876-1233
    "Lee, Jack","123 Sesame St",
    

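    It adds quotes around addresses that didn’t have any already, and it writes the quoted empty field "" as a plain empty field. Since some addresses already have quotes around them in your input, whatever reads the file presumably handles quoted fields already.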


    Reinventing the wheel:

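    use strict;
    use warnings;
    # A field is either a quoted string (backslash escapes the next character)
    # or an unquoted run containing no quote, backslash, or comma.
    my $field = qr/"(?:[^"\\]|\\.)*"|[^"\\,]*/s;
    while (<>) {
       # Capture fields 1, 3, and 4; the trailing comma assumes a fifth column follows.
       my @fields = /^($field),$field,($field),($field),/
          or die("Can't parse line $.\n");
       print(join(',', @fields), "\n");
    }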
    

    Output:

    "John \"Super\" Doe","123 ABC Street",123-456-7890
    "Jane, Mary",132 CBS Street,333-111-5332
    "Smith \"Jr.\", Jane",,555-876-1233
    "Lee, Jack",123 Sesame St,""
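
    The pattern above hard-codes columns 1, 3, and 4. If the column list needs to change, the same $field pattern can be applied one field at a time. The following is only a sketch under that assumption: split_fields and extract_columns are hypothetical helper names, not part of any module, and the two sample rows are taken from the question.

```perl
use strict;
use warnings;

# Same field pattern as above: a quoted field with backslash escapes,
# or an unquoted run containing no quote, backslash, or comma.
my $field = qr/"(?:[^"\\]|\\.)*"|[^"\\,]*/s;

# Hypothetical helper: split one line into raw fields, leaving quotes and
# escapes exactly as they appear in the input.
sub split_fields {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                  # drop the line ending, if any
    my @fields;
    pos($line) = 0;
    while (1) {
        # $field can match the empty string, so this is not expected to fail.
        $line =~ /\G($field)/gc
            or die "Unexpected parse failure\n";
        push @fields, $1;
        last unless $line =~ /\G,/gc;      # a comma means another field follows
    }
    # Anything left over means a field was malformed (e.g. an unclosed quote).
    die "Malformed CSV near offset " . pos($line) . "\n"
        if pos($line) != length($line);
    return @fields;
}

# Hypothetical helper: return the requested columns (1-based, as in the
# question) joined by commas. Columns that do not exist come back empty.
sub extract_columns {
    my ($line, @cols) = @_;
    my @fields = split_fields($line);
    return join ',', map { $fields[$_ - 1] // '' } @cols;
}

# Two rows from the question, including the one with escaped quotes and a comma.
my @sample = (
    '"John \"Super\" Doe",25,"123 ABC Street",123-456-7890,"M",A',
    '"Smith \"Jr.\", Jane",35,,555-876-1233,"F",',
);
print extract_columns($_, 1, 3, 4), "\n" for @sample;
```

    Because the fields are never unquoted or re-quoted, the selected columns come out byte for byte as they went in. Like the loop above, this assumes one record per line, so a quoted field containing a newline would not be handled.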
    