Editorial Team
Asked: June 17, 2026

I am trying to replace values in a large space-delimited text-file and could not find a suitable answer for this specific problem:

Say I have a file “OLD_FILE”, containing a header and approximately 2 million rows:

COL1 COL2 COL3 COL4 COL5
rs10 7 92221824 C A 
rs1000000 12 125456933 G A 
rs10000010 4 21227772 T C 
rs10000012 4 1347325 G C 
rs10000013 4 36901464 C A 
rs10000017 4 84997149 T C 
rs1000002 3 185118462 T C 
rs10000023 4 95952929 T G 
...

I want to replace the first value of each row with a corresponding value, using a large (2.8M rows) conversion table. In this conversion table, the first column lists the value I want to have replaced, and the second column lists the corresponding new values:

COL1_b36       COL2_b37
rs10    7_92383888
rs1000000       12_126890980
rs10000010      4_21618674
rs10000012      4_1357325
rs10000013      4_37225069
rs10000017      4_84778125
rs1000002       3_183635768
rs10000023      4_95733906
...

The desired output would be a file where all values in the first column have been changed according to the conversion table:

COL1 COL2 COL3 COL4 COL5
7_92383888 7 92221824 C A 
12_126890980 12 125456933 G A 
4_21618674 4 21227772 T C 
4_1357325 4 1347325 G C 
4_37225069 4 36901464 C A 
4_84778125 4 84997149 T C 
3_183635768 3 185118462 T C 
4_95733906 4 95952929 T G 
...

Additional info:

  • Performance is an issue (the following command takes approximately a year):

    while read a b; do sed -i "s/\b$a\b/$b/g" OLD_FILE ; done < CONVERSION_TABLE

  • A complete match is necessary before replacing
  • Not every value in the OLD_FILE can be found in the conversion table…
  • …but every value that could be replaced, can be found in the conversion table.

Any help is very much appreciated.

1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 3:34 am

    Here’s one way using awk:

    awk 'NR==1 { next } FNR==NR { a[$1]=$2; next } $1 in a { $1=a[$1] }1' TABLE OLD_FILE
    

    Results:

    COL1 COL2 COL3 COL4 COL5
    7_92383888 7 92221824 C A
    12_126890980 12 125456933 G A
    4_21618674 4 21227772 T C
    4_1357325 4 1347325 G C
    4_37225069 4 36901464 C A
    4_84778125 4 84997149 T C
    3_183635768 3 185118462 T C
    4_95733906 4 95952929 T G
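
To try the command on a small scale before running it on the full files, a minimal sketch along these lines may help. It assumes a throwaway temporary directory and an output file named NEW_FILE (neither appears in the question), and it adds one made-up row, rs999, purely to show that an ID missing from the table passes through unchanged.

```shell
#!/bin/sh
# Build small sample inputs in a temporary directory (illustrative paths only).
tmpdir=$(mktemp -d)

cat > "$tmpdir/OLD_FILE" <<'EOF'
COL1 COL2 COL3 COL4 COL5
rs10 7 92221824 C A
rs1000000 12 125456933 G A
rs999 1 12345 T C
EOF
# rs999 above is a hypothetical row that is absent from the conversion table.

cat > "$tmpdir/TABLE" <<'EOF'
COL1_b36 COL2_b37
rs10 7_92383888
rs1000000 12_126890980
EOF

# Same awk program as in the answer; the result is redirected to a new file
# instead of being printed to the terminal.
awk 'NR==1 { next } FNR==NR { a[$1]=$2; next } $1 in a { $1=a[$1] }1' \
    "$tmpdir/TABLE" "$tmpdir/OLD_FILE" > "$tmpdir/NEW_FILE"

cat "$tmpdir/NEW_FILE"
```

Because awk reads the conversion table once into memory and then makes a single pass over OLD_FILE, each row costs one array lookup rather than one full rewrite of the file per table entry.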
    

    Explanation, in order of appearance:

    NR==1 { next }            # simply skip processing the first line (header) of
                              # the first file in the arguments list (TABLE)
    
    FNR==NR { ... }           # This is a construct that only returns true for the
                              # first file in the arguments list (TABLE)
    
    a[$1]=$2                  # So when we loop through the TABLE file, we add the
                              # column one to an associative array, and we assign
                              # this key the value of column two
    
    next                      # This simply skips processing the remainder of the
                              # code by forcing awk to read the next line of input
    
    $1 in a { ... }           # Once awk has finished processing the TABLE file,
                              # it begins reading the second file in the
                              # arguments list, which is OLD_FILE. This condition
                              # is true only if column one exists as a key in
                              # the array
    
    $1=a[$1]                  # re-assign column one's value to be the value held
                              # in the array
    
    1                         # The 1 on the end simply enables default printing
                              # of every line, replaced or not. It would be like
                              # saying: $1 in a { $1=a[$1] } { print $0 }
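
Since the question notes that not every value in OLD_FILE appears in the conversion table, it can be useful to know how many rows were left untouched. The following is a sketch of one possible variant, not the answer's original command: it spells out each rule separately and reports the unmatched count on standard error. The temporary directory, the NEW_FILE name, the `missing` counter, and the rs999 row are all assumptions made for illustration.

```shell
#!/bin/sh
# Small sample inputs in a temporary directory (illustrative paths only).
demo_dir=$(mktemp -d)

cat > "$demo_dir/OLD_FILE" <<'EOF'
COL1 COL2 COL3 COL4 COL5
rs10 7 92221824 C A
rs999 1 12345 T C
EOF
# rs999 is a hypothetical row with no entry in the conversion table.

cat > "$demo_dir/TABLE" <<'EOF'
COL1_b36 COL2_b37
rs10 7_92383888
EOF

awk '
    FNR==NR { if (FNR > 1) a[$1]=$2; next }   # load TABLE, skipping its header
    FNR==1  { print; next }                   # pass the OLD_FILE header through
    $1 in a { $1=a[$1]; print; next }         # replace and print matched rows
    { missing++; print }                      # count and print unmatched rows
    END { printf "%d row(s) had no match\n", missing > "/dev/stderr" }
' "$demo_dir/TABLE" "$demo_dir/OLD_FILE" > "$demo_dir/NEW_FILE"

cat "$demo_dir/NEW_FILE"
```

The count goes to standard error so that it does not end up inside NEW_FILE when the output is redirected.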
    