The Archive Base

Editorial Team
Asked: June 11, 2026 (2026-06-11T22:19:15+00:00)


I am trying to convert one of my Perl scripts to R. I have a data frame in R that looks like this (ignore the column names):

CHR     START          END      TYPE
chr1    945493         945593   normal
chr1    945593        947374    normal
chr1    947374        947474    normal
chr1    947474        947574    gain
chr1    947574        947674    gain
chr1    947674        960364    gain
chr1    960364        960464    normal
chr22   17290491    17290591    normal
chr22   17290591    17290691    normal
chr22   17290691    17290791    gain
chr22   17290791    17292513    gain
chr22   17292513    17292613    gain
chr22   17292613    17292713    gain
chr22   17292713    17293046    gain
chr22   17293346    17298475    gain
chr22   17298475    17298575    gain
chr22   17298575    17298675    normal
chr22   17298675    17303632    normal
chr22   17303632    17303732    loss
chr22   17303732    17303832    normal
chrX    154162621   154181221   normal
chrX    154181221   154181321   normal
chrX    154181321   154181421   loss
chrX    154181421   154181521   loss
chrX    154181521   154181621   loss
chrX    154181621   154181721   loss
chrX    154181721   154216867   loss
chrX    154216867   154216967   normal
chrX    154216967   154217067   normal
chrX    154217067   154217167   normal

If at least 5 consecutive rows have the same value in the “CHR” column and in the “TYPE” column, I want to combine those rows into one row whose START is the START of the first row and whose END is the END of the last row. In the end, only rows whose TYPE is “gain” or “loss” should be returned. So the desired output is:

chr22   17290691        17298575        gain
chrX    154181321       154216867       loss

What I am doing right now is:

  1. Saving the data frame with “write.table”.
  2. Running this Perl script on the saved file:

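      use strict;
      use warnings;

      open my $first, "<", $ARGV[0] or die "Unable to open input file: $!";
      my $count = 1;
      # Read the first row; it becomes the start of the current run.
      $_ = <$first>;
      chomp;
      my ($p_key, $p_col1, $p_col2, $p_cnv) = split;

      while (<$first>) {
          chomp;
          my ($key, $col1, $col2, $cnv) = split;
          if ($key eq $p_key and $cnv eq $p_cnv) {
              # Same chromosome and type: extend the current run.
              $p_col2 = $col2;
              $count++;
          } else {
              # The run ended: print it if it has at least 5 rows and is a gain or loss.
              print join("\t", $p_key, $p_col1, $p_col2, $p_cnv), "\n"
                  if $count > 4 and ($p_cnv eq "gain" or $p_cnv eq "loss");
              ($p_key, $p_col1, $p_col2, $p_cnv) = ($key, $col1, $col2, $cnv);
              $count = 1;
          }
      }
      # Print the final run too; the loop above never reaches it.
      print join("\t", $p_key, $p_col1, $p_col2, $p_cnv), "\n"
          if $count > 4 and ($p_cnv eq "gain" or $p_cnv eq "loss");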
    

Saving the data frame first and then running a Perl script feels like an unnecessary extra step. Could anyone please suggest an easier way to do this in R, using any package or any other trick?


1 Answer

  1. Editorial Team
    Added an answer on June 11, 2026 at 10:19 pm (2026-06-11T22:19:16+00:00)

    I was concerned that you would want to interrupt the sequences (i.e. treat them as distinct) if there were intervening alternate values of TYPE within one chromosome. You didn’t state that explicitly, but I think the biology warrants that additional requirement, hence the need to create another variable. We will assume the data frame is named cdat, in the absence of advice to the contrary. This looks within consecutive runs of TYPE, applies the test, and binds the CHR and START of the first row to the END and TYPE of the last row.

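    # Label each run of identical consecutive TYPE values with an increasing id.
    cdat$conseq <- cumsum(c(1, cdat$TYPE[-1] != cdat$TYPE[-length(cdat$TYPE)]))
    do.call(rbind,
        by(cdat, list(cdat$CHR, cdat$conseq),
            function(df) {
                # Keep runs of at least 5 rows whose TYPE is "gain" or "loss".
                if (NROW(df) >= 5 && df$TYPE[1] %in% c("gain", "loss")) {
                    cbind(df[1, c("CHR", "START")], df[NROW(df), c("END", "TYPE")])
                } else {
                    NULL
                }
            }))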
         CHR     START       END TYPE
    10 chr22  17290691  17298575 gain
    23  chrX 154181321 154216867 loss
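
As an alternative to the by() call above, the same merge could probably be written with base R's rle(). This is only a sketch under assumptions: merge_runs, min_rows and keep are hypothetical names that do not appear in the question, and the small cdat built here is toy data rather than the asker's table. A run is taken to break whenever CHR or TYPE changes.

```r
# Toy data (an assumption for illustration): 5 "gain" rows on chr1, 5 "loss" rows on chr2.
cdat <- data.frame(
    CHR   = c(rep("chr1", 7), rep("chr2", 6)),
    START = seq(100, by = 100, length.out = 13),
    END   = seq(200, by = 100, length.out = 13),
    TYPE  = c("normal", rep("gain", 5), "normal", rep("loss", 5), "normal"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: collapse runs of rows sharing CHR and TYPE.
merge_runs <- function(cdat, min_rows = 5, keep = c("gain", "loss")) {
    key   <- paste(cdat$CHR, cdat$TYPE)   # changes whenever CHR or TYPE changes
    runs  <- rle(key)                     # lengths of consecutive identical keys
    last  <- cumsum(runs$lengths)         # row index of the last row of each run
    first <- last - runs$lengths + 1      # row index of the first row of each run
    ok    <- runs$lengths >= min_rows & cdat$TYPE[first] %in% keep
    data.frame(CHR   = cdat$CHR[first[ok]],
               START = cdat$START[first[ok]],
               END   = cdat$END[last[ok]],
               TYPE  = cdat$TYPE[first[ok]],
               stringsAsFactors = FALSE)
}

res <- merge_runs(cdat)
print(res)
```

With the toy data this should return two rows, one for the chr1 gain run (START 200, END 700) and one for the chr2 loss run (START 800, END 1300). Because the key combines CHR and TYPE, no separate grouping by chromosome is needed.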
    

    The conseq vector is built by comparing each TYPE value to the one before it and cumsum()-ing the appearances of a new value along its full length. Since the two shifted vectors being compared are one element shorter than the data frame, the 1 is added as a placeholder at the beginning so that the result lines up with the data frame.
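
To make the construction of the run id concrete, here is a small worked example on a toy vector. The names type and changed are made up for illustration and are not from the answer's code.

```r
# Toy TYPE values (an assumption for illustration).
type <- c("normal", "normal", "gain", "gain", "gain", "normal", "loss", "loss")

# TRUE wherever a value differs from the one before it (one element shorter than type).
changed <- type[-1] != type[-length(type)]

# The leading 1 labels the first run; each TRUE then starts a new run id.
conseq <- cumsum(c(1, changed))
print(conseq)
# [1] 1 1 2 2 2 3 4 4
```

Rows that share a conseq value belong to the same run, so table(conseq) gives the length of each run.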
