I’m not very good at this; perhaps you can help me?
I’ve successfully loaded a table into a hash of arrays: for each key, I push the matching values onto an array stored under that key. (I realize my terminology might be incorrect; please feel free to correct it. You can see what I mean in my code below.)
Anyhow, I’m having trouble printing the output the way I want it. My objective is to make a separate file for each key, in FASTA format (see below). Any ideas?
Input:
A tab-separated table:
Rank Query Name E-Value Frame Description Accession (to NCBI) Bits Fraction Identical (%) Fraction Conserved (%) HSP Length Query Length Hit Length Coverage Query (%) Coverage Hit (%) Query Start Query End Hit Start Hit End Query String
1 50085564 4e-16 0 rank=0087540 x=1133.0 y=3620.5 length=437 GXEIR0201C1TW2 76.1 60.3174603174603 77.7777777777778 63 149 437 42.2818791946309 14.41647597254 87 149 186 372 YDKANAFLNHGNYLAYGLAATTLWVLGIPHGFAVMHGKTRRGALVFDVADLVKDALVLPWAFI
2 50085564 7e-16 0 rank=0408491 x=1798.0 y=287.0 length=296 GX6ON9A01EN42P 74.8 62.7118644067797 79.6610169491525 59 149 296 39.5973154362416 19.9324324324324 91 149 51 225 NGFLNHGNYLAYGLAATTLWVLGIPHGFAVMHGKTRRGALVFDVADLVKDALVLPWAFI
3 50085564 2e-15 0 rank=0281898 x=768.0 y=1387.0 length=283 GX6ON9A01B5QL5 72.9 63.1578947368421 80.7017543859649 57 149 283 38.255033557047 20.1413427561837 93 149 51 219 FLNHGNYLAYGLAATTLWVLGIPHGFAVMHGKTRRGALVFDVADLVKDALVLPWAFI
4 50085564 3e-15 0 rank=0714663 x=648.0 y=2458.0 length=264 GXEIR0201BU76K 72.3 59.6491228070175 80.7017543859649 57 149 264 38.255033557047 21.5909090909091 93 149 79 247 FLNHGNYLAYGLAATTLWVLGIPHGFAVMHGKTRRGALIFDVADLVKDALILPWAFI
5 50085564 3e-14 0 rank=0643198 x=1035.0 y=163.0 length=398 GXEIR0201CS5IT 69.8 61.4035087719298 78.9473684210526 57 149 398 38.255033557047 14.321608040201 93 149 147 315 FLNHGNYLAYGLAATTLWVLGIPHGFAVMXGKTRRGALVFDVADLVKDALVLPWAFI
6 50085564 4e-09 0 rank=0641162 x=178.0 y=3351.0 length=287 GXEIR0201APZFT 52.3 54.7619047619048 76.1904761904762 42 149 287 28.1879194630872 14.6341463414634 1 42 11 134 PIANTTVILLGNGTSITQAAVRMLAQAGVLIGFCGGGGTPLY
7 50085564 4e-09 0 rank=0189408 x=1683.0 y=2055.0 length=418 GXEIR0201ED2ZD 52.8 45.3333333333333 68 75 149 418 50.3355704697987 17.9425837320574 1 75 64 340 PIANTTVILLGNGTSITQAAVRMLAQAGVLIGFCGGGGTPLYMGNAIEWLTPQSEYRPTEYLQGWLGFWFDDEQRLLTAKAMQHSRIDFLQKV
8 50085564 5e-07 0 rank=0324549 x=1541.5 y=2792.5 length=281 GX6ON9A01D1MRE 45.2 75.8620689655172 89.6551724137931 29 149 281 19.4630872483221 10.3202846975089 121 149 197 281 MXGKTRRGALVFDVADLVKDALVLPWAFI
9 50085564 6e-05 0 rank=0560234 x=126.0 y=2770.0 length=351 GXEIR0201ALEM8 38.7 42.6966292134831 59.5505617977528 89 149 351 59.7315436241611 25.3561253561254 30 124 57 345 LAGFDGDGLIPALDS---SRANID---RAMKTGDLLTSEAQLTKLLYKFAARSTT*KAL/YREHDATDKANGFLNHGNYLAYGLAATTLSG\LGIPHGFAVMHGK
The output I want, in separate files (each accession with its values in a separate file; the example below is for one such file):
>GXEIR0201C1TW2
YDKANAFLNHGNYLAYGLAATTLWVLGIPHGFAVMHGKT\RRGALVFDVADLVKDALVLPWAFI
>GXEIR0201C1TW2
NAFLNHGNYLAYGLAATTLWVLGIPHGF/AVMHGKTRRGALVFDVADLVKDALVLPWAF
>GXEIR0201C1TW2
YDKANAFLNHGNYLAYGLAATTLWVLGIPHGFA*MHGKTRRGALVFDVADLVKDALVLPWAFI
(The same for each following accession, which I call $key in my script, each in a separate file.)
My script so far:
#!/usr/bin/perl
use strict;
use warnings;
my $infile = $ARGV[0] or die "Usage: $0 <table file>\n";
#### The input file is a table of reads; it is loaded into a hash keyed by accession so that each accession appears only once.
open(LIST, '<', $infile) or die "Cannot open $infile: $!\n";
my %value=(); #declare the hash
my ($rank, $query, $evalue, $frame, $description, $key, $bits);
my ($fr_ident, $fr_cons, $hsp_leng, $query_leng, $hit_leng, $query_cov, $hit_cov);
my ($query_start, $query_end, $hit_start, $hit_end, $value);
while (<LIST>)
{
chomp; # remove the trailing newline before splitting
($rank,$query,$evalue,$frame,$description,$key,$bits,$fr_ident,$fr_cons,$hsp_leng,$query_leng,$hit_leng,$query_cov,$hit_cov,$query_start,$query_end,$hit_start,$hit_end,$value) = split(/\t/); # split the line on tabs (\t), in header order; not all fields are needed, but they are handy for future reference
next if !defined $value or $rank eq 'Rank'; # skip incomplete lines and the header row
push (@{$value{$key}},$value); # append this sequence to the array stored under its accession
}
foreach $key (sort keys %value)
{
#print "KEY: $key , VALUE: $value , ELEMENT? \$value\{key\} : $value{$key}\n";
print "$key\n@{$value{$key}}\n"; #print all values of each key
#open (OUT, ">".$infile.$key"\_primary.fasta");
}
This gives me a nice output of each key and all its values. I want to use the key as a new file name (see the commented-out last line of the loop) and have the contents of each file look like the output shown above.
This is after 48 hours of trying. I’m really bad at this; I tried reading online material, but I’m not following it very well.
Thanks in advance.
If you interpolate an array into a double-quoted string, Perl joins its elements using the special variable $" as the separator (a single space by default). So you’ll need to set that variable.
Note that print() takes the filehandle to print to as its first argument, with no comma after it.
An equivalent solution is to use the join() function.
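Both approaches can be sketched roughly as follows, assuming %value is a hash of arrays as in the question. The sample sequences, the helper names fasta_interpolated and fasta_joined, the scratch output directory, and the _primary.fasta suffix (borrowed from the commented-out open line in the question) are illustrative assumptions, not part of the original script:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: build the FASTA text by interpolating the array.
sub fasta_interpolated {
    my ($key, @seqs) = @_;
    local $" = "\n>$key\n";    # placed between elements of an interpolated array
    return ">$key\n@seqs\n";
}

# Hypothetical helper: the same result using join().
sub fasta_joined {
    my ($key, @seqs) = @_;
    return ">$key\n" . join("\n>$key\n", @seqs) . "\n";
}

# Made-up sample data shaped like %value in the question.
my %value = (
    GXEIR0201C1TW2 => [ 'YDKANAFLNHG', 'NAFLNHGNYLA' ],
    GX6ON9A01EN42P => [ 'NGFLNHGNYLAYG' ],
);

# Assumption: write into a scratch directory that is removed on exit;
# point $outdir at a real directory to keep the files.
my $outdir = tempdir(CLEANUP => 1);

foreach my $key (sort keys %value) {
    my $outfile = File::Spec->catfile($outdir, "${key}_primary.fasta");
    open(my $out, '>', $outfile) or die "Cannot write $outfile: $!\n";
    # The filehandle comes first, with no comma after it.
    print {$out} fasta_interpolated($key, @{$value{$key}});
    close($out) or die "Cannot close $outfile: $!\n";
}
```

Using local keeps the change to $" confined to the helper, so other interpolated arrays elsewhere in the script still get the default single space. Swapping fasta_interpolated for fasta_joined in the loop should produce the same files.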
print() can also take a list of items. In this case you need to set the special variable $, to the value you want as the separator between your list values. Otherwise, the values in the list will just be printed with no separator, since $, is undefined by default.
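The list form might look like the sketch below. The helper name print_fasta and the sample values are hypothetical; the only thing taken from the discussion above is that $, is inserted between the items print() receives:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: print one FASTA record per sequence to the given
# filehandle, relying on $, to separate the list items.
sub print_fasta {
    my ($fh, $key, @seqs) = @_;
    local $, = "\n";    # placed between the items of print's list
    print {$fh} map { (">$key", $_) } @seqs;
    print {$fh} "\n";   # a single item, so no separator is added here
}

my @seqs = ('AAA', 'CCC');    # made-up sample values

# $, is undefined by default, so this prints AAACCC followed by a newline.
print @seqs, "\n";

# With $, set inside the helper, each header and sequence gets its own line.
print_fasta(\*STDOUT, 'GXEIR0201C1TW2', @seqs);
```

Passing the filehandle in as a parameter means the same helper can write to STDOUT while testing and to a per-accession file later.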