I’ve started using a Linux machine, and I’m trying to do things that should be simple but are very hard for me.
I need to select specific lines that match those listed in a second file.
In practice, I have a first file that looks like this:
>aba19 EN1 enl.or11 http://mar2043 annotation not avaliable
MASESEMGVVASJDHAGISFGVDDASDASDAFGDFGHWFACFQLIGIFLAYCLSRAITNN
QSDHKAJSDHKASJHKJAHKHKJSDGHYEIV
>clat38 EN2 enl.o http://mar20s/Gene/Summary?5 annotation not avaliable
MNCEDCHILNAEAFKSKKDASDADICKSLKICGLVFGILALTLIVLFWGSKHFWPEVPKK
AYDMEHTFYSNGERGYCCASDSDDIYCSDRRGNRYCRRVCEPLLGYYPYPYCYQGGRVIC
RVIMPCDASDASDAOPWEIPQWFHNDJBVHAOISDOUIAODGNWWVARMLGRV
>coll9 EN4 ens4 http://mar2010.arch/Genary?g=E9 annotation not avaliable
MASKALDHLFKLJLÒFJASDJKLASDLAFJLFJFJLFJLAJFLKJFLAKFJFJLAFJLAL
ASDLASKDJASLKDJASLKJFALSKDJALKDJSKLDJLSDKJASLDKJSLDKSDLAKJKS
SILDUAISDALSDJALKDJASDLFATT
>hihi9 EN9 ens44 http://mar2010.ariens/Geary?g=EN7 annotation not avaliable
MGSLDLAÈPWOEMWBZMKSJDHAJKSDHAKSDHSDHSDHOASDAKSJDHKASJDHAAKHL
KTLSDKLHRFSDFHPHFGCJLJLJRKKFLDSFCGTVGEFAGGGDTHNNVCLSSVFVSEDG
HSDFSDWFKLGGMETVCSDFKVSQATPEFSSSDLFFDSRIQSIRDPASIPPEEMSPEFTT
LPECHGHARDAFSFGTLVESLLTILNEQVSADVLSSFQQTLHSTLLNPIPKCRPALCTLL
SDFLSDJFKLSDFLSKDFJM
I also have a second file with the list of patterns that I need to “extract” from the first file. The second file looks like this:
>clat38
>coll9
In practice, I would like to get output like this:
>clat38 EN2 enl.o http://mar20s/Gene/Summary?5 annotation not avaliable
MNCEDCHILNAEAFKSKKDASDADICKSLKICGLVFGILALTLIVLFWGSKHFWPEVPKK
AYDMEHTFYSNGERGYCCASDSDDIYCSDRRGNRYCRRVCEPLLGYYPYPYCYQGGRVIC
RVIMPCDASDASDAOPWEIPQWFHNDJBVHAOISDOUIAODGNWWVARMLGRV
>coll9 EN4 ens4 http://mar2010.arch/Genary?g=E9 annotation not avaliable
MASKALDHLFKLJLÒFJASDJKLASDLAFJLFJFJLFJLAJFLKJFLAKFJFJLAFJLAL
ASDLASKDJASLKDJASLKJFALSKDJALKDJSKLDJLSDKJASLDKJSLDKSDLAKJKS
SILDUAISDALSDJALKDJASDLFATT
I tried grep -f file_2 file_1 > output, but I get only this:
>clat38
>coll9
Is there something more specific I can add to grep?
Thank you for any advice!
Gab
To search for a regular expression on a single line, use grep. Learn it from the man page and a couple of examples.
To substitute a string for a regular expression on a single line, use sed. Learn it from the man page and a couple of examples.
For all other text processing applications, use awk. Learn it from the book “Effective Awk Programming, Third Edition” by Arnold Robbins, http://www.oreilly.com/catalog/awkprog3/.
If you want to print out more than 2 lines when you find the key you want, just change the value of c to 3 or 20 or whatever.
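As a minimal sketch of the kind of counter this refers to, assuming each key in the second file equals the first whitespace-separated field of a line in the first file: the variable c is set when a key is found and counts down as lines are printed. The file names and sample records here are made up for illustration.

```shell
# Hypothetical sample files (names and contents are illustrative only).
tmp=$(mktemp -d)
printf '%s\n' '>id1 desc' 'AAAA' 'CCCC' '>id2 desc' 'GGGG' 'TTTT' > "$tmp/file_1"
printf '%s\n' '>id2' > "$tmp/file_2"

# First file on the command line (NR==FNR): remember every key in file_2.
# Second file: when the first field is a wanted key, set the counter c.
# "c && c--" is true while c is non-zero, so the matching line and the
# c-1 lines after it are printed.
out=$(awk 'NR==FNR { keys[$1]; next }
           $1 in keys { c = 2 }
           c && c--' "$tmp/file_2" "$tmp/file_1")
printf '%s\n' "$out"
rm -rf "$tmp"
```

With c = 2 this prints the matching header and the one line after it; a fixed count only fits records that all have the same number of lines.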
Given your comment below and your updated sample input, this should do what you want:
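One way to handle records of varying length, sketched under the assumption that every record starts with a line beginning with > and that the second file holds one such identifier per line: set a flag on each header line and print lines while the flag is on. The shortened sample data and the variable names want and keep are illustrative, not taken from the original post.

```shell
# Hypothetical, shortened sample data in the same shape as the question's files.
tmp=$(mktemp -d)
printf '%s\n' '>aba19 EN1' 'MASE' 'QSDH' \
              '>clat38 EN2' 'MNCE' 'AYDM' 'RVIM' \
              '>coll9 EN4' 'MASK' \
              '>hihi9 EN9' 'MGSL' > "$tmp/file_1"
printf '%s\n' '>clat38' '>coll9' > "$tmp/file_2"

# While reading file_2 (NR==FNR), store each identifier as an array key.
# On every header line of file_1, set keep to 1 if its first field is wanted,
# otherwise 0. The bare pattern "keep" prints each line while keep is non-zero,
# so a header and all of its sequence lines are printed together.
out=$(awk 'NR==FNR { want[$1]; next }
           /^>/ { keep = ($1 in want) }
           keep' "$tmp/file_2" "$tmp/file_1")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Because the flag only changes on header lines, the number of sequence lines per record does not matter. Plain grep -f cannot do this on its own, since it only selects the lines that themselves match a pattern.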