I’m attempting to extract the non-numeric values from a matrix like this:
32540_at 0.138306 78047_s_at 0.133885 81737_at 0.163546 81811_at 0.181725 AAGAB 0.157073 AARSD1 0.114351
(The file contains rows of different lengths, but each name is always followed by a number.)
Specifically, the output I need is the following:
32540_at 78047_s_at 81737_at 81811_at AAGAB AARSD1
Since extracting the alphanumeric names directly is too difficult for me (I’m inexperienced in Unix programming), given the structure of names like 81737_at, I’m trying instead to separate the non-numeric fields from the numeric ones.
That is, once the numeric fields are removed, only the non-numeric fields will remain.
How can this be done?
Best,
Eleonora
With sensible RS and ORS settings, this is fairly straightforward with awk.
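A minimal sketch of such a command, pieced together from the settings explained below. The file name input.txt is only an assumption for illustration, and a regular expression in RS needs an awk that supports it (gawk and mawk do; POSIX leaves a multi-character RS unspecified):

```shell
# Sample row from the question, written to a hypothetical file input.txt
printf '%s\n' '32540_at 0.138306 78047_s_at 0.133885 81737_at 0.163546 81811_at 0.181725 AAGAB 0.157073 AARSD1 0.114351' > input.txt

# Each space-separated token becomes its own record;
# records made up only of digits and dots are skipped.
awk '!/^[0-9.]+$/' RS=' +|\n' ORS=' ' input.txt
```

For the sample row this should print the six names on one line. Because ORS is a space, the output ends with a trailing space rather than a newline, and names from all rows of a multi-row file end up on that single line.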
Explanation
RS=' +|\n': Separates records on one or more spaces or on a newline.
ORS=' ': Inserts a space after each record printed.
!/^[0-9.]+$/: Prints the record unless it consists only of digits and dots. A more correct number pattern (not considering scientific notation) would be: !/^([0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+\.?)$/
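To see where the stricter pattern makes a difference, here is a small sketch on made-up input (the file name strict.txt and its tokens are assumptions, not data from the question). Tokens such as 1.2.3 or a lone dot would be dropped by the simple pattern, but are kept here because they are not well-formed numbers:

```shell
# Hypothetical test row mixing names, real numbers, and number-like tokens
printf '%s\n' 'geneA 0.5 1.2.3 .75 geneB 42 . 7.' > strict.txt

# Skip only records that are well-formed decimal numbers:
# digits with an optional fraction, or a fraction with no leading digits.
awk '!/^([0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+\.?)$/' RS=' +|\n' ORS=' ' strict.txt
```

Here 0.5, .75, 42 and 7. are treated as numbers and removed, while geneA, 1.2.3, geneB and the lone dot are printed.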