Select lines where at least x columns are above a threshold value
I have a file with 20 columns. I'd like to retain only the lines in which at least x of the columns 6-20 have a value at or above a threshold.
For example, I'd like to retain only the lines in the file below that have at least 8 columns (again, looking only at columns 6-20) with a value of at least 0.75. I would like to be able to modify the code easily, so that I can play around with both the minimum number of columns (8 in this case) and the threshold (0.75).
File:
Output:
I'm a novice, and all I have so far is an awk command that applies a threshold to one column, piped into another awk command that screens another column. This is obviously inelegant, and it cannot allow some columns to remain below the threshold.
Moderator's Comments:
code tags also for data files
Last edited by Scrutinizer; 03-14-2013 at 04:01 PM.
Reason: additional code tags
In your sample code, you don't have identical thresholds for the columns, but in your spec, you do. I'll assume the latter, as it's easier for a start.
For playing around, it might be best to have all parameters as variables:
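A minimal sketch of such a command, assuming whitespace-separated fields and the same threshold for every column. The wrapper name filter_cols and the variables THR, FIRST and LAST are illustrative names, not taken from the thread (only MIN is mentioned in the reply). It treats "at least 0.75" as >=, following the example in the question.

```shell
# Hypothetical wrapper: filter_cols THRESHOLD MIN_COLUMNS [file ...]
# Prints the lines in which at least MIN_COLUMNS of columns 6-20
# hold a value >= THRESHOLD. Reads standard input if no file is given.
filter_cols() {
    thr=$1; min=$2; shift 2
    awk -v THR="$thr" -v MIN="$min" -v FIRST=6 -v LAST=20 '
    {
        cnt = 0
        # Count the columns in FIRST..LAST that reach the threshold.
        # "$i + 0" forces a numeric comparison; non-numeric fields count as 0.
        for (i = FIRST; i <= LAST; i++)
            if ($i + 0 >= THR) cnt++
    }
    cnt >= MIN      # pattern without action: print the line when true
    ' "$@"
}
```

Called as filter_cols 0.75 8 file, both parameters can be changed on the command line without touching the awk program.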
or, shamelessly stealing Don Cragun's ideas, this should do as well:
If you want exactly MIN columns to exceed the threshold, remove the && cnt in the for (...).
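The countdown variant described above can be sketched as follows. This is a guess at its shape, inferred only from the remark about && cnt, and it assumes whitespace-separated fields with columns 6-20 hard-coded. The function names and THR are illustrative. The first function stops scanning a line as soon as MIN columns have qualified; the second drops the && cnt test, so the counter goes negative when more than MIN columns qualify and only lines with exactly MIN matches are printed.

```shell
# Hypothetical wrapper: at least MIN columns (6-20) with a value >= THR.
filter_cols_countdown() {
    thr=$1; min=$2; shift 2
    awk -v THR="$thr" -v MIN="$min" '
    {
        cnt = MIN + 0                      # columns still needed
        # "&& cnt" ends the loop early once cnt has reached 0.
        for (i = 6; i <= 20 && cnt; i++)
            if ($i + 0 >= THR) cnt--
    }
    !cnt                                   # print when nothing is still needed
    ' "$@"
}

# Hypothetical wrapper: exactly MIN columns (6-20) with a value >= THR.
filter_cols_exact() {
    thr=$1; min=$2; shift 2
    awk -v THR="$thr" -v MIN="$min" '
    {
        cnt = MIN + 0
        # No early exit: every extra match pushes cnt below 0.
        for (i = 6; i <= 20; i++)
            if ($i + 0 >= THR) cnt--
    }
    !cnt                                   # true only when cnt is exactly 0
    ' "$@"
}
```

On long lines the early exit saves a little work, at the cost of the counter no longer telling you how many columns actually qualified.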