Select lines where at least x columns above threshold value
Post 302780579 by RudiC on Thursday 14th of March 2013 05:41:10 PM
In your sample code, you don't have identical thresholds for the columns, but in your spec, you do. I'll assume the latter, as it's easier for a start.
For playing around, it might be best to have all parameters as variables:
Code:
$ awk '{cnt=0; for (i=FST; i<=LST; i++) cnt+=($i>THR)} cnt>=MIN' FST=6 LST=20 THR=0.75 MIN=8 file
s_20477    162    1    1.000    6.0    0.20987654    0.79423868    0.81481481 etc . . .
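
To watch the counting logic on something small, here is a sketch with a made-up five-column file. The file name sample.txt and its values are invented for illustration and are not the poster's data. Columns 2 to 5 are checked, and a line is kept when at least 2 of them exceed 0.75:

```shell
# Hypothetical test data: an ID column followed by four numeric columns.
cat > sample.txt <<'EOF'
a 0.10 0.80 0.90 0.20
b 0.10 0.20 0.90 0.30
c 0.80 0.90 0.95 0.99
EOF

# Same logic as the one-liner, spread out for readability.
select_at_least() {
    awk '{
        cnt = 0                      # columns above THR on this line
        for (i = FST; i <= LST; i++)
            cnt += ($i > THR)        # the comparison yields 1 or 0
    }
    cnt >= MIN                       # pattern without action: print the line
    ' FST=2 LST=5 THR=0.75 MIN=2 "$1"
}

select_at_least sample.txt
```

Lines a and c should come out: a has two values above 0.75, c has four, and b has only one.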

or, shamelessly stealing Don Cragun's ideas, this should do as well:
Code:
$ awk '{cnt=MIN; for (i=FST; i<=LST && cnt; i++) cnt-=($i>THR)} !cnt' FST=6 LST=20 THR=0.75 MIN=8 file
s_20477    162    1    1.000    6.0    0.20987654    0.79423868    0.81481481    0.78395062 etc . . .

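If you want exactly MIN columns to exceed the threshold, remove the && cnt from the for (...) condition. Without the early exit, every further hit pushes cnt below zero, so !cnt is true only when exactly MIN columns matched.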
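
A quick way to check the difference between the two variants, again on invented data (exact.txt and its values are assumptions for illustration). With MIN=2, the countdown loop with && cnt accepts any line with two or more hits, while the loop without it accepts only lines with exactly two:

```shell
# Hypothetical data: one, two and four values above 0.75 in columns 2-5.
cat > exact.txt <<'EOF'
one  0.10 0.20 0.90 0.30
two  0.10 0.80 0.90 0.20
four 0.80 0.90 0.95 0.99
EOF

# At least MIN hits: stop counting down once cnt reaches 0.
at_least=$(awk '{cnt=MIN; for (i=FST; i<=LST && cnt; i++) cnt-=($i>THR)} !cnt' \
    FST=2 LST=5 THR=0.75 MIN=2 exact.txt)

# Exactly MIN hits: no early exit, so extra hits push cnt below 0.
exactly=$(awk '{cnt=MIN; for (i=FST; i<=LST; i++) cnt-=($i>THR)} !cnt' \
    FST=2 LST=5 THR=0.75 MIN=2 exact.txt)

printf 'at least:\n%s\n' "$at_least"
printf 'exactly:\n%s\n' "$exactly"
```

The first command should list the lines "two" and "four", the second only "two".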

Last edited by RudiC; 03-14-2013 at 06:47 PM.
