Select lines where at least x columns above threshold value
Post 302780483 by Don Cragun on Thursday 14th of March 2013 02:39:20 PM
You could try something like:
Code:
#!/bin/ksh
# SYNOPSIS:
# colcheck [file [first_column [last_column [threshold [pass_count]]]]]
# DESCRIPTION:
# Print all lines in the file named by "file" (default file is input) in which
# at least "pass_count" (default value 8) values in columns "first_column"
# (default value 6) through "last_column" (default value 20) are greater than or
# equal to "threshold" (default value 0.75).
file=${1:-input}
fc=${2:-6}
lc=${3:-20}
threshold=${4:-0.75}
pass_count=${5:-8}
awk -v f="$fc" -v l="$lc" -v t="$threshold" -v p="$pass_count" '
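{       c = p   # number of passing values still needed on this line
        # Scan columns f through l, stopping early once c reaches 0.
        for(i = f; i <= l && c; i++) if($i >= t) c--
        # Print the line only if enough values met the threshold.
        if(c == 0) print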
}' "$file"
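
One caveat with the comparison above: awk compares a field numerically only when the field looks like a number, so a header row of text could be counted as passing through string comparison. The following is a minimal sketch of a variant that forces numeric comparison by adding 0 to both sides. The function name numeric_colcheck and the sample data are made up for illustration, and the function reads standard input rather than a named file.

```shell
# Variant of the awk filter that forces numeric comparison with "+ 0".
# numeric_colcheck is a hypothetical name; it reads from standard input.
# Arguments: first_column last_column threshold pass_count
numeric_colcheck() {
    awk -v f="$1" -v l="$2" -v t="$3" -v p="$4" '
    {   c = p + 0
        for (i = f; i <= l && c > 0; i++)
            if (($i + 0) >= (t + 0)) c--   # text such as "s1" becomes 0
        if (c == 0) print
    }'
}

# Made-up sample: a text header followed by one data row.
printf '%s\n' 'id s1 s2 s3' 'x 0.9 0.8 0.1' | numeric_colcheck 2 4 0.75 2
```

With this sample only the data row should be printed. Note that text fields count as 0 here, so they would still pass a threshold of 0 or less.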

If you are using a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk.
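
The reason is that the historical /usr/bin/awk on Solaris does not accept the -v option. A script could select a suitable awk at run time, roughly as sketched here. The helper name pick_awk and the AWK variable are assumptions for illustration, not part of the script above.

```shell
# Pick a POSIX-capable awk: prefer /usr/xpg4/bin/awk, then nawk, else plain awk.
# pick_awk is a hypothetical helper name.
pick_awk() {
    if [ -x /usr/xpg4/bin/awk ]; then
        echo /usr/xpg4/bin/awk
    elif command -v nawk >/dev/null 2>&1; then
        echo nawk
    else
        echo awk
    fi
}

AWK=$(pick_awk)
# Quick check that the chosen awk accepts -v assignments.
"$AWK" -v t=0.75 'BEGIN { if (0.8 >= t) print "ok" }'
```

The script could then call "$AWK" wherever it currently calls awk.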

I use the Korn shell, but this should also work with any other shell that accepts Bourne shell syntax (such as bash).
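
The only shell feature the script relies on is the ${N:-default} parameter expansion, which is standard in POSIX shells. A small sketch of how the defaults behave, using a hypothetical function show_defaults in place of the script:

```shell
# Demo of the ${N:-default} pattern used for the script arguments.
# show_defaults is a hypothetical stand-in for the script itself.
show_defaults() {
    fc=${1:-6}
    lc=${2:-20}
    threshold=${3:-0.75}
    pass_count=${4:-8}
    echo "columns $fc-$lc, threshold $threshold, need $pass_count"
}

show_defaults            # no arguments: every default applies
show_defaults 2 5 0.5    # first three overridden, pass_count stays 8
```

Because the arguments are positional, changing pass_count alone still requires supplying the earlier arguments.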

Last edited by Don Cragun; 03-14-2013 at 03:41 PM. Reason: Fix typo in a comment.

getcol(1)						      General Commands Manual							 getcol(1)

Name
       getcol - Extract specified columns from an ASCII table file

Synopsis
       getcol [-amv][-n num][-r lines][-s num] filename [column number range]

Description
       Extract specified columns from an ASCII table file

Options
       filename
	      Name of an ASCII table file.  At least one of these must be present for any values to be printed.  If it is stdin or STDIN, an ASCII
	      table is expected as standard input.  If there is no input file, standard input is assumed.

       @filename
	      Name of a file containing a list of ASCII table files.  If this is present, any other file names on the command line will be
	      ignored.

       field range
	      Print value of these columns for the number of lines of the table specified by the -n argument after skipping the number of
	      lines specified by the -s argument.  A value of 0 causes the entire input line to be printed.

       -a     Sum all numeric columns selected, printing the sum on the line following the result.  Columns with no sum are filled with ___.
	      (Added in version 2.6.9)

       -b     Input is a bar-separated table file

       -c     Add count of number of lines in each column at end

       -d <number>
	      Number of decimal places in f.p. output

       -e     Compute medians of selected columns

       -f     Print range of values in selected columns

       -h     Print Starbase tab table header

       -i     Input is a tab-separated table file

       -k     Print number of columns on first line

       -l <number>
	      Number of lines to add to each line

       -m     Compute the means of all numeric columns selected, printing the mean on the line following the result (or the line following the sum
	      if -a is used).  Columns with no mean are filled with ___.  (Added in version 2.6.9)

       -n num Print selected columns for this many lines.  If not specified, all lines will be read after the number of lines specified by -s have
	      been skipped.

       -o     OR conditions instead of ANDing them

       -p     Print only sum, mean, sigma, median, or range, not entries

       -r @listfile
       -r line range
	      Print columns from the lines specified as either the first nonzero number on each line of the file listfile or the
	      comma- and hyphen-delimited range; i.e. 1-5,10-12 will print values from lines 1, 2, 3, 4, 5, 10, 11, and 12.  (Added in version
	      2.6.12)

       -s num Skip this many lines before starting to print values.  If not specified, no lines will be skipped.

       -t     Starbase (tab-separated) table output

       -v     Print more information about process.

       Web Page
	      http://tdc-www.harvard.edu/software/wcstools/getcol.html
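
The line range syntax described for -r (comma- and hyphen-delimited, such as 1-5,10-12) can be illustrated with a short awk sketch. This is not getcol's implementation and does not call getcol; select_lines is a hypothetical helper that only shows how such a range specification maps to line numbers.

```shell
# select_lines is a hypothetical helper, not part of getcol.
# It prints the input lines whose numbers fall in a spec such as 1-5,10-12.
select_lines() {
    awk -v spec="$1" '
    BEGIN {
        # Split the spec on commas, then each part on an optional hyphen.
        n = split(spec, parts, ",")
        for (k = 1; k <= n; k++) {
            m = split(parts[k], ends, "-")
            lo[k] = ends[1] + 0
            hi[k] = (m > 1 ? ends[2] : ends[1]) + 0
        }
    }
    {
        # Print the line once if its number lies inside any range.
        for (k = 1; k <= n; k++)
            if (NR >= lo[k] && NR <= hi[k]) { print; break }
    }'
}

# Made-up input of six lines; the spec selects lines 1, 2, and 5.
printf '%s\n' a b c d e f | select_lines 1-2,5
```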

Author
       Doug Mink, SAO (dmink@cfa.harvard.edu)

8 November 2001 						     WCSTools								 getcol(1)