Post 302911736 by redse171 on Saturday 2nd of August 2014 05:03:02 PM
How to get min and max values using awk?

Hi,

I need your help to get the min and max values from a file, depending on the value in $5.

File1
Code:
SP12.3	stc	2240806	2240808	+	ID1_N003	 ID2_N003T0
SP12.3	sto	2241682	2241684	+	ID1_N003	 ID2_N003T0
SP12.3	XE	2239943	2240011	+	ID1_N003	 ID2_N003T0
SP12.3	XE	2240077	2241254	+	ID1_N003	 ID2_N003T0
SP12.3	CD	2240806	2241254	+	ID1_N003	 ID2_N003T0
SP12.3	XE	2241471	2241684	+	ID1_N003	 ID2_N003T0
SP12.3	CD	2241471	2241681	+	ID1_N003	 ID2_N003T0
SP12.3	stc	2245127	2245129	+	ID1_N005	 ID2_N005T0
SP12.3	sto	2246954	2246956	+	ID1_N005	 ID2_N005T0
SP12.3	XE	2244762	2247195	+	ID1_N005	 ID2_N005T0
SP12.3	CD	2245127	2246953	+	ID1_N005	 ID2_N005T0
SP12.3	stc	2253115	2253117	-	ID1_N006	 ID2_N006T0
SP12.3	sto	2249759	2249761	-	ID1_N006	 ID2_N006T0
SP12.3	XE	2253090	2254054	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2253090	2253117	-	ID1_N006	 ID2_N006T0
SP12.3	XE	2252492	2252908	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2252492	2252908	-	ID1_N006	 ID2_N006T0
SP12.3	XE	2251730	2251882	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2251730	2251882	-	ID1_N006	 ID2_N006T0
SP12.3	XE	2251591	2251664	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2251591	2251664	-	ID1_N006	 ID2_N006T0
SP12.3	XE	2249887	2251530	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2249887	2251530	-	ID1_N006	 ID2_N006T0
SP12.3	XE	2249087	2249821	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2249762	2249821	-	ID1_N006	 ID2_N006T0
SP12.3	stc	2252073	2252075	-	ID1_N006	 ID2_N006T1
SP12.3	sto	2249759	2249761	-	ID1_N006	 ID2_N006T1
SP12.3	XE	2252492	2252973	-	ID1_N006	 ID2_N006T1
SP12.3	XE	2251730	2252227	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2251730	2252075	-	ID1_N006	 ID2_N006T1
SP12.3	XE	2251591	2251664	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2251591	2251664	-	ID1_N006	 ID2_N006T1
SP12.3	XE	2249887	2251530	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2249887	2251530	-	ID1_N006	 ID2_N006T1
SP12.3	XE	2249090	2249821	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2249762	2249821	-	ID1_N006	 ID2_N006T1
SP12.5	stc	3001307	3001309	+	ID1_N01140	ID2_N01140T0
SP12.5	sto	3005026	3005028	+	ID1_N01140	ID2_N01140T0
SP12.5	XE	3000439	3001397	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3001307	3001397	+	ID1_N01140	ID2_N01140T0
SP12.5	XE	3001572	3002765	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3001572	3002765	+	ID1_N01140	ID2_N01140T0
SP12.5	XE	3002821	3004797	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3002821	3004797	+	ID1_N01140	ID2_N01140T0
SP12.5	XE	3004855	3004929	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3004855	3004929	+	ID1_N01140	ID2_N01140T0
SP12.5	XE	3004994	3005417	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3004994	3005025	+	ID1_N01140	ID2_N01140T0

I tried the following code:

Code:
awk -F"\t" '$2=="CD"{if ($5~/\+/) {print $1"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7} else {print $1"\t"$4"\t"$3"\t"$5"\t"$6"\t"$7}}' file1

But the result shows every line containing the "CD" pattern, like below:
Code:
SP12.3	CD	2240806	2241254	+	ID1_N003	 ID2_N003T0
SP12.3	CD	2241471	2241681	+	ID1_N003	 ID2_N003T0
SP12.3	CD	2245127	2246953	+	ID1_N005	 ID2_N005T0
SP12.3	CD	2253090	2253117	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2252492	2252908	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2251730	2251882	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2251591	2251664	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2249887	2251530	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2249762	2249821	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2251730	2252075	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2251591	2251664	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2249887	2251530	-	ID1_N006	 ID2_N006T1
SP12.3	CD	2249762	2249821	-	ID1_N006	 ID2_N006T1
SP12.5	CD	3001307	3001397	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3001572	3002765	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3002821	3004797	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3004855	3004929	+	ID1_N01140	ID2_N01140T0
SP12.5	CD	3004994	3005025	+	ID1_N01140	ID2_N01140T0


The output I actually want shows only the min and max values for the "CD" lines, depending on the value in $5. If $5 is "+", then $3 of the first "CD" line and $4 of the last "CD" line for each ID2 ($7) should be printed in $3 and $4 of the output respectively. If $5 is "-", then $4 of the first "CD" line and $3 of the last "CD" line for each ID2 ($7) should be printed in $4 and $3 respectively, like below:

Code:
SP12.3	CD	2240806	2241681	+	ID1_N003	 ID2_N003T0
SP12.3	CD	2249762	2253117	-	ID1_N006	 ID2_N006T0
SP12.3	CD	2249762	2252075	-	ID1_N006	 ID2_N006T1
SP12.5	CD	3001307	3005025	+	ID1_N01140	ID2_N01140T0

If an ID2 ($7) has only one "CD" line, that group should be omitted from the output. I would appreciate any help with this. Thanks.
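
One possible way to get the output described above, offered as a minimal sketch rather than a definitive answer: remember the first and last "CD" line per ID2 ($7) in awk arrays, then print one summary line per group in the END block. It assumes the file is tab-separated and that "first" and "last" mean order of appearance in the file (not a numeric comparison). The sample written to a temporary file is a shortened, hypothetical copy of File1, and all variable and array names are illustrative.

```shell
# Hypothetical shortened sample of File1 (tab-separated), written to a temp file.
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  SP12.3 XE 2239943 2240011 + ID1_N003 ID2_N003T0 \
  SP12.3 CD 2240806 2241254 + ID1_N003 ID2_N003T0 \
  SP12.3 CD 2241471 2241681 + ID1_N003 ID2_N003T0 \
  SP12.3 CD 2245127 2246953 + ID1_N005 ID2_N005T0 \
  SP12.3 CD 2253090 2253117 - ID1_N006 ID2_N006T0 \
  SP12.3 CD 2252492 2252908 - ID1_N006 ID2_N006T0 \
  SP12.3 CD 2249762 2249821 - ID1_N006 ID2_N006T0 > "$sample"

result=$(awk -F'\t' -v OFS='\t' '
$2 == "CD" {
    id = $7
    if (!(id in count)) {            # first CD line seen for this ID2
        order[++n] = id              # remember groups in input order
        chrom[id] = $1; sign[id] = $5; id1[id] = $6
        first3[id] = $3; first4[id] = $4
    }
    count[id]++
    last3[id] = $3; last4[id] = $4   # overwritten until the last CD line
}
END {
    for (i = 1; i <= n; i++) {
        id = order[i]
        if (count[id] < 2) continue  # skip groups with a single CD line
        if (sign[id] == "+")
            print chrom[id], "CD", first3[id], last4[id], sign[id], id1[id], id
        else
            print chrom[id], "CD", last3[id], first4[id], sign[id], id1[id], id
    }
}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

For the real data, replace "$sample" with file1. Grouping on $7 rather than $6 keeps ID2_N006T0 and ID2_N006T1 separate, since both share ID1_N006.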

Last edited by redse171; 08-03-2014 at 06:20 PM.. Reason: for better sample and description
 
