How to get min and max values using awk? Post 302911767 by Don Cragun on Saturday 2nd of August 2014 11:51:44 PM
Old 08-03-2014
Quote:
Originally Posted by redse171
Hi,

I need your kind help to get min and max values from a file based on the value in $5.

File1
Code:
SP12.3	stc	2240806	2240808	+	ID1 N003	 ID2 N003T0
SP12.3	sto	2241682	2241684	+	ID1 N003	 ID2 N003T0
SP12.3	XE	2239943	2240011	+	ID1 N003	 ID2 N003T0
SP12.3	XE	2240077	2241254	+	ID1 N003	 ID2 N003T0
SP12.3	CD	2240806	2241254	+	ID1 N003	 ID2 N003T0
SP12.3	XE	2241471	2241684	+	ID1 N003	 ID2 N003T0
SP12.3	CD	2241471	2241681	+	ID1 N003	 ID2 N003T0
SP12.3	stc	2245127	2245129	+	ID1 N005	 ID2 N005T0
SP12.3	sto	2246954	2246956	+	ID1 N005	 ID2 N005T0
SP12.3	XE	2244762	2247195	+	ID1 N005	 ID2 N005T0
SP12.3	CD	2245127	2246953	+	ID1 N005	 ID2 N005T0
SP12.3	stc	2253115	2253117	-	ID1 N006	 ID2 N006T0
SP12.3	sto	2249759	2249761	-	ID1 N006	 ID2 N006T0
SP12.3	XE	2253090	2254054	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2253090	2253117	-	ID1 N006	 ID2 N006T0
SP12.3	XE	2252492	2252908	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2252492	2252908	-	ID1 N006	 ID2 N006T0
SP12.3	XE	2251730	2251882	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2251730	2251882	-	ID1 N006	 ID2 N006T0
SP12.3	XE	2251591	2251664	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2251591	2251664	-	ID1 N006	 ID2 N006T0
SP12.3	XE	2249887	2251530	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2249887	2251530	-	ID1 N006	 ID2 N006T0
SP12.3	XE	2249087	2249821	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2249762	2249821	-	ID1 N006	 ID2 N006T0
SP12.3	stc	2252073	2252075	-	ID1 N006	 ID2 N006T1
SP12.3	sto	2249759	2249761	-	ID1 N006	 ID2 N006T1
SP12.3	XE	2252492	2252973	-	ID1 N006	 ID2 N006T1
SP12.3	XE	2251730	2252227	-	ID1 N006	 ID2 N006T1
SP12.3	CD	2251730	2252075	-	ID1 N006	 ID2 N006T1
SP12.3	XE	2251591	2251664	-	ID1 N006	 ID2 N006T1
SP12.3	CD	2251591	2251664	-	ID1 N006	 ID2 N006T1
SP12.3	XE	2249887	2251530	-	ID1 N006	 ID2 N006T1
SP12.3	CD	2249887	2251530	-	ID1 N006	 ID2 N006T1
SP12.3	XE	2249090	2249821	-	ID1 N006	 ID2 N006T1
SP12.3	CD	2249762	2249821	-	ID1 N006	 ID2 N006T1

I tried the following code:

Code:
awk -F"\t" '$2=="CD"{if ($5~/\+/) {print $1"\t"$3"\t"$4"\t"$5"\t"$6"\t"$7} else {print $1"\t"$4"\t"$3"\t"$5"\t"$6"\t"$7}}' file1

But the result still shows every line containing the "CD" pattern. The output I actually want shows only the min and max values based on $5 (blue for "+" and red for "-"), as below:

Code:
SP12.3	CD	2240806	2241681	+	ID1 N003	 ID2 N003T0
SP12.3	CD	2249762	2253117	-	ID1 N006	 ID2 N006T0
SP12.3	CD	2249762	2252075	-	ID1 N006	 ID2 N006T1

If there is only one CD line for any ID2 ($7), that line should also be omitted. I would appreciate your help with this. Thanks.

It is no wonder that the results you are getting are not what you want. Your description of how to process the input is so vague that we do not understand what you want.

The code you showed us prints parts of every line with "CD" in the 2nd field. With -F"\t", each line has seven tab-separated fields ($6 is "ID1 N003" and $7 is " ID2 N003T0"), so for those lines it throws away only field 2; and, if $5 is not "+", it swaps fields 3 and 4 before printing the remainder of the line. But the output you say you want shows every field (keeping field 2). And if fields 3 and 4 have been swapped in it, it isn't obvious to me.
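As a quick check on the field numbering, the sketch below splits one sample line from File1 both ways. It assumes the file really is tab-delimited, as the -F"\t" in the original command suggests; the shell variable name `line` is only for illustration.

```shell
# One sample line from File1, built with printf so the tabs are explicit.
line=$(printf 'SP12.3\tCD\t2240806\t2241254\t+\tID1 N003\t ID2 N003T0')

# Field count and last field when splitting on tabs only.
printf '%s\n' "$line" | awk -F'\t' '{ print NF ": [" $NF "]" }'
# prints 7: [ ID2 N003T0]

# Field count and last field with the default whitespace splitting.
printf '%s\n' "$line" | awk '{ print NF ": [" $NF "]" }'
# prints 9: [N003T0]
```

So "$7" in the tab-delimited view covers what a whitespace-delimited view calls fields 8 and 9.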

You mentioned ID2 ($7), but it looks like you are looking for the minimum $3 value and the maximum $4 value for each different value in field 9 when fields are split on whitespace (that is, the last word of your tab-delimited $7). And from the data shown, I don't see that the + or - in field 5 makes any difference at all.
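If that reading of the request is right, a minimal sketch of the grouping could look like the following. It is a guess at the requirements, not a confirmed solution: it assumes the file is tab-delimited, takes the last word of tab field 7 (the 9th whitespace-delimited field) as the group key, ignores any change in $1, and prints groups in the order they first appear. The function name `cd_ranges` is made up for illustration.

```shell
# Hypothetical helper: min $3 and max $4 over the CD lines of each group.
cd_ranges() {
  awk -F'\t' -v OFS='\t' '
    $2 == "CD" {
      # Tab field 7 looks like " ID2 N003T0"; its last word is the
      # 9th whitespace-delimited field, used here as the group key.
      nw = split($7, word, " ")
      key = word[nw]
      if (!(key in count)) {
        order[++n] = key          # remember first-seen order
        lo[key] = $3
        hi[key] = $4
        first[key] = $0           # keep one full line as a template
      }
      if ($3 + 0 < lo[key] + 0) lo[key] = $3
      if ($4 + 0 > hi[key] + 0) hi[key] = $4
      count[key]++
    }
    END {
      for (i = 1; i <= n; i++) {
        key = order[i]
        if (count[key] < 2) continue   # a lone CD line is omitted
        split(first[key], f, "\t")
        print f[1], f[2], lo[key], hi[key], f[5], f[6], f[7]
      }
    }
  ' "$@"
}
```

Run it as `cd_ranges file1`. On the File1 sample quoted above, this should give the three lines shown as the desired output; N005T0 has only one CD line and is dropped. Everything is held in memory until the end, which matters if the real file is large.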

You have shown us data where fields 1, 6, and 8 are all constants. You have said that $1 may change, but you haven't given any indication of how, or if, that should affect the output produced.

Please give us a clear English description of what you are trying to do and explain what the meaning is for each of the fields in your file.

Also, much of the gene data that we're asked to help with comes in huge files. If that is the case here as well, any details you can give us about the data may help speed up the processing considerably. For example, what you have shown us looks like it could be sorted with field 1, 5, or 9 as the primary sort key. If data is to be grouped using field 9 as a key and the input is sorted on field 9, we can produce any needed output every time the contents of field 9 change (as opposed to accumulating all of the input into memory and processing everything at the end).
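To illustrate the difference, here is a rough sketch of the "output on every key change" approach. It only works under the assumption that all lines with the same key are adjacent in the input; it also assumes tab-delimited input and uses the last word of tab field 7 as the key. The names `cd_ranges_grouped` and `emit` are invented for this example.

```shell
# Hypothetical helper: streaming min/max per group, for input already
# grouped on the key. Only one group is held in memory at a time.
cd_ranges_grouped() {
  awk -F'\t' -v OFS='\t' '
    # Print the pending group if it had more than one CD line, then reset.
    function emit() {
      if (count > 1) print f1, "CD", lo, hi, f5, f6, f7
      count = 0
    }
    $2 == "CD" {
      nw = split($7, word, " ")
      key = word[nw]
      if (count > 0 && key != prev) emit()   # key changed: finish old group
      if (count == 0) {                      # start a new group
        prev = key; lo = $3; hi = $4
        f1 = $1; f5 = $5; f6 = $6; f7 = $7
      }
      if ($3 + 0 < lo + 0) lo = $3
      if ($4 + 0 > hi + 0) hi = $4
      count++
    }
    END { emit() }                           # finish the last group
  ' "$@"
}
```

Run it as `cd_ranges_grouped file1`. If the same key can reappear later in the file, this version would report it as separate groups, so an accumulate-everything approach (or sorting first) would be needed instead.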

We also need to know up front whether or not it is important that the output be in the same order as the input.

And, finally: just saying that the code you were given didn't give you accurate results is useless information. Show us the output you got, the output you wanted, and explain why (based on your description of what you wanted) the output you got was wrong! Help us help you!