Hi redse171,
Thanks for the update. That gives us a better idea of what you are trying to do, although the awk script you have shown us will not produce the output you showed us for the sample input you provided. (Your awk script doesn't copy the CD field to the output.)
I haven't dug into all of the details again yet, but I think that if we get answers to the following, we'll be able to help you write a script that will work:
1. Do you want the output to contain the "CD" field from the input?
2. Will all lines with the same combination of $5, $6, and $7 values be on contiguous lines in your input file? (The answer to this is "yes" for your sample input. Does it hold true for your real, huge input files?)
3. If the answer to #2 is no, does the order of lines in your output file matter?
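For illustration only, assuming the answers to #1 and #2 are both "yes": a per-group min/max pass could be sketched as below. Which fields hold the values to compare ($2 and $3 here) and which holds the CD marker ($4 here) are pure assumptions, since the real layout hasn't been posted yet.

```shell
awk '
{
    key = $5 SUBSEP $6 SUBSEP $7                     # group on the combination of $5, $6, $7
    if (!(key in lo) || $2 < lo[key]) lo[key] = $2   # running minimum of $2 per group
    if (!(key in hi) || $3 > hi[key]) hi[key] = $3   # running maximum of $3 per group
    cd[key] = $4                                     # carry the CD field through to the output
}
END {
    for (key in lo) {
        split(key, k, SUBSEP)
        print k[1], k[2], k[3], cd[key], lo[key], hi[key]
    }
}' input.txt
```

Because the state is keyed on all three fields, this version works even if the groups are not contiguous; only the output order of groups is then unspecified.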
Hi Don Cragun,
To answer your questions:
1. Yes, I need to have the "CD" field in my output file, as shown in my sample output.
2. Yes, for my huge input files too.
Thanks.
---------- Post updated at 10:12 AM ---------- Previous update was at 10:07 AM ----------
Quote:
Originally Posted by RudiC
Answers to Don Cragun's above question may kill the assumptions on which this is based. Try
Hi RudiC,
I tried your code and thanks so much for your explanations. It seems to work for my real input file, except that a few lines look a little bit weird. I am checking on it now and will play around with your code. Will give feedback as soon as possible. Thanks.
---------- Post updated at 09:25 PM ---------- Previous update was at 10:12 AM ----------
Hi,
Just to give feedback: I modified RudiC's code to suit my real data. The code worked well with the sample data, but there was an issue with the number and position of digits (values) in $3 and $4 in my real, huge file. So I split the LINE into segments and took the values from the segments (info from the awk manual). Thanks to RudiC for the code and explanations, which helped me understand better. Below is the modified code, which gave me the results I wanted.
My first code was not informative enough, as I didn't have any idea how to find the min and max from my input file; what I gave was just to extract all lines with CD patterns. The help that I got here is awesome and helped me learn and understand better. Thanks a lot!
Hi redse171,
I'm very glad that RudiC was able to help you find a solution to your problem. Note that if you need to use split() to correctly group your fields, you don't need to also use match() and substr() to determine whether you have a + or - in field 5; you can just look directly at seg[5] after you call split(). You can then simplify your code to something like:
and get the same results.
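Without the actual field layout from the thread to hand, a minimal sketch of the idea: split() the record into seg[] once, then compare seg[5] against "+" or "-" directly instead of calling match() and substr(). The "forward"/"reverse" labels are placeholders, not part of the original code.

```shell
awk '
{
    n = split($0, seg, " ")        # seg[1..n] now hold the grouped fields
    # no match()/substr() needed -- test seg[5] directly:
    if (seg[5] == "+")
        print "forward:", $0
    else if (seg[5] == "-")
        print "reverse:", $0
}' input.txt
```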
Hope this helps,
Don
Hi Don,
It does help! It's just that I needed to add a tiny part (in blue) at the printing part, or else it won't show $4 in my output.
Thanks a bunch
---------- Post updated at 09:32 AM ---------- Previous update was at 09:31 AM ----------
I am trying to get a simple min/max script to work with the below input. Note the special character (">") within it.
Script
awk 'BEGIN{max=0}{if(($1)>max) max=($1)}END {print max}'
awk 'BEGIN{min=0}{if(($2)<min) min=($2)}END {print min}'
Input
-122.2840 42.0009
-119.9950 ... (7 Replies)
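The two scripts quoted above have two pitfalls with this input: seeding max with 0 means it never updates when every $1 is negative, and the ">" lines feed non-numeric text into the comparisons. One way to sketch a fix (input.txt is a placeholder filename): guard with a numeric pattern and seed min/max from the first data line instead of from 0.

```shell
awk '
$1 ~ /^-?[0-9.]+$/ {            # skip ">" lines and anything else non-numeric
    if (!seen) { max = $1; min = $2; seen = 1; next }   # seed from the first data line
    if ($1 > max) max = $1
    if ($2 < min) min = $2
}
END { if (seen) print "max:", max, "min:", min }' input.txt
```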
I need to find the max/min of columns 1 and 2 of a 2-column file that contains the special character ">".
I know that this will find the max value of column 1.
awk 'BEGIN {max = 0} {if ($1>max) max=$1} END {print max}' input.file
But what if I needed to ignore special characters in the... (3 Replies)
aaa: 3 ms
aaa: 2 ms
aaa: 5 ms
aaa: 10 ms
..........
to get the min, avg, and max of 3 2 5 10 ...
something like
min: 2 ms avg: 5 ms max: 10 ms (2 Replies)
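One way to sketch this in awk, assuming every line has the form `aaa: N ms` so the value is always field 2 (input.txt is a placeholder filename):

```shell
awk '
{
    v = $2                                   # "aaa: 3 ms" -> the value is field 2
    sum += v; n++
    if (n == 1 || v < min) min = v           # seed min/max from the first line
    if (n == 1 || v > max) max = v
}
END { if (n) printf "min: %g ms avg: %g ms max: %g ms\n", min, sum / n, max }' input.txt
```

For the sample values 3 2 5 10 this prints `min: 2 ms avg: 5 ms max: 10 ms`.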
Hi,
I have a file which looks like this:
FID IID MISS_PHENO N_MISS N_GENO F_MISS
12AB43131 12AB43131 N 17774 906341 0.01961
65HJ87451 65HJ87451 N 10149 906341 0.0112
43JJ21345 43JJ21345 N 2826 906341 0.003118 I would... (11 Replies)
Hi guys,
I already searched on the forum, but I can't solve this on my own.
I have a lot of files like this:
And I need to print the line with the maximum value in the last column, but if the value is the same (2 in this example for the 3 last lines) I need to get the line with the minimum value in... (4 Replies)
Hi guys!
I'm new to scripting and I need to write a script in awk.
Here is example of file on which I'm working
ATOM 4688 HG1 PRO A 322 18.080 59.680 137.020 1.00 0.00
ATOM 4689 HG2 PRO A 322 18.850 61.220 137.010 1.00 0.00
ATOM 4690 CD ... (18 Replies)
Hi, I have an awk script and I managed to figure out how to search for the max value, but I'm having difficulty searching for the min field value.
BEGIN {FS=","; max=0}
NF == 7 {if (max < $6) max = $6;}
END { print man, min}
where $6 is the column of a field separated by a comma (3 Replies)
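The quoted script has two problems: `man` is a typo for `min`, and `min` is never computed (and seeding it with 0 would never update for positive data anyway). A sketch of a fix, assuming the 7-field comma-separated layout from the question (input.csv is a placeholder filename):

```shell
awk '
BEGIN { FS = "," }
NF == 7 {
    if (!seen) { min = max = $6; seen = 1 }  # seed both from the first matching line
    if ($6 > max) max = $6
    if ($6 < min) min = $6
}
END { print max, min }' input.csv
```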
Hello every one, I have following data
***CAMPAIGN 1998 CONTRIBUTIONS***
---------------------------------------------------------------------------
NAME PHONE Jan | Feb | Mar | Total Donated
... (12 Replies)