awk - mixed for and if to select particular lines in a data file


 
# 1  
Old 07-05-2013

Hi all,

I am new to AWK and I am trying to solve a problem that is probably easy for an expert. Suppose I have the following data file input.txt:
Code:
20 35 43
20 23 54
20 62 21

20.5 43 12
20.5 33 11
20.5 89 87

21 33 20
21 22 21
21 56 87

From each group of lines that share the same first-column value, I want to select the line with the smallest second-column value. That is, I would like the AWK script to produce the following file output.txt:
Code:
20 23 54
20.5 33 11
21 22 21

I have already tried to find an answer on many forums, but without success. Can you help me?

Last edited by vbe; 07-05-2013 at 01:24 PM.. Reason: code tags
# 2  
Old 07-05-2013
The following produces the output you requested but the order of the output is unspecified:
Code:
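awk '
# Skip blank lines and lines without a 2nd field.
NF < 2 { next }
# First line seen for this key, or a smaller 2nd field than the stored minimum.
!($1 in m) || m[$1] > $2 {
        m[$1] = $2      # smallest 2nd field so far for this key
        o[$1] = $0      # the whole line that holds it
}
# for (i in o) visits keys in an unspecified order.
END {   for(i in o) print o[i]
}' input.txt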

If you are using a Solaris system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of /usr/bin/awk or /bin/awk.

Your sample has all of the input with a given 1st column value grouped together, but your statement of requirements didn't say anything about this. The code above accepts input in any order.
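
If the output order matters for ungrouped input, one option is to record each 1st column value the first time it appears and print the results in that order. The following is only a sketch: the function name min_by_key, the order array, and the interleaved sample data are made up for illustration.

```shell
# min_by_key is a hypothetical helper name used only for this sketch.
min_by_key() {
    awk '
    NF < 2 { next }                     # skip blank or short lines
    !($1 in m) { order[++n] = $1 }      # remember keys in first-seen order
    !($1 in m) || m[$1] > $2 {          # new key, or a smaller 2nd field
            m[$1] = $2
            o[$1] = $0
    }
    END { for (i = 1; i <= n; i++) print o[order[i]] }
    ' "$@"
}

# Made-up sample with the groups deliberately interleaved.
min_by_key <<'EOF'
21 33 20
20 35 43
21 22 21

20 23 54
EOF
```

With this sample, the line for 21 is printed before the line for 20, because 21 appears first in the input. It still holds one line per distinct 1st column value in memory.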

If your input always has all lines with the same 1st column value on adjacent input lines, this script can be rewritten to produce output when the 1st column value changes. This would take fewer resources for large input files and would produce output in the same order as the input.
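
A rough sketch of that rewrite, assuming every group really is contiguous in the input. The function name min_per_group and the variable names are invented for this example:

```shell
# min_per_group is a hypothetical helper name used only for this sketch.
min_per_group() {
    awk '
    NF < 2 { next }                     # skip blank or short lines
    !seen || $1 != key {                # first data line, or the key changed
            if (seen) print best        # emit the group that just ended
            key = $1; min = $2; best = $0; seen = 1
            next
    }
    $2 < min { min = $2; best = $0 }    # smaller 2nd field within the group
    END { if (seen) print best }        # emit the last group
    ' "$@"
}

# The sample data from the first post.
min_per_group <<'EOF'
20 35 43
20 23 54
20 62 21

20.5 43 12
20.5 33 11
20.5 89 87

21 33 20
21 22 21
21 56 87
EOF
```

Only the current group's best line is kept in memory, and the output follows the input order. If the same 1st column value shows up again later in the file, it is treated as a new group and produces a second output line.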
# 3  
Old 07-05-2013
Try:
Code:
sort file | awk '!a[$1]++'

# 4  
Old 07-05-2013
Quote:
Originally Posted by bartus11
Try:
Code:
sort file | awk '!a[$1]++'

This provides an easy way to group 1st field values together, but it has two problems. First, it produces an empty output line that the OP doesn't seem to want. Second, because sort compares lines as text by default, it will only work correctly if all 2nd field values in each group have the same number of digits before the decimal point (if a decimal point occurs in any 2nd field value within a group) and have no leading plus-signs (+) unless all non-negative values in a group have a leading plus-sign. The digit-count problem can be fixed by sorting numerically on the first two fields. Note that sort's numeric comparison is not required to recognize a leading plus-sign (GNU sort does not), so plus-signs would still have to be stripped first. Getting rid of the blank line is also easy (if it matters):
Code:
sort -k1,1n -k2,2n input.txt | awk 'NF > 1 && !a[$1]++'
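
To make the digit-count problem concrete, here is a small sketch with a made-up group whose smallest 2nd field has fewer digits than the others. LC_ALL=C is set only to make the text ordering predictable; the data is hypothetical.

```shell
# Hypothetical group: the true minimum of the 2nd field is 9.
data=$(printf '%s\n' '20 23 54' '20 9 1' '20 100 7')

# Text comparison: "100" and "23" sort before "9", so the wrong line wins.
printf '%s\n' "$data" | LC_ALL=C sort | awk 'NF > 1 && !a[$1]++'
# prints: 20 100 7

# Numeric keys on fields 1 and 2 put the true minimum first.
printf '%s\n' "$data" | LC_ALL=C sort -k1,1n -k2,2n | awk 'NF > 1 && !a[$1]++'
# prints: 20 9 1
```

The awk part is the same in both pipelines; only the sort keys decide which line of each group is seen first.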

# 5  
Old 07-05-2013
Another approach, assuming grouped values in the first column, but maintaining order of input file:
Code:
awk '!NF{next} p!=$1{if(s!="")print s; m=$2; s=$0} $2<m{m=$2; s=$0} {p=$1} END{if(s!="")print s}' file


Last edited by Scrutinizer; 07-05-2013 at 04:06 PM..