Find smallest between replicates ID

07-23-2014

Hi all,
I need to find the line with the smallest value (column 3) for each replicated ID (column 1).
Input file:

Code:

a name1 1200
a name2 800
b name1 100
b name2 150
b name3 4

Desired output:

Code:

a name2 800
b name3 4

Do you have any suggestions?

Thank you!

giuliangiuseppe

07-23-2014

Given what you have learned from your earlier thread, "Output minimum and maximum values for replicates ID", what have you tried to solve this problem on your own?

Don Cragun

07-23-2014

Hi Don Cragun, and thank you for your reply!
Unfortunately, the command from the previous thread does not work (I resolved that issue on my own with a completely different approach).

The command was:

Code:

awk '{idx=$1 FS $2}FNR==1{a3[idx]=$3}{a3[idx]=(a3[idx]>$3)?a3[idx]:$3;a4[idx]=($4>a4[idx])?$4:a4[idx]} END{for(i in a3)print i,a3[i],a4[i]}' myFile

With myFile:

Code:

a x 1 4
a x 2 5
b x 5 10
b x 6 12
c x 8 15
c x 6 12

The output is:

Code:

a x 2 5
b x 6 12
c x 8 15

As you can see, column 3 does not report the smallest value.
I tried changing the command a little, without success.

Giuliano

giuliangiuseppe

07-23-2014

Yes. In your previous thread you wanted to print the maximum value for the 4th column and the minimum value for the 3rd column. Now you have an easier job; you just want to print the line that has the minimum value for the 3rd column (and there is no 4th column).

How did you try to change that code to get what you need for this problem?

What did it do?

Don Cragun

07-23-2014

I tried this one (assuming a file with 2 columns, where the first column is the ID):

Code:

awk '{idx=$1}FNR==1{a3[idx]=$2}{a3[idx]=(a3[idx]>$2)?$2:a3[idx]} END{for(i in a3)print i,a3[i]}' myFile

But there is a problem: the command outputs only the first line!

giuliangiuseppe

07-23-2014

This might help you

Code:

awk '{
	# the replicate ID is column 1
	col = $1

	# the value to be compared is column 3
	value = $3

	# count how many records share this ID
	rep[col]++

      }
      {
	# if col is not yet an index of array a, or it is but the stored
	# element is greater than the current line value ($3), then
	# update array a
	if(!(col in a) || a[col] > value)
	{
		a[col] = value

		# Here we set the required o/p; you can also write $1 OFS $2 etc.
		# Indexed by col (not by value) so that two IDs sharing the
		# same minimum cannot overwrite each other; used in END block
		output[col] = $0
	}

      }
   END{
	# Loop through rep array
	for(i in rep)
	{
		# if the count is greater than 1 then the ID is replicated,
		# so print the line saved in array output for index i
		if(rep[i]>1)
			print output[i]
	}
      }'    file

Akshay Hegde

07-23-2014

Quote:

Originally Posted by giuliangiuseppe

I tried this one (assuming a file with 2 columns, where the first column is the ID):

Code:

awk '{idx=$1}FNR==1{a3[idx]=$2}{a3[idx]=(a3[idx]>$2)?$2:a3[idx]} END{for(i in a3)print i,a3[i]}' myFile

But there is a problem: the command outputs only the first line!

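The FNR==1{a3[idx]=$2} block in the code quoted above (the only portion of your code that stores an initial value in the array a3) is only executed when FNR==1 (i.e., only when you are looking at the 1st line of the current input file). For every other ID, a3[idx] starts out empty and compares as 0, so a positive value from the file never replaces it. So, when you print the array at the end, only that one element has a value.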
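As a minimal sketch of one way to repair the quoted attempt, assuming the same two-column layout (ID in column 1, a numeric value in column 2): seed a3[idx] the first time each ID is seen instead of only on the first line of the file. The function name min_two_col and the sample lines fed to it are made up for illustration, and the "+ 0" is an added assumption that forces a numeric comparison.

```shell
# Sketch only: per-ID minimum for a two-column file (ID, numeric value).
# min_two_col is a hypothetical helper name, not something from the thread.
min_two_col() {
  awk '
  { idx = $1 }
  !(idx in a3) { a3[idx] = $2 + 0 }     # first record for this ID: seed the minimum
  { a3[idx] = (a3[idx] > $2 + 0) ? $2 + 0 : a3[idx] }   # keep the smaller value
  END { for (i in a3) print i, a3[i] }  # visiting order is unspecified
  ' "$@"
}

# Illustrative input; sort makes the line order predictable.
printf '%s\n' 'a 1200' 'a 800' 'b 100' 'b 150' 'b 4' | min_two_col | sort
```

The test `idx in a3` does not create the element, which is why it can be used to detect the first record for each ID.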

The following uses similar logic to the code provided by Akshay Hegde, but will also print a line for keys that only appear once in your input file:

Code:

awk '
!($1 in d) || f3[$1] > $3 {
	d[$1] = $0
	f3[$1] = $3
}
END {	for(i in d)
		print d[i]
}' myFile

which produces:

Code:

a name2 800
b name3 4

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
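One caveat with the script above: awk does not define the order in which for(i in d) visits the array, so the output lines are not guaranteed to follow the input order. A minimal sketch of one way to keep the IDs in first-seen order, using an extra array (the names order, n, k, and min_by_id are illustrative and not from the thread; the "+ 0" is an added assumption that forces a numeric comparison):

```shell
# Sketch only: print, for each ID in column 1, the line with the smallest
# 3rd field, with IDs in the order they first appear in the input.
min_by_id() {
  awk '
  !($1 in d) { order[++n] = $1 }        # remember each ID the first time it appears
  !($1 in d) || $3 + 0 < f3[$1] {       # first record for this ID, or a smaller 3rd field
    d[$1]  = $0
    f3[$1] = $3 + 0
  }
  END { for (k = 1; k <= n; k++) print d[order[k]] }
  ' "$@"
}

# The sample input from the first post.
printf '%s\n' 'a name1 1200' 'a name2 800' 'b name1 100' 'b name2 150' 'b name3 4' | min_by_id
```

Because the comparison is a strict "less than", the first line wins when two lines for the same ID share the minimum value.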

Don Cragun
