How to capture 2 consecutive rows when a condition is true?

10-05-2006

Try this:

Code:

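awk '
# Group 1: first field is all digits and the last field is 0
$1 ~ "^[0-9][0-9]*$" && $NF == 0   { grp1=grp1+$(NF-1); tot1=tot1+1 }
# Group 2: first field is digits followed by c and the last field is 0
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
     # Skip the average for an empty group to avoid division by zero
     if (tot1) printf("Group1\nAverage = %f \n",grp1/tot1)
     if (tot2) printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file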

Last edited by anbu23; 10-05-2006 at 01:13 PM..

anbu23

10-05-2006

Code:

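average_c = []  # values from "c" groups
average = []    # values from non-"c" groups

with open("test.txt") as f:
    data = f.readlines()

# Stop one line early so that data[i + 1] always exists
for i in range(len(data) - 1):
    if "x= 1" in data[i]:
        nextline_splitted = data[i + 1].split()
        if len(nextline_splitted) < 2:
            continue  # not enough fields on the following line
        value = int(nextline_splitted[-2])  # second-to-last field
        if nextline_splitted[0].endswith("c"):
            average_c.append(value)
        else:
            average.append(value)

# Divide by the actual group size; // keeps whole-number averages
if average:
    print("Group 1: ", sum(average) // len(average))
if average_c:
    print("Group 2: ", sum(average_c) // len(average_c))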

Output:

Code:

D:\yhlee\test>python test.py
Group 1:  2826
Group 2:  3804

Last edited by ghostdog74; 10-07-2006 at 04:32 AM..

ghostdog74

10-07-2006

I'll try to explain all awk and sed code in this thread:

Quote:

Originally Posted by vish_indian

Code:

awk '{if($0~"x= 1"){flag=1; print} else{ if(flag==1){ print $0"\n"}; flag=0}}' awtest

if($0~"x= 1"): $0 is the whole input line that awk is currently processing and ~ is the match operator, so this tests whether x= 1 occurs anywhere in the line. If it does, flag is set to 1 and the whole line is printed. Otherwise the else branch runs: if(flag==1) checks the flag, and a value of 1 means the previous line contained x= 1. Since we want two consecutive lines for every match, the current line is printed as well, followed by a "\n" so that a blank line separates the pairs. Finally flag is reset to 0 so that the process can repeat for the next match.
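
For anyone who finds the flag logic easier to follow outside awk, here is a rough Python sketch of the same idea. The function name capture_pairs and the sample lines are made up for illustration, and it uses a plain substring test instead of a regex match, which is equivalent for a literal like x= 1.

```python
def capture_pairs(lines, needle="x= 1"):
    """Collect each matching line and the line after it, then a blank separator."""
    out = []
    flag = False  # True when the previous line contained the needle
    for line in lines:
        if needle in line:
            flag = True
            out.append(line)
        else:
            if flag:
                out.append(line)
                out.append("")  # blank line, like the extra "\n" in the awk code
            flag = False
    return out


# Hypothetical sample input, not taken from the thread
sample = ["a x= 1", "row A", "other", "b x= 1", "row B"]
print("\n".join(capture_pairs(sample)))
```

Note that every line is still tested against the pattern, so two matching lines in a row are both printed before the follow-up line.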

Quote:

Originally Posted by shereenmotor

Code:

awk '$0 ~ /^.*x= 1.*/{ print; getline; print $0"\n" }' awtest

$0 ~ /^.*x= 1.*/ matches the input line against the pattern x= 1 (the leading ^.* and trailing .* are redundant, so /x= 1/ matches the same lines). If a match is found, { print; getline; print $0"\n" } runs: print without any arguments prints the whole input line, i.e. the line where x= 1 was found; getline then reads the very next input line into $0; and print $0"\n" prints that line followed by an extra "\n" character.
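
The "read the next line right away" idea can be sketched in Python with an explicit iterator, where next() plays the role of getline. This is only an illustration under my own assumptions: capture_with_next and the sample data are hypothetical names, and the sketch simply stops if the match happens to be on the last line.

```python
def capture_with_next(lines, needle="x= 1"):
    """Collect each matching line plus the line read immediately after it."""
    out = []
    it = iter(lines)
    for line in it:
        if needle in line:
            out.append(line)
            # Like getline: consume the next line here, so the loop never tests it
            following = next(it, None)
            if following is not None:
                out.append(following)
                out.append("")  # blank separator after each pair
    return out


# Hypothetical sample input, not taken from the thread
sample = ["a x= 1", "row A", "other", "b x= 1", "row B"]
print("\n".join(capture_with_next(sample)))
```

The difference from the flag version is that the consumed line is never checked against the pattern, so a match directly after another match is treated as an ordinary follow-up line.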

Quote:

Originally Posted by shereenmotor

Code:

sed -n '/^.*x= 1.*/{N;G;p;}' awtest

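sed -n '/^.*x= 1.*/{N;G;p;}': the -n flag suppresses sed's default printing. When the regex /^.*x= 1.*/ matches, the N command appends the next input line to the pattern space, the G command appends a "\n" followed by the contents of the hold space (empty here, so the net effect is a trailing blank line), and p prints everything in the pattern space. The pattern space is not emptied by p itself; it is replaced when the next cycle reads a new line.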
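
To make the pattern space and hold space a bit more concrete, here is a small Python model of what N, G and p do in that one-liner. It is a simplified sketch, not how sed is implemented: sed_n_N_G_p, pattern_space and hold_space are names I made up, and the match is a plain substring test.

```python
def sed_n_N_G_p(lines, needle="x= 1"):
    """Model of: sed -n '/x= 1/{N;G;p;}' returning each printed pattern space."""
    hold_space = ""  # the script never writes to it, so it stays empty
    printed = []
    it = iter(lines)
    for pattern_space in it:  # each cycle starts with one input line
        if needle in pattern_space:
            following = next(it, None)
            if following is None:
                break  # N found no next line; under -n nothing more is printed
            pattern_space += "\n" + following   # N: append newline + next line
            pattern_space += "\n" + hold_space  # G: append newline + hold space
            printed.append(pattern_space)       # p: print the pattern space
    return printed


# Hypothetical sample input, not taken from the thread
for chunk in sed_n_N_G_p(["a x= 1", "row A", "other"]):
    print(chunk)
```

Each printed chunk ends with a newline of its own because of G, which is where the blank line between pairs comes from.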

Quote:

Originally Posted by anbu23

Code:

awk ' 
$1 ~ "^[0-9][0-9]*$" && $NF == 0   { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
     printf("Group1\nAverage = %f \n",grp1/tot1)
     printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file

$1 ~ "^[0-9][0-9]*$" checks whether the first field of the input line consists of digits [0-9] only. NF is a built-in variable that holds the number of fields in the current line, so $NF is the value of the last field. The whole expression ($1 ~ "^[0-9][0-9]*$" && $NF == 0) is therefore true when the first field is all digits and the last field is equal to 0.

When it is true, { grp1=grp1+$(NF-1); tot1=tot1+1 } runs: $(NF-1) is the value of the second-to-last field, so that value is added to the variable grp1 and the count for group 1 (tot1) is incremented.

$1 ~ "^[0-9][0-9]*c$" tells awk to match a first field made of digits and ending in a c, like 0001c in your case; the rest works the same way with grp2 and tot2. Finally, the END rule prints the average of each group by dividing the group's total by its count.
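
The same grouping and averaging can be sketched in Python if that is easier to experiment with. Everything below is illustrative: group_averages, is_zero and the sample rows are my own inventions, and the sketch assumes the second-to-last field is always numeric.

```python
import re


def is_zero(text):
    """True if text parses as a number equal to 0 (roughly awk's $NF == 0)."""
    try:
        return float(text) == 0
    except ValueError:
        return False


def group_averages(lines):
    """Average the second-to-last field per group, for rows whose last field is 0."""
    patterns = {
        "Group1": re.compile(r"[0-9]+"),    # with fullmatch: ^[0-9][0-9]*$
        "Group2c": re.compile(r"[0-9]+c"),  # with fullmatch: ^[0-9][0-9]*c$
    }
    totals = {name: 0.0 for name in patterns}
    counts = {name: 0 for name in patterns}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or not is_zero(fields[-1]):
            continue
        for name, pattern in patterns.items():
            if pattern.fullmatch(fields[0]):
                totals[name] += float(fields[-2])  # $(NF-1)
                counts[name] += 1
    # Leave out empty groups instead of dividing by zero
    return {name: totals[name] / counts[name] for name in patterns if counts[name]}


# Hypothetical sample rows, not taken from the thread
sample = ["0001 a 10 0", "0001c a 30 0", "0002 a 20 0", "0002c a 50 0", "0003 a 99 1"]
print(group_averages(sample))  # {'Group1': 15.0, 'Group2c': 40.0}
```

The last sample row is skipped because its final field is 1, which mirrors the $NF == 0 condition.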

Regards,
Tayyab

tayyabq8

10-08-2006

Hi Tayyab,

You are such a professional at this! Thanks a lot for your very comprehensive explanation! Just another simple question for you: I notice that sometimes you have this "^" character, e.g. in ($1 ~ "^[0-9][0-9]*$" && $NF == 0). What does the "^" mean in that statement?

Raynon

10-08-2006

Quote:

Originally Posted by Raynon

^ matches the start of the string when it appears at the start of the regular expression.

$ matches the end of the string when it appears at the end of the regular expression.
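
A quick way to see what the two anchors change is to compare an anchored and an unanchored version of the same pattern. The snippet below is a small Python illustration with made-up field values; Python's re module treats ^ and $ the same way for single-line strings like these.

```python
import re

anchored = re.compile(r"^[0-9][0-9]*$")  # the whole field must be digits
unanchored = re.compile(r"[0-9][0-9]*")  # digits anywhere in the field

# Hypothetical field values, not taken from the thread
for field in ["0001", "0001c", "ab12"]:
    print(field, bool(anchored.search(field)), bool(unanchored.search(field)))
# 0001 True True
# 0001c False True
# ab12 False True
```

Without the anchors, 0001c would also be counted in the first group, which is why both ^ and $ are needed there.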

anbu23

10-30-2006

Quote:

Originally Posted by anbu23

try this

Code:

awk ' 
$1 ~ "^[0-9][0-9]*$" && $NF == 0   { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
     printf("Group1\nAverage = %f \n",grp1/tot1)
     printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file

Hi, I have a problem executing this code.

The error message is:
$ testing1
Unmatched '

My code below is exactly the same as the above, and I am running it in a Solaris environment. Please help.

#!/bin/csh

nawk '
$1 ~ "^[0-9][0-9]*$" && $NF == 0 { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
printf("Group1\nAverage = %f \n",grp1/tot1)
printf("Group2c\nAverage = %f\n",grp2/tot2)
}' xxx.txt

Raynon

10-30-2006

Code:

awk '
$1 ~ "^[0-9][0-9]*$" && $NF == 0 { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
printf("Group1\nAverage = %f \n",grp1/tot1)
printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file

I think you might have missed the closing ' shown above.

anbu23
