How to capture 2 consecutive rows when a condition is true?


 
# 8  
Old 10-05-2006
try this
Code:
awk ' 
$1 ~ "^[0-9][0-9]*$" && $NF == 0   { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
     printf("Group1\nAverage = %f \n",grp1/tot1)
     printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file


Last edited by anbu23; 10-05-2006 at 01:13 PM..
# 9  
Old 10-05-2006
Code:
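average_c = []  # values from groups whose ID ends in "c"
average = []    # values from all other groups
with open("test.txt") as f:
    data = f.readlines()
# Stop one line early so that data[i + 1] always exists.
for i in range(len(data) - 1):
    if "x= 1" in data[i]:
        nextline_splitted = data[i + 1].split()
        if not nextline_splitted:
            continue  # the line after the match is blank, nothing to parse
        # The second-to-last field holds the value to average.
        if nextline_splitted[0].endswith("c"):
            average_c.append(int(nextline_splitted[-2]))
        else:
            average.append(int(nextline_splitted[-2]))

# Integer averages; a group with no rows would raise ZeroDivisionError here.
print("Group 1: ", sum(average) // len(average))
print("Group 2: ", sum(average_c) // len(average_c))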

Output:
Code:
D:\yhlee\test>python test.py
Group 1:  2826
Group 2:  3804


Last edited by ghostdog74; 10-07-2006 at 04:32 AM..
# 10  
Old 10-07-2006
I'll try to explain all awk and sed code in this thread:
Quote:
Originally Posted by vish_indian
Code:
awk '{if($0~"x= 1"){flag=1; print} else{ if(flag==1){ print $0"\n"}; flag=0}}' awtest

if($0~"x= 1"): $0 holds the whole input line fed to awk, and ~ is the match operator, so the condition is true when x= 1 appears anywhere in the input line. In that case the flag variable is set to 1 and the whole line is printed. else{ if(flag==1): when awk does not find x= 1 in the current line, it checks the value of flag. If it equals 1, the previous line contained x= 1 somewhere in it, and since we have to print 2 consecutive lines whenever the first one matches, the current line is printed as well, followed by an extra "\n". The flag is then reset to 0 so the process can repeat for the next match.
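For anyone who finds the flag logic easier to follow in another language, here is a minimal Python sketch of the same idea. The function name capture_pairs and the sample lines are made up for illustration, they are not from the original data file. A plain substring test stands in for awk's ~ because x= 1 contains no regex metacharacters.

```python
def capture_pairs(lines, needle="x= 1"):
    """Return each line containing `needle` plus the line right after it."""
    out = []
    flag = False  # True when the previous line contained the needle
    for line in lines:
        if needle in line:
            flag = True
            out.append(line)
        else:
            if flag:
                out.append(line)
                out.append("")  # blank separator, like the extra "\n" in awk
            flag = False
    return out


# Hypothetical sample input, only to show the behaviour.
sample = [
    "header",
    "0001 x= 1 y= 2",
    "0001 10 20 0",
    "0002 x= 3 y= 4",
    "0002 30 40 0",
]
print("\n".join(capture_pairs(sample)))
```

Note that when two matching lines follow each other, both are printed and the flag stays set, so the first non-matching line after them is printed too.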
Quote:
Originally Posted by shereenmotor
Code:
awk '$0 ~ /^.*x= 1.*/{ print; getline; print $0"\n" }' awtest

$0 ~ /^.*x= 1.*/ matches the input line against the pattern x= 1 (the leading ^.* and trailing .* are redundant, /x= 1/ selects the same lines). If a match is found, { print; getline; print $0"\n" } runs: print without any arguments prints the whole input line, so it prints the line where x= 1 was found; getline then reads the very next input line into $0; and print $0"\n" prints that line followed by an extra "\n" character.
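The getline variant can be sketched in Python by pulling the following line from the same iterator. This is only an illustration under my own assumptions: the name capture_with_lookahead is hypothetical, and the sketch simply emits the match alone when it is the last line of the input.

```python
def capture_with_lookahead(lines, needle="x= 1"):
    """Return each matching line plus the next line, consuming that next line."""
    out = []
    it = iter(lines)
    for line in it:
        if needle in line:
            out.append(line)
            nxt = next(it, None)  # read the following line, similar to getline
            if nxt is not None:
                out.append(nxt)
                out.append("")  # blank separator, like the extra "\n" in awk
    return out


# Hypothetical input with two matching lines in a row.
print(capture_with_lookahead(["x= 1 a", "x= 1 b", "c"]))
```

The difference from the flag version shows up with back-to-back matches: the line read by the lookahead is never tested against the pattern itself, so here "c" is not captured.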
Quote:
Originally Posted by shereenmotor
Code:
sed -n '/^.*x= 1.*/{N;G;p;}' awtest

sed -n '/^.*x= 1.*/{N;G;p;}': the -n flag suppresses sed's default printing. If the regex /^.*x= 1.*/ matches, the N command appends the next input line to the pattern space, the G command appends a "\n" followed by the contents of the hold space (empty here, so in effect just a "\n"), and p prints everything in the pattern space. The pattern space is then cleared when the next cycle starts.
Quote:
Originally Posted by anbu23
Code:
awk ' 
$1 ~ "^[0-9][0-9]*$" && $NF == 0   { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
     printf("Group1\nAverage = %f \n",grp1/tot1)
     printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file

$1 ~ "^[0-9][0-9]*$" checks whether the first field of the input line consists of digits [0-9] only. The built-in variable NF holds the total number of fields in the input line, so $NF is the value of the last field. The whole expression ($1 ~ "^[0-9][0-9]*$" && $NF == 0) is therefore true when the first field contains digits only and the last field equals 0. In that case { grp1=grp1+$(NF-1); tot1=tot1+1 } runs: $(NF-1) is the value of the second-to-last field, so that value is added to the variable grp1 and the count for group 1 (tot1) is incremented. $1 ~ "^[0-9][0-9]*c$" tells awk to match a first field made of digits and ending in a c, like 0001c in your case; the rest works the same way. Finally, the END rule prints the average of each group by dividing the group's total by its count.
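The same bookkeeping can be sketched in Python. Treat this as an illustration rather than a drop-in replacement: the function name group_averages and the sample rows are invented, and unlike awk the sketch skips rows whose last two fields are not numeric and returns None for a group with no rows instead of dividing by zero.

```python
import re

PLAIN_ID = re.compile(r"^[0-9][0-9]*$")   # digits only, e.g. 0001
C_ID = re.compile(r"^[0-9][0-9]*c$")      # digits followed by c, e.g. 0001c


def group_averages(lines):
    """Average the second-to-last field per group, for rows ending in 0."""
    sums = {"plain": 0.0, "c": 0.0}
    counts = {"plain": 0, "c": 0}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            last = float(fields[-1])
            value = float(fields[-2])
        except ValueError:
            continue  # assumption: ignore rows with non-numeric fields
        if last != 0:
            continue  # mirrors the $NF == 0 test
        if PLAIN_ID.match(fields[0]):
            key = "plain"
        elif C_ID.match(fields[0]):
            key = "c"
        else:
            continue
        sums[key] += value
        counts[key] += 1
    return {k: (sums[k] / counts[k] if counts[k] else None) for k in sums}


# Hypothetical rows, only to exercise both groups.
sample = [
    "0001 5 10 0",
    "0002 5 30 0",
    "0001c 5 7 0",
    "0003 5 99 1",   # last field is not 0, so the row is ignored
    "name x= 1",     # first field is not a numeric ID
]
print(group_averages(sample))
```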

Regards,
Tayyab
# 11  
Old 10-08-2006
Hi Tayyab,

You are such a professional at this!! Thanks a lot for your very comprehensive explanation! Just another simple question for you. I notice that you sometimes have this "^" character, e.g. in ($1 ~ "^[0-9][0-9]*$" && $NF == 0). What does this ^ mean in the statement?
# 12  
Old 10-08-2006
Quote:
Originally Posted by Raynon
Hi Tayyab,

You are such a professional at this!! Thanks a lot for your very comprehensive explanation! Just another simple question for you. I notice that you sometimes have this "^" character, e.g. in ($1 ~ "^[0-9][0-9]*$" && $NF == 0). What does this ^ mean in the statement?
^ matches the start of the string when it appears at the start of the regular expression.

$ matches the end of the string when it appears at the end of the regular expression.
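A quick way to see what the two anchors do is to try the same pattern with and without them. The sketch below uses Python's re module rather than awk, and the helper name is_all_digits is made up for the example. One Python-specific caveat: $ there also matches just before a trailing newline, which does not matter for these inputs.

```python
import re

ANCHORED = re.compile(r"^[0-9][0-9]*$")   # the whole string must be digits
UNANCHORED = re.compile(r"[0-9][0-9]*")   # digits anywhere in the string


def is_all_digits(text):
    """True when text consists of one or more digits and nothing else."""
    return bool(ANCHORED.search(text))


for text in ["0001", "0001c", "a0001", ""]:
    print(repr(text), is_all_digits(text))

# Without the anchors, a run of digits anywhere is enough for a match.
print(bool(UNANCHORED.search("a0001c")))
```

Only "0001" passes the anchored test, while the unanchored pattern also accepts "a0001c". That is why the awk pattern for group 1 does not pick up IDs such as 0001c.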
# 13  
Old 10-30-2006
Quote:
Originally Posted by anbu23
try this
Code:
awk ' 
$1 ~ "^[0-9][0-9]*$" && $NF == 0   { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
     printf("Group1\nAverage = %f \n",grp1/tot1)
     printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file


Hi, I have some problems executing this code.

The error message is
$ testing1
Unmatched '


My code is below. It is exactly the same as above, and I am running it in a Solaris environment. Please help.

#!/bin/csh

nawk '
$1 ~ "^[0-9][0-9]*$" && $NF == 0 { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
printf("Group1\nAverage = %f \n",grp1/tot1)
printf("Group2c\nAverage = %f\n",grp2/tot2)
}' xxx.txt
# 14  
Old 10-30-2006
awk '
$1 ~ "^[0-9][0-9]*$" && $NF == 0 { grp1=grp1+$(NF-1); tot1=tot1+1 }
$1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2=grp2+$(NF-1); tot2=tot2+1 }
END {
printf("Group1\nAverage = %f \n",grp1/tot1)
printf("Group2c\nAverage = %f\n",grp2/tot2)
}' file

I think you might have missed the ' above.