I'll try to explain all awk and sed code in this thread:
Quote:
Originally Posted by vish_indian
if($0~"x= 1") $0 represents the whole input line fed to awk and ~ is the match operator, so this reads: if x= 1 is found anywhere in the input line, set the flag variable to 1 and print the whole line. else{ if(flag==1) Otherwise, if awk cannot find x= 1 in the input line, check the value of flag. If it is 1, the previous line contained x= 1 somewhere in it. Since we have to print two consecutive lines of input whenever the first one matches, print this second line followed by a "\n", then reset flag to 0 so the process can repeat.
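The flag logic described above can be sketched as follows. This is a reconstruction from the explanation, not the code as originally posted, and the sample input lines are made up for illustration:

```shell
# Hypothetical sample input; the real data from the thread is not shown here.
out=$(printf 'a\nx= 1\nfirst follower\nb\nx= 1\nsecond follower\n' | awk '{
    if ($0 ~ "x= 1") {          # x= 1 found anywhere in the line
        flag = 1
        print                   # print the matching line
    } else if (flag == 1) {     # the previous line matched
        print $0 "\n"           # print the following line plus an empty line
        flag = 0                # reset so the process can repeat
    }
}')
printf '%s\n' "$out"
```

Lines that neither match nor follow a match (a and b here) are dropped.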
Quote:
Originally Posted by shereenmotor
$0 ~ /^.*x= 1.*/ Match the input line against the pattern x= 1 (the leading ^.* and trailing .* are redundant, /x= 1/ matches the same lines). If a match is found, { print; getline; print $0"\n" } print without any arguments prints the whole input line, so it prints the line where x= 1 was found. getline then reads the very next line into $0, and print $0"\n" prints that line followed by a "\n" character.
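A small runnable sketch of the getline approach, using made-up input. The check on getline's return value is my addition and is not in the quoted code: without it, a match on the last line of the file would print that line twice, because getline leaves $0 unchanged at end of input.

```shell
# Hypothetical sample input; the final line matches but has no follower.
out=$(printf 'a\nx= 1\nnext line\nb\nx= 1\n' | awk '
    $0 ~ /^.*x= 1.*/ {
        print                       # the matching line
        if ((getline) > 0)          # read the next line into $0, if there is one
            print $0 "\n"           # that line plus an empty line
    }')
printf '%s\n' "$out"
```

Note that getline consumes the next line, so if that line also contains x= 1 it is printed as the follower and not tested as a new match. The flag-based version behaves differently in that case.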
Quote:
Originally Posted by shereenmotor
sed -n '/^.*x= 1.*/ The -n flag suppresses sed's default printing. If the regex /^.*x= 1.*/ matches, the N command appends the next line to the pattern space, the G command appends a "\n" followed by the contents of the hold space (empty here, so in effect just a "\n"), and p prints everything in the pattern space. The pattern space is then emptied when sed starts its next cycle.
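The quoted sed command is cut off above, so the following is only a sketch assembled from the N, G and p commands the explanation names. The braces and the sample input are assumptions:

```shell
# Hypothetical sample input with two matches, each followed by another line.
out=$(printf 'a\nx= 1\nnext line\nb\nx= 1\nlast\n' |
    sed -n '/^.*x= 1.*/{
        N
        G
        p
    }')
# N: append the next input line to the pattern space
# G: append a newline plus the (empty) hold space
# p: print the pattern space
printf '%s\n' "$out"
```

If the matching line is the last line of input, N has nothing to read and sed stops. With -n in effect, that final line is then not printed at all.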
Quote:
Originally Posted by anbu23
$1 ~ "^[0-9][0-9]*$" Check whether the first field of the input line contains digits [0-9] only. The built-in variable NF holds the number of fields in the input line, so $NF is the value of the last field. The whole expression ($1 ~ "^[0-9][0-9]*$" && $NF == 0) therefore means: if the first field contains digits only and the last field is equal to 0, then { grp1=grp1+$(NF-1); tot1=tot1+1 } $(NF-1) is the value of the second-to-last field, so this adds the second-to-last field to the variable grp1 and increments group 1's count (tot1). $1 ~ "^[0-9][0-9]*c$" tells awk to match the first field against digits followed by a trailing c, like 0001c in your case. The rest is the same, and the END rule prints the average of each group by dividing the group's total by its count.
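Put together, the two rules and the END block might look like the sketch below. The names grp2 and tot2, the output labels, and the sample rows are assumptions for illustration; only grp1 and tot1 appear in the explanation above.

```shell
# Hypothetical rows: ID, a label, a value (second-to-last field), a status (last field).
out=$(printf '0001 a 10 0\n0002 b 20 0\n0001c a 30 0\n0002c b 50 0\n0003 c 99 1\n' | awk '
    # group 1: first field is digits only, last field is 0
    $1 ~ "^[0-9][0-9]*$"  && $NF == 0 { grp1 = grp1 + $(NF-1); tot1 = tot1 + 1 }
    # group 2: first field is digits followed by c, last field is 0
    $1 ~ "^[0-9][0-9]*c$" && $NF == 0 { grp2 = grp2 + $(NF-1); tot2 = tot2 + 1 }
    END {
        # guard against dividing by zero when a group has no rows
        if (tot1) print "group1 average:", grp1 / tot1
        if (tot2) print "group2 average:", grp2 / tot2
    }')
printf '%s\n' "$out"
```

The row starting with 0003 is skipped because its last field is 1, so group 1 averages 10 and 20, and group 2 averages 30 and 50.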
Regards,
Tayyab
Hi Tayyab,
You are such a professional at this!! Thanks a lot for your very comprehensive explanation! Just another simple question for you. I notice that sometimes you have this "^" character, e.g. in ($1 ~ "^[0-9][0-9]*$" && $NF == 0). What does this ^ mean in the statement?
^ means start of the string if it is specified at the start of the regular expression
$ means end of the string if it is specified at the end of the regular expression
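To see the effect of the two anchors, here is a small sketch that tests a few made-up first fields against an unanchored-at-the-end pattern, an unanchored-at-the-start pattern, and the fully anchored pattern from the thread. The tag names are only for illustration:

```shell
out=$(printf '0001\n0001c\nab12\n' | awk '{
    tags = ""
    if ($1 ~ "^[0-9]")        tags = tags " starts-with-digit"   # ^ anchors at the start
    if ($1 ~ "[0-9]$")        tags = tags " ends-with-digit"     # $ anchors at the end
    if ($1 ~ "^[0-9][0-9]*$") tags = tags " all-digits"          # both: the whole string must match
    print $1 ":" tags
}')
printf '%s\n' "$out"
```

Only 0001 passes the fully anchored test, because with both ^ and $ present every character of the field has to be a digit.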
I am trying to modify and understand an awk written by @Scrutinizer
The below awk will filter a list of 30,000 lines in the tab-delimited file. What I am having trouble with is adding a condition to SVTYPE=CNV
that will only print that line if CI=,0.95: portion in blue in file is <1.9.
The... (2 Replies)
Hello all, I am working on a file like below:
site Date time value1 value2
0023 2014-01-01 00:00 32.0 23.7
0023 2014-01-01 01:00 38.0 29.9
0023 2014-01-01 02:00 85.0 26.6
0023 2014-01-01 03:00 34.0 25.3
0023 2014-01-01 04:00 37.0 23.8
0023 2014-01-01 05:00 80.0 20.3
0023 2014-01-01 06:00... (16 Replies)
Hi Friends,
My input file
Gene1 10 20 0
Gene2 5 0 15
Gene3 10 10 10
Gene4 5 0 0
If there is a zero for any gene in any column, I don't want that column to be considered which reduces the denominator value during average.
Here is my output
Gene1 10 20 0 10
Gene2 5 0 15 10
Gene3... (5 Replies)
Hi Friends,
Hope all is well.
I have an input file like this
a gene1 10
b gene1 2
c gene2 20
c gene3 10
d gene4 5
e gene5 6
Steps to reach output.
1. Print unique values of column1 as column of the matrix, which will be
a
b
c (5 Replies)
I have a file some thing like this:
GN Name=YWHAB;
RC TISSUE=Keratinocyte;
RC TISSUE=Thymus;
CC -!- FUNCTION: Adapter protein implicated in the regulation of a large
CC spectrum of both general and specialized signaling pathways
GN Name=YWHAE;
RC TISSUE=Liver;
RC ... (13 Replies)
Hi,
This may not be the right forum but i am hoping someone knows an answer to this.
I have to capture rows for a column that was deleted. How can i do that without having to write a select query?
delete from myschema.mytable where currentdatetimestamp > columnDate
this should delete 5... (4 Replies)
there are 20 variables and I would like to delete the rows if 13th-20th columns are all NA.
Thank you!
FID IID aspirpre statihos fibrahos ocholhos arbhos betabhos alphbhos cacbhos diurehos numbcig.x toast1 toast2 toast3 toast4 ischoth1 ischoth2 ischoth3 ischoth4
101 101 1 1 1 1 1 2 1 2... (2 Replies)
Hi all,
I have list of two kind of files and I want to compare the rows and print the merged data by applying if condition.
First kind of file looks like:
and second kind of file looks like :
I want to print the rows present in second file followed by 3 more columns from first... (6 Replies)
hi,
i am a new user in unix..and we have unix db2. i want to capture the no. of rows updated by a update db2 sql statement and redirect into a log file.
I've seen db2 -m...but not sure how the syntax should be. The update sql that I'm going to run is from a file...
Can you please share... (1 Reply)