awk script: loop through array


 
# 1  
Old 11-08-2013

I have a large file from which I want to extract data using an awk script; below is a small sample of the input. The script has two conditions: the first collects the start time and the second collects the end time. I store the difference (time = end - start) in an array. However, when I loop through the array to print the results, the difference is not correct. It seems the code is storing the end time in the time array instead of the difference.

Here is the sample:
Code:
+ 0.1 5 0 tcp 40 ------- 1 5.0 2.0 0 0
- 0.1 5 0 tcp 40 ------- 1 5.0 2.0 0 0
r 0.102032 5 0 tcp 40 ------- 1 5.0 2.0 0 0
+ 0.102032 0 1 tcp 40 ------- 1 5.0 2.0 0 0
- 0.102032 0 1 tcp 40 ------- 1 5.0 2.0 0 0
r 0.108096 0 1 tcp 40 ------- 1 5.0 2.0 0 0
+ 0.108096 1 2 tcp 40 ------- 1 5.0 2.0 0 0
- 0.108096 1 2 tcp 40 ------- 1 5.0 2.0 0 0
r 0.110128 1 2 tcp 40 ------- 1 5.0 2.0 0 0
+ 0.110128 2 1 ack 40 ------- 1 2.0 5.0 0 2
- 0.110128 2 1 ack 40 ------- 1 2.0 5.0 0 2
r 0.11216 2 1 ack 40 ------- 1 2.0 5.0 0 2
+ 0.11216 1 0 ack 40 ------- 1 2.0 5.0 0 2
- 0.11216 1 0 ack 40 ------- 1 2.0 5.0 0 2
r 0.118224 1 0 ack 40 ------- 1 2.0 5.0 0 2
+ 0.118224 0 5 ack 40 ------- 1 2.0 5.0 0 2
- 0.118224 0 5 ack 40 ------- 1 2.0 5.0 0 2
r 0.120256 0 5 ack 40 ------- 1 2.0 5.0 0 2

and here is my code:
Code:
BEGIN {
    split("", time)
}

$1 ~ /^\+/ && $10 ~ /^2\..*/ && $4 == 0 && $8 == 1 && $5 == "tcp" && $6 == 40 {
    t_arr[$12] = $2
    print "start time " $2
}

$1 ~ /^r/ && $9 ~ /^2\..*/ && $3 == 0 && $8 == 1 && $5 == "ack" && $6 == 40 {
    time[$12] = $2 - t_arr[$12]
    print "end time " $2
}

END {
    for (i in time) {
        print "time : " time[i]
    }
}

Any suggestions as to what the problem is?
# 2  
Old 11-08-2013
Quote:
Originally Posted by ENG_MOHD
[Original post quoted in full; see post # 1.]

Split usage is wrong. From the GNU awk (gawk) manual:
Code:
split(string, array [, fieldsep [, seps ] ])

Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records; see Regexp Field Splitting). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n] where n is the return value of split() (that is, the number of elements in array).

The split() function splits strings into pieces in a manner similar to the way input lines are split into fields. For example:
Code:
split("cul-de-sac", a, "-", seps)

splits the string 'cul-de-sac' into three fields using '-' as the separator. It sets the contents of the array a as follows:
Code:
 a[1] = "cul"           a[2] = "de"           a[3] = "sac"

and sets the contents of the array seps as follows:
Code:
 seps[1] = "-"           seps[2] = "-"
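
As a small illustration of the manual text above, here is a sketch that uses only the portable three-argument and two-argument forms of split() (the seps argument is a gawk extension and is left out on purpose). The shell wrapper and the names split_demo, a, b, n and m are made up for this example.

```shell
# Sketch: split() with an explicit separator, then with the separator omitted.
split_demo=$(awk 'BEGIN {
    n = split("cul-de-sac", a, "-")       # explicit separator "-"
    print n, a[1], a[2], a[3]
    m = split("  one two   three ", b)    # separator omitted, so FS (default: a single space) is used
    print m, b[1], b[2], b[3]
}')
printf '%s\n' "$split_demo"
```

With the default FS, leading and trailing blanks are ignored and runs of blanks count as one separator, which is why the second call also yields three pieces.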

# 3  
Old 11-08-2013
Output for me, applying your code to your sample:
Code:
start time 0.1
end time 0.120256
time : 0.120256

Your patterns are very restrictive! In your sample, only line 1 satisfies pattern 1, and only line 18 satisfies pattern 2. Because both the time and t_arr arrays are indexed by $12, which is 0 on line 1 and 2 on line 18, no difference can be calculated: the subtrahend t_arr[2] was never assigned, so it is undefined and evaluates to 0.
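
The effect described above can be reproduced in isolation. This is only a sketch with the two time values from the sample hard-coded; the shell wrapper and the names key_demo and end are made up for the illustration.

```shell
# Sketch: an array element that was never assigned counts as 0 in arithmetic.
key_demo=$(awk 'BEGIN {
    t_arr["0"] = 0.1       # start time, stored under key 0 (the $12 of sample line 1)
    end = 0.120256         # end time, found on a line whose $12 is 2
    printf "mismatched key: %.6f\n", end - t_arr["2"]   # t_arr["2"] is unset, so this is end - 0
    printf "matching key: %.6f\n", end - t_arr["0"]
}')
printf '%s\n' "$key_demo"
```

The first line prints the end time unchanged, which is exactly the symptom reported in post # 1.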
# 4  
Old 11-08-2013
Quote:
Originally Posted by Akshay Hegde
Split usage is wrong
[GNU awk manual excerpt on split() quoted in full; see post # 2.]

The use of split() shown here isn't "wrong"; it just doesn't contribute anything to this script. If the third argument to split() is omitted, FS is used as the field separator (an extended regular expression). Calling split(string, array[, separator]) deletes any existing elements of array and then splits string into array, with fields delimited by the ERE given by separator (or by FS when separator is omitted). When string is an empty string, the only thing the call does is remove any existing elements of array. In the supplied script, however, this is done in a BEGIN {action} clause, and all arrays are already empty when the program starts. So the entire BEGIN clause can be removed from this program with absolutely no effect on the output produced (other than allowing the script to run slightly faster).
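
A quick way to check this for yourself, as a sketch: count the elements of an array at startup, after two assignments, and after split("", time). The shell wrapper and the names clear_demo, n and k are made up for the illustration.

```shell
# Sketch: arrays start out empty, and split("", array) empties an array again.
clear_demo=$(awk 'BEGIN {
    n = 0; for (k in time) n++            # nothing has been stored yet
    print "elements at startup: " n
    time["a"] = 1; time["b"] = 2
    n = 0; for (k in time) n++
    print "after two assignments: " n
    split("", time)                       # empty string: no pieces, old elements removed
    n = 0; for (k in time) n++
    print "after split with empty string: " n
}')
printf '%s\n' "$clear_demo"
```

So split("", time) is a portable way to clear an array, but in a BEGIN clause there is nothing to clear yet.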
# 5  
Old 11-08-2013
Thank you, Don Cragun, and my apologies for calling it wrong. I was a bit confused about the BEGIN block; I thought the poster might not know how split() is used and had simply put it there. I also noticed that it has no effect on the program. Sorry once again.
# 6  
Old 11-08-2013
There is no reason to apologize for this! We're all here to learn.

It just looked to me like neither you nor the original submitter understood what this call to split() was doing. I only hoped that my note on this topic would help both of you (and anyone else reading this thread who might have been confused by that call) to better understand how awk works.
# 7  
Old 11-08-2013
Thanks for the replies.

@RudiC, I created the sample so that each condition matches only one line. In the first condition I was trying to take the time and store it in t_arr[], and in the second condition I was trying to subtract the time recorded in t_arr[] from the current time (which is $2 of the second condition).
Yes, you are right, I made a mistake: t_arr[$12] will be zero.

How do I get output like this:
Code:
start time 0.1
end time 0.120256
time : 0.020256

Maybe if I use three arrays?
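
One possible way to get that output with the same two arrays, sketched under an assumption the thread does not confirm: that field 11 holds a value shared by a tcp packet and its ack (it is 0 on both matching lines of the sample), so it can serve as the common key where $12 cannot. The shell wrapper, the variable name result and the "in" guard are additions for this illustration only.

```shell
# Sketch only: keying on $11 ASSUMES that field is shared by a tcp packet and its ack.
result=$(awk '
$1 ~ /^\+/ && $10 ~ /^2\./ && $4 == 0 && $8 == 1 && $5 == "tcp" && $6 == 40 {
    t_arr[$11] = $2                  # remember the start time under the shared key
    print "start time " $2
}
$1 ~ /^r/ && $9 ~ /^2\./ && $3 == 0 && $8 == 1 && $5 == "ack" && $6 == 40 {
    print "end time " $2
    if ($11 in t_arr)                # skip acks that have no recorded start time
        time[$11] = $2 - t_arr[$11]
}
END {
    for (i in time)
        printf "time : %.6f\n", time[i]
}' <<'EOF'
+ 0.1 5 0 tcp 40 ------- 1 5.0 2.0 0 0
- 0.1 5 0 tcp 40 ------- 1 5.0 2.0 0 0
r 0.102032 5 0 tcp 40 ------- 1 5.0 2.0 0 0
+ 0.102032 0 1 tcp 40 ------- 1 5.0 2.0 0 0
- 0.102032 0 1 tcp 40 ------- 1 5.0 2.0 0 0
r 0.108096 0 1 tcp 40 ------- 1 5.0 2.0 0 0
+ 0.108096 1 2 tcp 40 ------- 1 5.0 2.0 0 0
- 0.108096 1 2 tcp 40 ------- 1 5.0 2.0 0 0
r 0.110128 1 2 tcp 40 ------- 1 5.0 2.0 0 0
+ 0.110128 2 1 ack 40 ------- 1 2.0 5.0 0 2
- 0.110128 2 1 ack 40 ------- 1 2.0 5.0 0 2
r 0.11216 2 1 ack 40 ------- 1 2.0 5.0 0 2
+ 0.11216 1 0 ack 40 ------- 1 2.0 5.0 0 2
- 0.11216 1 0 ack 40 ------- 1 2.0 5.0 0 2
r 0.118224 1 0 ack 40 ------- 1 2.0 5.0 0 2
+ 0.118224 0 5 ack 40 ------- 1 2.0 5.0 0 2
- 0.118224 0 5 ack 40 ------- 1 2.0 5.0 0 2
r 0.120256 0 5 ack 40 ------- 1 2.0 5.0 0 2
EOF
)
printf '%s\n' "$result"
```

If field 11 does not really pair packets with their acks in the full file, another field (or a combination of fields) that appears on both the start and the end line would have to be used as the key instead; a third array should not be needed either way.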