I have a large file from which I want to extract data with an awk script, and I have made a small sample of the input data. The script has two conditions: the first collects the start time and the second collects the end time. I store the difference (Time = end - start) in an array. However, when I loop through the array to print the results, the difference is not correct. It seems the code is storing the end time in the Time array instead of the difference.
Here is the sample:
And here is my code:
Any suggestions as to what the problem is?
The split() usage is wrong.
From the GNU awk manual:
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records; see Regexp Field Splitting). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n] where n is the return value of split() (that is, the number of elements in array).
The split() function splits strings into pieces in a manner similar to the way input lines are split into fields. For example:
splits the string 'cul-de-sac' into three fields using '-' as the separator. It sets the contents of the array a as follows:
and sets the contents of the array seps as follows:
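The manual's listings did not survive in this post. As a minimal sketch of what the quoted passage describes, the snippet below splits the same string and prints the resulting pieces. It uses the portable three-argument form of split(); the fourth seps argument is left out because it is a gawk extension. The function name split_demo is only an illustrative wrapper, not part of the manual.

```sh
split_demo() {
  awk 'BEGIN {
    # split() returns the number of pieces and stores them in a[1]..a[n]
    n = split("cul-de-sac", a, "-")
    print "n=" n
    for (i = 1; i <= n; i++)
      print "a[" i "]=" a[i]
  }'
}

split_demo
```

With gawk, passing a fourth argument would additionally record the "-" separators found between consecutive pieces.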
Output for me, applying your code to your sample:
Your patterns are very restrictive! In your sample, only line 1 satisfies pattern 1, and only line 18 satisfies pattern 2. Both the time and t_arr arrays are indexed by $12, which is 0 for line 1 and 2 for line 18, so no difference can be calculated: the subtrahend was never set under that index, and an undefined element evaluates to 0.
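The effect can be reproduced with a small sketch. The two-line input here is hypothetical (event in $1, time in $2, key in $3) and stands in for the original sample, which uses $12 as the key; only the array names t_arr and time are taken from the thread.

```sh
unset_key_demo() {
  # Hypothetical input: the start record has key 0, the end record has key 2
  printf 'start 100 0\nend 250 2\n' |
  awk '
    $1 == "start" { t_arr[$3] = $2 }             # start time stored under key 0
    $1 == "end"   { time[$3] = $2 - t_arr[$3] }  # t_arr[2] was never set, so it counts as 0
    END { for (k in time) print k, time[k] }
  '
}

unset_key_demo
```

This prints the end time (250) rather than the elapsed time (150), which matches the symptom described in the question.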
The use of split() shown here isn't "wrong"; it just doesn't contribute anything to this script. If the third argument to split() is omitted, FS is used as the default field separator extended regular expression (ERE). Calling split(string, array[, separator]) destroys any existing array named array and then splits string into array, with fields delimited by the ERE given by separator (or by the default). When string is an empty string, the only thing the call does is remove any existing contents of array. In the supplied script this is done in a BEGIN clause, but all arrays are already empty when the program starts. So the entire BEGIN clause can be removed from this program with absolutely no effect on the output produced (other than allowing the script to run slightly faster).
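A short sketch of that behavior, assuming any POSIX awk: an array is filled, then split() is called with an empty string, and the element count is printed before and after. The helper count() and the wrapper clear_array_demo are hypothetical names used only for this illustration.

```sh
clear_array_demo() {
  awk '
    # Hypothetical helper: count the elements currently in arr
    function count(arr,    k, n) { n = 0; for (k in arr) n++; return n }
    BEGIN {
      split("a b c", arr)            # default separator FS: three elements
      print "before:", count(arr)
      split("", arr)                 # empty string: arr ends up with no elements
      print "after:", count(arr)
    }'
}

clear_array_demo
```

Doing this once in BEGIN, before anything has been stored, leaves the array exactly as it already was.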
Thank you, Don Cragun, and my apologies for calling it wrong. I was a bit confused about the BEGIN block and thought the original poster might not know how split() is used and had simply put it in. I also noticed that it has no effect in the program. Sorry once again.
There is no reason to apologize for this! We're all here to learn.
It just looked to me as though neither you nor the original submitter understood what this call to split() was doing. I only hoped that my note on this topic would help both of you (and anyone else reading this thread who might have been confused by that call) to better understand how awk works.
@RudiC, I created the sample so that each condition matches only one line. In the first condition I was trying to take the time and store it in t_arr[], and in the second condition I was trying to subtract the time recorded in t_arr[] from the current time (which is $2 in the second condition).
Yes, you are right, I made a mistake: t_arr[$12] will be zero.
How do I get output like:
Maybe if I use 3 arrays?
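One possible direction, sketched under loud assumptions: if each start record and its matching end record share some identifier, two arrays indexed by that identifier are enough. The input layout below (event in $1, time in $2, shared id in $3) is hypothetical and is not the layout of the original sample, so the field numbers and patterns would need adapting.

```sh
elapsed_demo() {
  # Hypothetical input: each start/end pair shares the id in $3
  printf 'start 100 A\nstart 120 B\nend 250 A\nend 300 B\n' |
  awk '
    $1 == "start" { t_arr[$3] = $2 }                             # remember start time per id
    $1 == "end" && ($3 in t_arr) { time[$3] = $2 - t_arr[$3] }   # only subtract a known start
    END { for (id in time) print id, time[id] }
  ' | sort
}

elapsed_demo
```

The `($3 in t_arr)` test skips end records that have no recorded start, instead of silently subtracting 0. The final sort is there only because awk does not guarantee the order of `for (id in time)`.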