but i would like to learn the master code of your Mr Don..eager to learn
That is a laudable attitude.
Quote:
Originally Posted by Don Cragun
First, to appreciate what each part of the above command (actually a pipeline of three different commands) does, you might want to redirect the output of one step into a file, examine it, and then run that file through the next step to see what that step does. I suggest you use a small input file so that it is easy to survey the output and notice any changes. You can even use several slightly altered versions of an input file to see how each change affects the outcome.
In short: these are only files, which you can copy as often as you like, so play around.
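To illustrate the step-by-step approach, here is a minimal sketch. The sample data, the file names and the two simple commands are made up for this example and are not Don's code; the point is only how to save the output of one stage and feed it to the next.

```shell
# Work in a scratch directory so nothing of value can be overwritten.
tmp=$(mktemp -d)

# Hypothetical small input file: a key and a number per line.
printf '%s\n' 'b 2' 'a 1' 'b 3' > "$tmp/small_input.txt"

# Run only the first stage and keep its output for inspection.
awk '{ print $2, $1 }' "$tmp/small_input.txt" > "$tmp/step1.txt"
cat "$tmp/step1.txt"

# Feed the saved output into the next stage and look again.
sort -k1,1n "$tmp/step1.txt" > "$tmp/step2.txt"
cat "$tmp/step2.txt"
```

Comparing step1.txt with step2.txt shows exactly what the second command contributed.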
Let us start with the last part. "sed" is a non-interactive text editor. It gets a script containing the changes it should make to a text file and then makes these changes. Here, the script contains two change rules:
These are "substitution" rules: they search for the pattern in the first part and replace it with what is in the last part:
Usually "/" is used as the delimiter, but as Don wanted to replace something with "/" he couldn't use it as the delimiter, so he went for "#". He replaces a space character with a "/". The rule is there twice because by default each rule only substitutes the first occurrence on a line, and he wanted to change the first two.
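As a sketch of how such a pair of rules behaves (the input line is hypothetical, and the two rules are reconstructed from the description above rather than copied from Don's command):

```shell
# Hypothetical line: month, day, year and a key, separated by spaces.
line='3 14 2015 some key'

# Two identical rules with "#" as delimiter. Each one replaces only the
# first space that is still present on the line with a "/".
result=$(printf '%s\n' "$line" | sed -e 's# #/#' -e 's# #/#')
printf '%s\n' "$result"
```

After the first rule the line reads "3/14 2015 some key", after the second "3/14/2015 some key". The spaces further to the right are left alone.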
This sorts the output. I suggest you read the man pages of all the commands used, but the man page of this one will explain the most: he constructs a sorting key for the date. As the date format is "M/D/Y", he first sorts on the year (field 3), then on the month (field 1), then on the day (field 2). Only then does he sort on field 4. All key parts but the last are sorted numerically.
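A sort call along these lines might look as follows. This is a sketch under the assumption that month, day and year are still separated by spaces at this stage (the "/" only appears in the later sed step); the records are invented.

```shell
# Hypothetical records: month day year key, separated by spaces.
sorted=$(printf '%s\n' \
    '12 1 2014 beta' \
    '2 10 2015 alpha' \
    '2 3 2015 gamma' \
    '2 3 2015 alpha' |
  sort -k3,3n -k1,1n -k2,2n -k4)
printf '%s\n' "$sorted"
```

"-k3,3n" restricts the first key to field 3 and compares it numerically, so day 3 sorts before day 10. "-k4" without an end field compares everything from field 4 to the end of the line as text.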
Finally, the core piece: a really elegant awk script, which consists of three parts.
awk processes input files line by line. The middle part:
is what is executed for every line of the input file. It adds the content of the field over which to sum to a record in an associative array with the key value(s) as array index. This way lines with identical key values get summed automatically.
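The summing idiom can be sketched like this. The array name "sum", the field numbers and the input are assumptions made for the example, not necessarily what Don used.

```shell
# Hypothetical input: key in field 1, amount in field 2.
# The first awk block runs for every line and adds field 2 to the
# array slot named by field 1. The END block is only there to show
# one of the collected totals.
total_a=$(printf '%s\n' 'a 10' 'b 5' 'a 7' |
  awk '{ sum[$1] += $2 }
       END { print sum["a"] }')
printf '%s\n' "$total_a"
```

Both lines with key "a" land in the same array slot, so the value printed is 10 + 7 = 17.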
The first part:
is executed once before the first line of the input file is read. It sets up the "Field Separator" (FS), the "Output Field Separator" (OFS) and a variable "s", which holds a single space. When you use "$1" (field 1) or "$2" (field 2) in an "awk" script, awk has to be told how to separate "field 1" from "field 2". It does so by splitting the input line at the field separator. By default fields are separated by runs of spaces and tabs, but Don redefines the separator here so that a "field" is what you said it should be.
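A small sketch of such a setup block follows. The separator characters chosen here ("," for input, "|" for output) and the sample line are placeholders for illustration; use whatever matches your data.

```shell
# BEGIN runs once, before any input is read.
# FS  = how input lines are split into $1, $2, ...
# OFS = what "print" puts between its comma-separated arguments
# s   = a plain variable holding one space
fields=$(printf '%s\n' '3/14/2015,some key,42' |
  awk 'BEGIN { FS = ","; OFS = "|"; s = " " }
       { print $1, $2 s $3 }')
printf '%s\n' "$fields"
```

Because FS is ",", the space inside "some key" no longer splits the field. The comma in the print statement inserts OFS, while "$2 s $3" simply concatenates the two fields with a space in between.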
The last part:
is executed once after the last line of the input has been processed. Here it is a simple for-loop which outputs, in a formatted way, the associative array that was collected in the middle part.
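Such an output loop might be sketched as below, again with an assumed array name and invented input:

```shell
# Collect sums per key, then print every array entry once all input is read.
totals=$(printf '%s\n' 'a 10' 'b 5' 'a 7' |
  awk '{ sum[$1] += $2 }
       END { for (k in sum) printf "%s %d\n", k, sum[k] }' |
  sort)
printf '%s\n' "$totals"
```

awk does not guarantee any particular order for "for (k in sum)", which is one reason to pipe the result through sort afterwards, as the pipeline discussed here does.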
Hi,
I am new to this forum and new to awk.
I have a file that contains 2 columns.
Heres an example of what it looks like:
10 +
20 +
40 +
50 -
70 -
So the file is tab-delimited. What I want to do is add 10 to column 1 whenever column 2 is + and substract 10 from column 1... (1 Reply)
i have a file - it will be in sorted order on column 1
abc 0 1
abc 2 3
abc 3 5
def 1 7
def 0 1
--------
i'd like (awk maybe?) to get the results (any ideas)???
abc 5 9
def 1 8 (2 Replies)
Hi,
I have below as i/p file:
5ABC 36488989 K 000010000ASB BYTRES
5PQR 45757754 K 000200005KPC HGTRET
5ABC 36488989 K 000045000ASB HGTRET
5GTH 36488989 K 000200200ASB BYTRES
5FTU ... (2 Replies)
I have a following inputfile
MT,AP,CDM,TTML,MUM,GS,SUCC,3
MT,AP,CDM,TTSL,AP,GS,FAIL,9
MT,AP,CDM,RCom,MAH,GS,SUCC,3
MT,AP,CDM,RTL,HP,GS,SUCC,1
MT,AP,CDM,Uni,UPE,GS,SUCC,2
MT,AP,CDM,Uni,MUM,GS,SUCC,2
TTSL,AP,GS,MT,MAH,CDM,SUCC,20
TTML,AP,GS,MT,MAH,CDM,FAIL,10... (2 Replies)
i have file input
aaa ccc,45567,rterw,1
bbb dcs,564543,hjghgh,1
aaa ccc,454,rterw,6
i want to sum based on column 1
expected output
aaa ccc,7
bbb dcs,1 (4 Replies)
Hi,
I have a similar input format-
A_1 2
B_0 4
A_1 1
B_2 5
A_4 1
and looking to print in this output format with headers. can you suggest in awk?awk because i am doing some pattern matching from parent file to print column 1 of my input using awk already.Thanks!
letter number_of_letters... (5 Replies)
Hi,
I have a table to be imported for R as matrix or data.frame but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum the each column (2nd -nth) of each identifier (1st column)
The input is for example, after sorted:
K00001 1 1 4 3... (8 Replies)
Hi All,
I have a requirement where I need to find sum of values from column D through O present in a CSV file and check whether the sum of each Individual column matches with the value present for that corresponding column present in the trailer record.
For example, let's assume for column D... (9 Replies)
Hello,
I am trying to store sum of a column as a new column inside a file but have to find the column names dynamically
I/p
c1,c2,c3,c4,c5
10,20,30,40,50
20,30,40,50,60
If i want to find sum only column c1, c3 and output it as c6,c7
O/p
c1,c2,c3,c4,c5,c6,c7
10,20,30,40,50,30,70... (6 Replies)
Hi All,
I have a file as below and want to sum based on the id in the first column
Input
10264;ATE; 12
10265;SES;11
10266AUT;50
10264;ATE;10
10265;SES;13
10266AUT;89
10264;ATE;1
10265;SES;15
10266AUT;78
Output
10264;ATE; 23
10265;SES;39
10266AUT;139 (6 Replies)