Awk/sed summation of one column based on some entry in first column


Login or Register for Dates, Times and to Reply

 
Thread Tools Search this Thread
Top Forums UNIX for Beginners Questions & Answers Awk/sed summation of one column based on some entry in first column
# 1  
Awk/sed summation of one column based on some entry in first column

Hi All ,

I am having an input file as stated below

Input file

Code:
6  ddk/djhdj/djhdj/Q  10 0.5 
    dhd/jdjd.djd.nd/QB 01 0.5 
    hdhd/jd/jd/jdj/Q  10 0.5
512 hd/hdh/gdh/Q 01 0.5
      jdjd/jd/ud/j/QB 10 0.5 
      HD/jsj/djd/Q  01 0.5 
71 hdh/jjd/dj/jd/Q  10 0.5
    jd/jdld/je/j/QB 01 0.5
    IDP/jd/jdd/Q 10 0.5 
    1K/JDJ/JDJK/QL 01 0.5

I need to take the summation of the 4th column and it should start based on the first column and print at the 5th column at every entry of first column .

My output file

Code:
6  ddk/djhdj/djhdj/Q  10 0.5  1.5
    dhd/jdjd.djd.nd/QB 01 0.5 
    hdhd/jd/jd/jdj/Q  10 0.5
512 hd/hdh/gdh/Q 01 0.5  1.5
      jdjd/jd/ud/j/QB 10 0.5 
      HD/jsj/djd/Q  01 0.5 
71 hdh/jjd/dj/jd/Q  10 0.5 2.0
    jd/jdld/je/j/QB 01 0.5
    IDP/jd/jdd/Q 10 0.5 
    1K/JDJ/JDJK/QL 01 0.5

I tried something like this but not able to extract the right values
Code:
awk '
BEGIN { FS=OFS="\t" }
NR==FNR { s[$1]+=$4; next }
{ print $0,s[$1] }
' Final.tran.map.pattern5

Please let me know.
# 2  
Try
Code:
tac file | awk -F" +" '{SUM += $4} $1 {$5 = SUM; SUM = 0} 1' | tac

# 3  
Thanks a lot Rudic ,

Could you give a brief explanation of this code, I am not able to get why we are selecting the field separator as " +"
?

Thanks and Regards
Kshitij Kulshreshtha

Last edited by RavinderSingh13; 12-07-2019 at 07:37 AM..
# 4  
Hello kshitij,

Could you please try following.

Code:
awk '
FNR==NR{
  if($0~/^[0-9]+/){
     ++count
  }
  sum[count]+=$NF
  next
}
/^[0-9]+/{
  print $0,sum[++var]
  next
}
1
'  Input_file  Input_file

Output will be as follows.

Code:
6  ddk/djhdj/djhdj/Q  10 0.5  1.5
    dhd/jdjd.djd.nd/QB 01 0.5
    hdhd/jd/jd/jdj/Q  10 0.5
512 hd/hdh/gdh/Q 01 0.5 1.5
      jdjd/jd/ud/j/QB 10 0.5
      HD/jsj/djd/Q  01 0.5
71 hdh/jjd/dj/jd/Q  10 0.5 2
    jd/jdld/je/j/QB 01 0.5
    IDP/jd/jdd/Q 10 0.5
    1K/JDJ/JDJK/QL 01 0.5

Thanks,
R. Singh
This User Gave Thanks to RavinderSingh13 For This Post:
# 5  
Your input file is not too consistent in its usage of spaces as field separators. It has one or two between fields, one up to six from BOL to $2 if $1 is missing, and zero or one at EOL. That's what " +" is for: it stands for "one or many" spaces. c.f. man regex for further reference.
This User Gave Thanks to RudiC For This Post:
# 6  
Quote:
Originally Posted by RavinderSingh13
Hello kshitij,
Could you please try following.
Code:
awk '
FNR==NR{
  if($0~/^[0-9]+/){
     ++count
  }
  sum[count]+=$NF
  next
}
/^[0-9]+/{
  print $0,sum[++var]
  next
}
1
'  Input_file  Input_file

Output will be as follows.

Code:
6  ddk/djhdj/djhdj/Q  10 0.5  1.5
    dhd/jdjd.djd.nd/QB 01 0.5
    hdhd/jd/jd/jdj/Q  10 0.5
512 hd/hdh/gdh/Q 01 0.5 1.5
      jdjd/jd/ud/j/QB 10 0.5
      HD/jsj/djd/Q  01 0.5
71 hdh/jjd/dj/jd/Q  10 0.5 2
    jd/jdld/je/j/QB 01 0.5
    IDP/jd/jdd/Q 10 0.5
    1K/JDJ/JDJK/QL 01 0.5

Thanks,
R. Singh
Hello kshitij,

Adding a detailed level explanation for my code.

Code:
awk '                            ##Starting awk program from here.
FNR==NR{                         ##Checking condition FNR==NR which will be TRUE when Input_file is being read first time.
  if($0~/^[0-9]+/){              ##Checking condition if a line starts from digit then do following.
     ++count                     ##Increment variable count with 1, each time cursor comes here.
  }                              ##Closing BLOCK for if condition here.
  sum[count]+=$NF                ##Creating an array named sum with index of count and keep on adding value of $NF to its own value.
  next                           ##next will skip all further statements from here.
}                                ##Closing BLOCK for condition FNR==NR here.
/^[0-9]+/{                       ##Checking condition if a line starts from digit then do following.
  print $0,sum[++var]            ##Printing current line and array sum with index of variable var with its increment of 1 each time cursor comes here.
  next                           ##next will skip all further statements from here.
}                                ##Closing BLOCK for /^[0-9]+/ condition here.
1                                ##Mentioning 1 for printing edited/non-edited line here.
'  Input_file  Input_file        ##Mentioning Input_file 2 times here.

Thanks,
R. Singh
Login or Register for Dates, Times and to Reply

Previous Thread | Next Thread
Thread Tools Search this Thread
Search this Thread:
Advanced Search

Test Your Knowledge in Computers #890
Difficulty: Easy
In sed, the REGEX dot (.) matches exactly one character.
True or False?

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

sed command to add a new column entry

My input file looks like this 12 3 5.122.281.413 172.31.15.220 3421 4133 2 2 1454 3421 4133 2 2 0 12 44036 214.215.52.146 90.123.245.211 2312 3911 4 4 521 2312 3911 4 4 1 14 504 6.254.324.219 192.61.27.120 4444 5611 7 5 1415 4444 5611 7 5 1 ... (2 Replies)
Discussion started by: sampitosh
2 Replies

2. Shell Programming and Scripting

Solution for replacement of 4th column with 3rd column in a file using awk/sed preserving delimters

input "A","B","C,D","E","F" "S","T","U,V","W","X" "AA","BB","CC,DD","EEEE","FFF" required output: "A","B","C,D","C,D","F" "S", T","U,V","U,V","X" "AA","BB","CC,DD","CC,DD","FFF" tried using awk but double quotes not preserving for every field. any help to solve this is much... (5 Replies)
Discussion started by: khblts
5 Replies

3. Shell Programming and Scripting

awk script to call another script based on second column entry

Hi I have a text file (Input.txt) with two column entries separated by tab as given below: aaa str1 bbb str2 cccccc str3 dddd str4 eee str3 ssss str2 sdf str3 hhh str1 fff str2 ccc str3 ..... ..... ..... (1 Reply)
Discussion started by: my_Perl
1 Replies

4. Shell Programming and Scripting

awk to sum a column based on duplicate strings in another column and show split totals

Hi, I have a similar input format- A_1 2 B_0 4 A_1 1 B_2 5 A_4 1 and looking to print in this output format with headers. can you suggest in awk?awk because i am doing some pattern matching from parent file to print column 1 of my input using awk already.Thanks! letter number_of_letters... (5 Replies)
Discussion started by: prashob123
5 Replies

5. Shell Programming and Scripting

Pick the column value based on another column using awk or CUT

My scenario is that I need to pick value from third column based on fourth column value, if fourth column value is 1 then first value of third column.Third column (2|3|4|6|1) values are cancatenated. Please someone help me to resolve this issue. Source column1 column2 column3 column4... (2 Replies)
Discussion started by: Ganesh L
2 Replies

6. Shell Programming and Scripting

awk or sed: change the color of a column w/o screwing up column spacing

Hey folks. I wrote a little awk script that summarizes /proc/net/dev info and then pipes it to the nix column command to set up column spacing appropriately. Here's some example output: Iface RxMBytes RxPackets RxErrs RxDrop TxMBytes TxPackets TxErrs TxDrop bond0 9 83830... (3 Replies)
Discussion started by: ryran
3 Replies

7. Shell Programming and Scripting

Rename a header column by adding another column entry to the header column name URGENT!!

Hi All, I have a file example.csv which looks like this GrpID,TargetID,Signal,Avg_Num CSCH74_1_1,2007,61,256 CSCH74_1_1,212007,647,679 CSCH74_1_1,12007,3,32 CSCH74_1_1,207,299,777 I want the output as GrpID,TragetID,Signal-CSCH74_1_1,Avg_Num CSCH74_1_1,2007,61,256... (4 Replies)
Discussion started by: Vavad
4 Replies

8. Shell Programming and Scripting

for each different entry in column 1 extract maximum values from column 2 in unix/awk

Hello, I have 2 columns (1st column has multiple entries but the corresponding values in the column 2 may be the same or different.) however I want to extract unique values for each entry in column 1 by assigning the max value from column 2 SDF4 -0.211654 SDF4 0.978068 ... (1 Reply)
Discussion started by: Diya123
1 Replies

9. Shell Programming and Scripting

Awk or Sed, fubd match in column, then edit column.

FILE A: 9780743551526,(Abridged) 9780743551779,(Unabridged) 9780743582469,(Abridged) 9780743582483,(Unabridged) 9780743563468,(Abridged) 9780743563475,(Unabridged) FILE B: c3saCandyland 9780743518321 "CANDYLAND" "MCBAIN, ED" 2001 c3sbCandyland 9780743518321 ... (7 Replies)
Discussion started by: glev2005
7 Replies

10. Shell Programming and Scripting

awk/sed column replace using column header - help

$ cat log.txt Name Age Sex Lcation nfld alias xsd CC 25 M XYZ asx KK Y BB 21 F XAS awe SS N SD 21 M AQW rty SD A How can I replace the column with header "Lcation" with the column with header "alias" and delete the "alias" column? so that the final output will become: Name Age Sex... (10 Replies)
Discussion started by: jkl_jkl
10 Replies

Featured Tech Videos