Do replace operation and awk to sum multiple columns if another column has duplicate values


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Do replace operation and awk to sum multiple columns if another column has duplicate values
# 1  
Old 05-04-2018
Do replace operation and awk to sum multiple columns if another column has duplicate values

Hi Experts,
Please bear with me, i need help
I am learning AWk and stuck up in one issue.
First point : I want to sum up column value for column 7, 9, 11,13 and column15 if rows in column 5 are duplicates.No action to be taken for rows where value in column 5 is unique.
Second point : For duplicate rows, need to make them as single row in output file.
Third point : For duplicate rows, want to put all value of column 2 other from input file in column 16 in format (IN272018000235^IN27201523963) in output file other than value which is stored(IN27201800023963) in column 2 in output file.

Ex - In output file, please see column 16 it should have value as IN272018000235^IN27201523963(taken from column2) only from first three rows(duplicate) .Here we have not taken 3rd value(IN27201800023963) in column 16 of output file from column 2 of input file bcz we have already stored in column 2(IN27201800023963) of output file.

Have record of thousand of line..so duplicate rows are not limited to three, there can be 40, 50 any number of duplicate rows. So accordingly need to have value in column 57.

Please let me know if still my query is not clear

Input File

Code:
a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p
    POS|IN27201800023963|2018-04-24||27AACCE5198E1ZJ||500|0||9|45|9|45|||
    POS|IN272018000235|2018-04-24||27AACCE5198E1ZJ||500|0||9|45|9|45|||
    POS|IN27201523963|2018-04-24||27AACCE5198E1ZJ||500|0||9|45|9|45|||
    POS|IN27201800022938|2018-04-05||27AAJFH2012G1ZS||2|4||6||7|8|||

Expected Output File

Code:
   a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p
    POS|IN27201800023963|2018-04-24||27AACCE5198E1ZJ||1500|0||9|135|9|135|||IN272018000235^IN27201523963|
    POS|IN27201800022938|2018-04-05||27AAJFH2012G1ZS||2|4||6||7|8|||


Below is code tried to sum up value but giving output as

Code:
awk 'BEGIN{ FS = OFS = "|" }
       NR > 1{
           if ($5 == f5) {
               $7 += f7; $9 += f9; $11 += f11; $13 += f13;
               $15 += f15; $16 = $16 f2"^"$2
           } else { print rec }
       }
       {
           rec = $0; f5 = $5; f7 = $7; f9 = $9;
           f11 = $11; f13 = $13; f15 = $15; f2 = $2
       }
       END{ print rec }' input.txt

Code output: Not correct because getting column 2 value "IN27201523963" also in column 16 instead it should be like "IN272018000235^IN27201800023963"
The value which is present in column13 should not be present in column16.

Code:
POS|IN27201523963|2018-04-24||27AACCE5198E1ZJ||1500|0|0|9|135|9|135||0|IN272018000235^IN27201523963
POS|IN27201800022938|2018-04-05||27AAJFH2012G1ZS||2|4||6||7|8|||


Last edited by as7951; 05-05-2018 at 07:27 AM.. Reason: Edited to put code i have prepared and error i am getting in output file
# 2  
Old 05-04-2018
In what way is it 'not working'? Some output/errors would be useful to see.


Thanks, in advance,
Robin
# 3  
Old 05-04-2018
Hi rbatte1,

The code is not complete and giving error "Unmatched '." .
# 4  
Old 05-04-2018
There was an unmatched '.
Code:
awk 'BEGIN {FS=OFS="|"} FNR==1 {a[$1]+= (f4[key]+=$4;f6[key]+=$6;f7[key]+=$7;f8[key]+=$8;f10[key]+=$10;f11[key]+=$11;)}' input.txt > output.txt

Try it now?
# 5  
Old 05-04-2018
I don't think it's worth trying, as there are quite some errors in that one-liner. Amongst others: the a[$1] assignment is missing (unless you mean the parenthesized part, which a) is an error, b) returns nothing), key is undefined, all the arithmetics is done for the first line ONLY, no output is printed.
# 6  
Old 05-04-2018
Is this a homework assignment? Homework and coursework questions can only be posted in the Homework & Coursework forum under special homework rules.

Please review the rules, which you agreed to when you registered, if you have not already done so.

If you did not post homework, please explain the source of the data you have shown us and explain why you are trying to do this. The input you have shown us does not match the description you have provided, and the output you say you want from the sample input you provided cannot be derived from the instructions you have provided. If this is not homework, please also explain where you found the start of the awk script that you have shown us, and why you are unable to complete it. (Adding a missing single-quote as noted in the diagnostic you mentioned in post #3 in this thread should not be hard for most awk or shell script writers.)

If you did post homework in the main forums, please review the guidelines for posting homework and repost.
# 7  
Old 05-05-2018
Hi Rudic,
I am confused and not able to complete the code for above query.
can you please help.

---------- Post updated at 10:59 PM ---------- Previous update was at 10:57 PM ----------

Hi Don,
it's not a home work assignment,
i am little confused here and not able to resolve above query, as it is seems complex to me
, i need your help in this.
I write multiple code but none is working.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Copy columns from one file into another and get sum of column values and row count

I have a file abc.csv, from which I need column 24(PurchaseOrder_TotalCost) to get the sum_of_amounts with date and row count into another file say output.csv abc.csv- UTF-8,,,,,,,,,,,,,,,,,,,,,,,,, ... (6 Replies)
Discussion started by: Tahir_M
6 Replies

2. Shell Programming and Scripting

awk to Sum columns when other column has duplicates and append one column value to another with Care

Hi Experts, Please bear with me, i need help I am learning AWk and stuck up in one issue. First point : I want to sum up column value for column 7, 9, 11,13 and column15 if rows in column 5 are duplicates.No action to be taken for rows where value in column 5 is unique. Second point : For... (1 Reply)
Discussion started by: as7951
1 Replies

3. Shell Programming and Scripting

Sum duplicate values in text file through awk between dates

I need to sum values in text file in case duplicate row are present with same name and different value below is example of data in file i have and format i need. Data in text file 20170308 PM,U,2 PM,U,113 PM,I,123 DA,U,135 DA,I,113 DA,I,1 20170309 PM,U,2 PM,U,1 PM,I,123 PM,I,1... (3 Replies)
Discussion started by: Adfire
3 Replies

4. Shell Programming and Scripting

Sum values of specific column in multiple files, considering ranges defined in another file

I have a file (let say file B) like this: File B: A1 3 5 A1 7 9 A2 2 5 A3 1 3 The first column defines a filename and the other two define a range in that specific file. In the same directory, I have also three more files (File A1, A2 and A3). Here is 10 sample lines... (3 Replies)
Discussion started by: Bastami
3 Replies

5. Shell Programming and Scripting

awk to sum a column based on duplicate strings in another column and show split totals

Hi, I have a similar input format- A_1 2 B_0 4 A_1 1 B_2 5 A_4 1 and looking to print in this output format with headers. can you suggest in awk?awk because i am doing some pattern matching from parent file to print column 1 of my input using awk already.Thanks! letter number_of_letters... (5 Replies)
Discussion started by: prashob123
5 Replies

6. UNIX for Dummies Questions & Answers

awk to sum column field from duplicate row/lines

Hello, I am new to Linux environment , I working on Linux script which should send auto email based on the specific condition from log file. Below is the sample log file Name m/c usage abc xxx 10 abc xxx 20 abc xxx 5 xyz ... (6 Replies)
Discussion started by: asjaiswal
6 Replies

7. Shell Programming and Scripting

Replace duplicate columns with values from first occurrence

I've a text file with below values viz. multiple rows with same values in column 3, 4 and 5, which need to be considered as duplicates. For all such cases, the rows from second occurrence onwards should be modified in a way that their values in first two columns are replaced with values as in first... (4 Replies)
Discussion started by: asyed
4 Replies

8. Shell Programming and Scripting

Find and replace duplicate column values in a row

I have file which as 12 columns and values like this 1,2,3,4,5 a,b,c,d,e b,c,a,e,f a,b,e,a,h if you see the first column has duplicate values, I need to identify (print it to console) the duplicate value (which is 'a') and also remove duplicate values like below. I could be in two... (5 Replies)
Discussion started by: nuthalapati
5 Replies

9. Shell Programming and Scripting

sum multiple columns based on column value

i have a file - it will be in sorted order on column 1 abc 0 1 abc 2 3 abc 3 5 def 1 7 def 0 1 -------- i'd like (awk maybe?) to get the results (any ideas)??? abc 5 9 def 1 8 (2 Replies)
Discussion started by: jjoe
2 Replies

10. Shell Programming and Scripting

help sum columns by break in first column with awk or sed or something.

I have some data that is something like this? item: onhand counted location ITEM0001 1 0 a1 ITEM0001 0 1 a2 ITEM0002 5 0 b5 ITEM0002 0 6 c1 I want to sum up... (6 Replies)
Discussion started by: syadnom
6 Replies
Login or Register to Ask a Question