Awk: Modifying columns based on comparison


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Awk: Modifying columns based on comparison
# 1  
Old 04-09-2014
Awk: Modifying columns based on comparison

Hi,
I have following input in the file in which i want to club the entries based on $1. Also $11 is equal to $13 of other record(where $13 must be on higher side for any $1) then sum all other fields except $11 & $13. Final output required is as follows:

INPUTFILE:
Code:
SSCHPD01S201404010924210317.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404010924210317.PDSN,,,,,,,,,,4,,4,20140401092421
SSCHPD01S201404011225380579.PDSN,0,0,3,0,823,19,4,615,632,4,483,2583,20140401122538
SSCHPD01S201404011225380579.PDSN,2,,,,,,,,,2,,4,20140401122538

OUTPUT:
Code:
SSCHPD01S201404010924210317.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404011225380579.PDSN,2,0,3,0,823,19,4,615,632,2,483,2583,20140401122538

# 2  
Old 04-09-2014
What have you tried so far?
# 3  
Old 04-10-2014
1. First pulling uniq files from INPUTFILE in UNIQFILES
2. Finding duplicate entries based on $1 in DUPENTRIES using
Code:
awk -F, 'dup[$1]++' INPUTFILE>DUPENTRIES

3. Then running the below code to a failure result. Not getting any wayout. Please help
Code:
awk -F, 'FNR==1 {++filecounter}
filecounter==1 {RAJ[$1]=$2;ROWB[$1]=$3;KOL[$1]=$4;UPW[$1]=$5;CHN[$1]=$6;DEL[$1]
=$7;GUJ[$1]=$8;KRL[$1]=$9;KTK[$1]=$10;REJ[$1]=$11;DUP[$1]=$12;TOT[$1]=$13;next}
filecounter==2 {FILE[$1]=1;next}
{
if($1 in FILE && $11==TOT[$1] && $11 < REJ[$1] && $13 > TOT[$1])
print $1,$2+RAJ[$1],$3+ROWB[$1],$4+KOL[$1],$5+UPW[$1],$6+CHN[$1],$7+DEL[$1],$8+
GUJ[$1],$9+KRL[$1],$10+KTK[$1],$11-REJ[$1],$12+DUP[$1],$13,$14
}' OFS="," dupentries UNIQFILES INPUTFILE

# 4  
Old 04-10-2014
Try :

Code:
$ cat file
SSCHPD01S201404010924210317.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404010924210317.PDSN,,,,,,,,,,4,,4,20140401092421
SSCHPD01S201404011225380579.PDSN,0,0,3,0,823,19,4,615,632,4,483,2583,20140401122538
SSCHPD01S201404011225380579.PDSN,2,,,,,,,,,2,,4,20140401122538

Code:
awk -F, '{
            for(i=2;i<=NF;i++)
            if(length($i)){
			    ind = $1" "$NF" "i
			    A[ind] = (i == 13  ) ? (A[ind] && A[ind]>$i) ?  A[ind]: $i : i==11 || i==NF ? $i : A[ind] ? (A[ind]+$i) : $i
                          }

	    if(!( ($1" "$NF) in U))
	    {
		USeq[++p] = $1" "$NF
		U[$1" "$NF]
	    }
         }
	
	function sortme(Arr,key, i,tmp,k,s){
	
		for( i in Arr )
		{
			split(i,X," ")
			if((X[1]" "X[2]) == key)
			tmp[++k]=X[3]
		}
	
		n = asort(tmp)
		for(i=1;i<=n;i++)
		{
			s = (s!="")? s OFS Arr[key" "tmp[i]] : Arr[key" "tmp[i]]
			
		}
			
			return  s
			
	}

      END{
            for(i=1;i<=p;i++){ 
				split(USeq[i],X," ")
                    		print X[1],sortme(A,USeq[i])
		             }
         }' OFS=','  file

Resulting

Code:
SSCHPD01S201404010924210317.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404011225380579.PDSN,2,0,3,0,823,19,4,615,632,2,483,2583,20140401122538


Last edited by Akshay Hegde; 04-10-2014 at 05:39 AM.. Reason: missed something
# 5  
Old 04-10-2014
See the results as per the code shared, $1 is same which is not required. Please refer my output again. It should do basis on $1 .

Code:
SSCHPD01S201404011225380579.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404011225380579.PDSN,2,0,3,0,823,19,4,615,632,2,483,2583,20140401122538

# 6  
Old 04-10-2014
Copy updated code
# 7  
Old 04-10-2014
Thanks it did worked but for input like the below this doesnt work. Please suggest

INPUT
Code:
SSCHPD01S201404010924210318.PDSN,1,1,1,1,1,1,1,1,1,10,,19,20140401092421
SSCHPD01S201404010924210318.PDSN,0,0,16,0,906,20,7,619,523,19,476,2571,20140401092421
SSCHPD01S201404011225380559.PDSN,0,0,3,0,823,19,4,615,632,4,483,2583,20140401122538
SSCHPD01S201404011225380559.PDSN,1,,,,,,,,1,2,,4,20140401122538
SSCHPD01S201404010924210317.PDSN,,,,,,,,,,4,,4,20140401092421
SSCHPD01S201404010924210317.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404011225380579.PDSN,0,0,3,0,823,19,4,615,632,4,483,2583,20140401122538
SSCHPD01S201404011225380579.PDSN,2,,,,,,,,,2,,4,20140401122538

OUTPUT REQUIRED:
Code:
SSCHPD01S201404010924210318.PDSN,1,1,17,1,907,21,8,620,524,10,476,2571,20140401092421
SSCHPD01S201404011225380559.PDSN,1,0,3,0,823,19,4,615,633,2,483,2583,20140401122538
SSCHPD01S201404010924210317.PDSN,0,0,16,0,906,20,7,619,523,4,476,2571,20140401092421
SSCHPD01S201404011225380579.PDSN,2,0,3,0,823,19,4,615,632,2,483,2583,20140401122538

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

Conversion of rows to columns using awk based om column value

HI, My Input file data is dn:adcfgeneral id:13343 Name:xxxxxx Password:iutyerwuitywue wpuwt tuiytruityrutyrwtyrwp dn:cdferwjyyyy id:3875 Name:yyyy Password :hgfdsjkfhdsfkdlshf dshfkldshfdklsfh interset:uiuiufj My output should be ... (6 Replies)
Discussion started by: dineshaila
6 Replies

2. Shell Programming and Scripting

Merging two file based on comparison of first columns

Respected Members. Hello. This is my first post in the forum. I will try to follow all the rules as prescribed by the forum. In case of non-compliance, I request you to kindly give me some more time to understand and abide by them. I am working on two files. I wish to merge the two files... (6 Replies)
Discussion started by: manojmalhotra
6 Replies

3. Shell Programming and Scripting

Merging two file based on comparison of first columns

Respected Members. Hello. This is my first post in the forum. I will try to follow all the rules as prescribed by the forum. In case of non-compliance, I request you to kindly give me some more time to understand and abide by them. I am working on two files. I wish to merge the two files... (1 Reply)
Discussion started by: manojmalhotra
1 Replies

4. Shell Programming and Scripting

Match files based on either of the two columns awk

Dear Shell experts, I have 2 files with structure: File 1: ID and count head test_GI_count1.txt 1000094 2 10039307 1 10039641 1 10047177 11 10047359 1 1008555 2 10120302 1 10120672 13 10121776 1 10121865 32 And 2nd file: head Protein_gi_GeneID_symbol.txt protein_gi GeneID... (11 Replies)
Discussion started by: smitra
11 Replies

5. Shell Programming and Scripting

How to add columns based on a pattern using awk?

Hi, I have a file with more than 1000 lines with ~14 columns. I need to find all the lines with matching value in column 14 and then add column 6 in all the lines before printing them out.. e.g if this is the input file: abc test input 10 for process 2345 abc test input 15 for process 2348... (1 Reply)
Discussion started by: xkdasari
1 Replies

6. Shell Programming and Scripting

Help with awk replacing identical columns based on another file

Hello, I am using Awk in UBUNTU 12.04. I have a file like following with three fields and 44706 rows. F1 A A F2 G G F3 A T I have another file like this: AL_1 F1 A A AL_2 F1 A T AL_3 F1 A A AL_1 F2 G G AL_2 F2 G A AL_3 F2 G G BO_1 F1 A A BO_2 F1 A T... (6 Replies)
Discussion started by: Homa
6 Replies

7. Shell Programming and Scripting

awk based script to ignore all columns from a file which contains character strings

Hello All, I have a .CSV file where I expect all numeric data in all the columns other than column headers. But sometimes I get the files (result of statistics computation by other persons) like below( sample data) SNO,Data1,Data2,Data3 1,2,3,4 2,3,4,SOME STRING 3,4,Inf,5 4,5,4,4 I... (9 Replies)
Discussion started by: ks_reddy
9 Replies

8. Shell Programming and Scripting

Please Help!!!! Awk for summing columns based on selected column value

a,b,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,aa,bb,cc,dd,ee,ff,gg,hh,ii a thru ii are digits and strings.... The awk needed....if coloumn 9 == i (coloumn 9 is string ), output the sum of x's(coloumn 22 ) in all records and sum of y's (coloumn 23 ) in all records in a file (records.txt).... (6 Replies)
Discussion started by: BrownBob
6 Replies

9. Shell Programming and Scripting

awk : extracting unique lines based on columns

Hi, snp.txt CHR_A SNP_A BP_A_st BP_A_End CHR_B BP_B SNP_B R2 p-SNP_A p-SNP_B 5 rs1988728 74904317 74904318 5 74960646 rs1427924 0.377333 0.000740085 0.013930081 5 ... (12 Replies)
Discussion started by: genehunter
12 Replies

10. Shell Programming and Scripting

awk 3 files to one based on multiple columns

Hi all, I have three files, one is a navigation file, one is a depth file and one is a file containing the measured field of gravity. The formats of the files are; navigation file: 2006 320 17 39 0 0 *nav 21.31542 -157.887 2006 320 17 39 10 0 *nav 21.31542 -157.887 2006 320 17 39 20 0... (2 Replies)
Discussion started by: andrealphus
2 Replies
Login or Register to Ask a Question