Duplicate values merge | Unix Linux Forums | Shell Programming and Scripting

    #1  
Old 02-09-2013
jiam912

Dear Gents,

Could you please help me solve this problem?

Input file:


Code:
22057485  ,219 ,1050
22057485  ,223 ,1050
21897425  ,278 ,1050
21897425  ,279 ,1050
21897425  ,287 ,1050
20497465  ,602 ,1051
20517500  ,677 ,1051
20517500  ,681 ,1051
20577555  ,775 ,1052
20577555  ,778 ,1052
20357560  ,778 ,1052
20357560  ,780 ,1052
23717535  ,794 ,1053
23717535  ,805 ,1053
23657530  ,797 ,1053
23657530  ,798 ,1053
23657530  ,799 ,1053

I would like to get something like this:

Output file:


Code:
1050  22057485    219    223    
1050  21897425    278    279    287
1051  20497465    602    603    605
1051  20517500    677    681    
1052  20577555    775    778    
1052  20357560    778    780    
1053  23717535    794    805    
1053  23657530    797    798    799

Thanks in advance
    #2  
Old 02-09-2013
elixir_sinari
If you don't mind the order of the output:

Code:
awk -F' *, *' '{c[$3 OFS $1]=c[$3 OFS $1]""?c[$3 OFS $1] OFS $2:$2}
END{for(i in c) print i,c[i]}' OFS='\t' file
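
If the output order does matter, one option is to record each key the first time it is seen and print the groups in that order. This is only a minimal sketch, not something posted in the thread: the helper names `sample_input` and `merge_ordered` and the `order`/`n` variables are made up for illustration, and the sample is the first six input lines from post #1.

```shell
# Hypothetical sample: the first six input lines from post #1.
sample_input() {
cat <<'EOF'
22057485  ,219 ,1050
22057485  ,223 ,1050
21897425  ,278 ,1050
21897425  ,279 ,1050
21897425  ,287 ,1050
20497465  ,602 ,1051
EOF
}

# merge_ordered (hypothetical name) reads the comma-separated records on stdin.
merge_ordered() {
  awk -F' *, *' -v OFS='\t' '
    {
      k = $3 OFS $1                  # group key: column 3, then column 1
      if (k in c) {
        c[k] = c[k] OFS $2           # key seen before: append the value
      } else {
        c[k] = $2                    # first value for this key
        order[++n] = k               # remember the order of first appearance
      }
    }
    END { for (i = 1; i <= n; i++) print order[i], c[order[i]] }
  '
}

sample_input | merge_ordered
```

On this sample it should print three tab-separated lines in input order. Key 20497465 gets only the value 602, because the posted input contains a single record for it.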

    #3  
Old 02-09-2013
jiam912
Thanks a lot, it works perfectly.
    #4  
Old 02-09-2013
Scrutinizer
Try:
Code:
awk -F' *,' 'p!=$1{if(p)print s; s=$3 OFS $1; p=$1}{s=s OFS $2} END{print s}' OFS='\t' file


Last edited by Scrutinizer; 03-04-2013 at 05:10 AM..
    #5  
Old 02-09-2013
Jotne
@Scrutinizer
A very nice solution. I spent nearly an hour studying this short script to find out how it works. I admire how you guys manage to find such clever, simple solutions to these problems.

I would just like to explain how this script works, so I have rewritten it in a more readable form.

Code:
awk -F' *,' '		#1
p!=$1{			#2
	if(p) print s;	#3
	s=$3 OFS $1;	#4
	p=$1}		#5
{s=s OFS $2} 		#6
END {print s}		#7
' OFS='\t' file		#8

#1 sets the field separator to zero or more spaces followed by a comma: ' *,'

Run on line one: 22057485 ,219 ,1050
$1=22057485 $2=219 $3=1050
#2 tests whether p differs from $1; it does, since p is still unset (empty)
#3 tests whether p contains data; it does not (p is empty), so nothing is printed
#4 sets s=$3 OFS $1, so s="1050 22057485"
#5 sets p=$1=22057485
#6 sets s=s OFS $2, so s="1050 22057485 219"
Run on line two: 22057485 ,223 ,1050
$1=22057485 $2=223 $3=1050
#2 tests whether p differs from $1; it does not, since p=22057485 and $1=22057485
Jump to #6
#6 sets s=s OFS $2, so s="1050 22057485 219 223"
Run on line three: 21897425 ,278 ,1050
$1=21897425 $2=278 $3=1050
#2 tests whether p differs from $1; it does, since p=22057485 and $1=21897425
#3 tests whether p contains data; it does, so s is printed: 1050 22057485 219 223
#4 sets s=$3 OFS $1, so s="1050 21897425"
#5 sets p=$1=21897425
#6 sets s=s OFS $2, so s="1050 21897425 278"
Run on line four
.
.
.
#7 END: the last job, prints the final line still held in s
#8 sets the Output Field Separator to a tab: OFS='\t'
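
The walkthrough above can be reproduced by letting awk print p and s after every record. This is a rough sketch under some assumptions: `sample_input` and `trace_merge` are made-up names, the sample is the first three input lines from post #1, and the awk in use is assumed to accept "/dev/stderr" as an output file name (gawk and mawk do).

```shell
# Hypothetical sample: the first three input lines from post #1.
sample_input() {
cat <<'EOF'
22057485  ,219 ,1050
22057485  ,223 ,1050
21897425  ,278 ,1050
EOF
}

# trace_merge (hypothetical name): same logic as the script in post #4,
# plus one trace line per record written to standard error.
trace_merge() {
  awk -F' *,' -v OFS='\t' '
    p != $1 {                 # key differs from the previous record
      if (p) print s          # flush the finished group, if there is one
      s = $3 OFS $1           # start a new output line
      p = $1                  # remember the current key
    }
    {
      s = s OFS $2            # append the value of this record
      # Assumption: this awk accepts "/dev/stderr" as an output file name.
      printf "line %d: p=%s s=%s\n", NR, p, s > "/dev/stderr"
    }
    END { print s }
  '
}

sample_input | trace_merge
```

Because the script only compares each key with the previous one, it merges duplicates only when they sit on adjacent lines, as they do in the posted input.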

Last edited by Jotne; 02-10-2013 at 02:48 AM..
    #6  
Old 02-10-2013
jiam912
Thanks to everybody for your great work.

---------- Post updated at 02:27 PM ---------- Previous update was at 02:37 AM ----------

Gents,

one more thing, please:


Code:
1050  22057485    219    223
1050  21897425    278    279    287 
1051  20497465    602    603    605 
1051  20517500    677    681     
1052  20577555    775    778     
1052  20357560    778    780     
1053  23717535    794    805     
1053  23657530    797    798    799

How can I count the total number of values from the 4th column to the end only?

In this case the total number of values will be 11.

How can I get this value?

Thanks for your help
    #7  
Old 02-10-2013
Jotne

Code:
awk '{i+=NF-3} END {print i}' infile
11

For me this gives 11 numbers, not 15.

Code:
1050  22057485    219    223
1050  21897425    278    279    287 
1051  20497465    602    603    605 
1051  20517500    677    681     
1052  20577555    775    778     
1052  20357560    778    780     
1053  23717535    794    805     
1053  23657530    797    798    799

Or did I understand this incorrectly?
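
To see where the 11 comes from: with whitespace-separated fields, columns 1 to 3 are skipped, so each row contributes NF-3 values. The sketch below prints that count per row before the total. The names `merged_rows` and `count_extra` are made up for illustration, and the guard for short or blank lines is an added assumption that is not part of the one-liner above.

```shell
# Hypothetical sample: the merged rows from post #6, whitespace-separated.
merged_rows() {
cat <<'EOF'
1050  22057485    219    223
1050  21897425    278    279    287
1051  20497465    602    603    605
1051  20517500    677    681
1052  20577555    775    778
1052  20357560    778    780
1053  23717535    794    805
1053  23657530    797    798    799
EOF
}

# count_extra (hypothetical name): per-row and total count of fields from column 4 on.
count_extra() {
  awk '
    {
      n = NF - 3              # columns 1-3 are skipped; the rest are counted
      if (n < 0) n = 0        # assumption: rows with fewer than 3 fields count as 0
      total += n
      print $2 ": " n         # per-row count, labelled with column 2
    }
    END { print "total: " (total + 0) }
  '
}

merged_rows | count_extra
```

The per-row counts are 1, 2, 2, 1, 1, 1, 1 and 2, which add up to 11. Without the guard, a blank line (NF=0) would subtract 3 from the total.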