

UNIX for Dummies Questions & Answers If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome !!

    #1  
Old 06-15-2012
Registered User
 
Join Date: Dec 2011
Posts: 84
Thanks: 29
Thanked 0 Times in 0 Posts
Compare values of fields from same column with awk

Hi all !

If a column contains only one distinct value (e.g. column 1 below), return that value in the corresponding output column.
If a column contains several distinct values (e.g. column 2 below), return them separated by "," in the output.

pipe-separated input:


Code:
blue|red
blue|red
blue|green
blue|red

output:


Code:
blue|red,green

Hope I am clear enough...

Thanks guys !
    #2  
Old 06-15-2012
elixir_sinari's Avatar
Gotham Knight
 
Join Date: Mar 2012
Location: India
Posts: 1,370
Thanks: 87
Thanked 476 Times in 456 Posts
Try this:


Code:
awk -F\| 'BEGIN{OFS=FS} {start=match(a[$1],$2);if(start && substr(a[$1],(start+RLENGTH),1) ~ /^[,]*$/) next; if(a[$1]) a[$1]=a[$1]","$2; else a[$1]=$2} END{for(i in a) print i,a[i]}' file


Last edited by elixir_sinari; 06-15-2012 at 05:30 AM..
The Following User Says Thank You to elixir_sinari For This Useful Post:
lucasvs (06-19-2012)
    #3  
Old 06-19-2012
Registered User
 
Join Date: Dec 2011
Posts: 84
Thanks: 29
Thanked 0 Times in 0 Posts
Thanks elixir_sinari !

Can you tell me what a[$1] stands for ?

Does it mean "the first column"?
    #4  
Old 06-19-2012
rangarasan's Avatar
Registered User
 
Join Date: Jul 2011
Location: Chennai, India
Posts: 484
Thanks: 9
Thanked 119 Times in 115 Posts
awk

Hi,
You can shorten it a bit.
Try this one,

Code:
awk 'BEGIN{FS=OFS="|";}{if(a[$1] !~ $2&&a[$1]){a[$1]=a[$1]","$2;}if(!a[$1]){a[$1]=$2;}}END{for(i in a){print i,a[i];}}' file

a[$1]: a is an array, and the value of the first column ($1) is used as its index.
Cheers,
Ranga:-)
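
To make the a[$1] explanation concrete, here is a small sketch (the function name show_keys and the sample rows are invented for the illustration). The index is the text found in field 1, so every row with the same first field updates the same array element.

```shell
# Sketch: a[$1] is one array element, keyed by the text of field 1.
# show_keys is an illustrative name.
show_keys() {
  awk -F'|' '
    { a[$1] = a[$1] "[" $2 "]" }      # append field 2 to the element keyed by field 1
    END {
      print "blue -> " a["blue"]
      print "pink -> " a["pink"]
    }
  ' "$@"
}

printf 'blue|red\npink|white\nblue|green\n' | show_keys
```

The two "blue" rows land in a["blue"], while the "pink" row gets its own element.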

Last edited by rangarasan; 06-19-2012 at 04:57 AM..
    #5  
Old 06-19-2012
Registered User
 
Join Date: Dec 2011
Posts: 84
Thanks: 29
Thanked 0 Times in 0 Posts
Hi rangarasan !

Your code returns:


Code:
b
blue|red,red,green,red

Without the ";", the stray first line "b" goes away, but the second column still lists every value, duplicates included:


Code:
blue|red,red,green,red

That's the role of


Code:
if(start && substr(a[$1],(start+RLENGTH),1) ~ /^[,]*$/) next

to skip the duplicates
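
A small probe of what that check sees, as a sketch only (probe is a made-up helper name). match() returns the position where the value starts in the list, or 0 if it is absent, and sets RLENGTH to the length of the match. The substr() call then picks the single character right after the match, which is "," or nothing when the value is a complete entry at that point.

```shell
# Sketch: what match() reports, and the character the check looks at afterwards.
# probe is an illustrative helper: $1 is the list built so far, $2 the value to look for.
probe() {
  awk -v list="$1" -v val="$2" 'BEGIN {
    start = match(list, val)                   # 0 when val is not found; also sets RLENGTH
    after = substr(list, start + RLENGTH, 1)   # character right after the match
    printf "start=%d len=%d after=\"%s\"\n", start, RLENGTH, after
  }'
}

probe "red,green" "green"
probe "red,green" "red"
probe "red,green" "blue"
```

Note that the test only inspects the character after the match, not the one before it, and that the value is interpreted as a regular expression.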

Last edited by lucasvs; 06-19-2012 at 02:05 AM..
    #6  
Old 06-19-2012
rangarasan's Avatar
Registered User
 
Join Date: Jul 2011
Location: Chennai, India
Posts: 484
Thanks: 9
Thanked 119 Times in 115 Posts
I have updated my previous post. Please check.
    #7  
Old 06-19-2012
Registered User
 
Join Date: Dec 2011
Posts: 84
Thanks: 29
Thanked 0 Times in 0 Posts
It leaves a leading ",":


Code:
blue|,red,green

---------- Post updated at 01:54 AM ---------- Previous update was at 01:43 AM ----------

However, elixir_sinari's code works for a file with 2 columns.
With more than 2 columns, it ignores every column after the second.

As I don't know how many columns a file will contain, is there a way to do the same regardless of the number of columns?

I mean a single piece of code that handles different inputs like these (I replaced the words with numbers to keep the example simple):

* file1.tab:

Code:
1|2|3|4
1|1|3|4
1|2|3|3

output1.tab:

Code:
1|1,2|3|3,4

* file2.tab:

Code:
1|2
1|3
7|2

output2.tab:

Code:
1,7|2,3

* file3.tab:

Code:
1|2|7|9|5|8
1|2|3|9|5|6

output3.tab:

Code:
1|2|3,7|9|5|6,8

(NB: the order of numbers separated by "," doesn't matter)