    #1  
Old 07-28-2012
Banned
 
Join Date: May 2012
Posts: 186
Thanks: 12
Thanked 1 Time in 1 Post
Kindly check: remove duplicates with similar data in front of it

Hi all,

I have 2 files containing data like this:


Quote:
1 x
1 x
2 y
2 z
3 s
3 s
3 s
4 g
4 h
5 i
6 k
7 y
Quote:
1 x y z
1 x y z
2 y h f
2 z s k
3 s
3 s
3 s
4 g
4 h
5 i
6 k
7 y
So when an entry is repeated in column 1 (as with 1, 2, 3, and 4), I have to check column 2: for some keys the duplicates have different entries (as with 2 and 4), and for others they have the same entry (as with 1 and 3).

The output should look like this for the first file:

Quote:
1 x
2 y,z
3 s
4 g,h
5 i
6 k
7 y
Please let me know how to script this.

In the same way for the second file: if the data in column 2 is different for duplicate entries, print it arranged like this:
Quote:
1 x y z
2 y,z h,s f,k

Last edited by manigrover; 07-28-2012 at 03:52 AM..
    #2  
Old 07-28-2012
rangarasan's Avatar
Registered User
 
Join Date: Jul 2011
Location: Chennai, India
Posts: 484
Thanks: 9
Thanked 119 Times in 115 Posts
awk

Hi,

Try this one,

Code:
awk '{t=$0;r=$1" ";sub(r,"",t);if(a[$1]!~t){a[$1]=a[$1]" "t;}else{if(!a[$1]){a[$1]=t;}}}END{for(i in a){print i,a[i];}}' file1

It should work for both files, though I have not tested it yet.
Do you want to combine these two files and then do the rest?
Cheers,
Ranga:-)
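
For the two-column first file, the same grouping idea can be sketched with exact array lookups instead of the regex match (`!~`), joining the values with commas as in the requested output. This is only a minimal sketch: the function name merge_keys is hypothetical, and it assumes whitespace-separated fields with the key in column 1 and the value in column 2.

```shell
# merge_keys [FILE...]: hypothetical helper name, not from the thread.
# Prints each column-1 key once, followed by its distinct column-2
# values joined with commas, in first-seen order.
merge_keys() {
  awk '
    NF == 0 { next }                        # skip blank lines
    {
      key = $1
      val = $2
      if (!(key in out)) {                  # first time this key appears
        order[++n] = key                    # remember first-seen order
        out[key] = val
        seen[key, val] = 1
      } else if (!((key, val) in seen)) {   # new value for a known key
        seen[key, val] = 1
        out[key] = out[key] "," val
      }
    }
    END {
      for (i = 1; i <= n; i++) print order[i], out[order[i]]
    }
  ' "$@"
}
```

It could be called as `merge_keys file1`. Because the values are only ever used as array subscripts, characters such as [ ] ( ) in the data are never interpreted as a pattern.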
    #3  
Old 07-28-2012
Banned
 
Request to check

Hi

Thanks a lot, Ranga.

It worked with the first file but not with the second file.

I don't have to combine the two files; I ran the command on each one separately.

For the second file it shows the error below. The values in my real data are not like 1, 2, 3 and x, y, z as in the sample input, but they follow the same pattern. There seems to be a small error. Kindly check it.

Quote:
bash-3.2$ awk '{t=$0;r=$1" ";sub(r,"",t);if(a[$1]!~t){a[$1]=a[$1]" "t;}else{if(!a[$1]){a[$1]=t;}}}END{for(i in a){print i,a[i];}}' saradrugbankdrug.txt >saradrugbankdrugnewlist.txt
awk: (FILENAME=saradrugbankdrug.txt FNR=132) fatal: Invalid range end: /PDE3B (5r)-6-(4-{[2-(3-Iodobenzyl)-3-Oxocyclohex-1-En-1-Yl]Amino}Phenyl)-5-Methyl-4,5-Dihydropyridazin-3(2h)-One Not Available T2D,CD,T1D/
bash-3.2$
Moderator's Comments:
Please use code tags instead of quote tags

Last edited by Scrutinizer; 07-28-2012 at 07:27 AM..
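
The "Invalid range end" message is consistent with the data being used as a regular expression: `a[$1]!~t` and `sub(r,"",t)` both treat their argument as a pattern, so text such as `[2-(3-Iodobenzyl)-...]` is read as a bracket expression containing a bad range. One way around that, sketched here under assumptions, is awk's index() function, which compares characters literally. The helper name contains_literal and the environment variable names HAY and NEEDLE are made up for this illustration.

```shell
# contains_literal HAYSTACK NEEDLE: hypothetical helper, prints yes or no.
# The strings are passed through the environment so that awk does not
# apply escape-sequence processing to them.
contains_literal() {
  HAY="$1" NEEDLE="$2" awk 'BEGIN {
    # index() returns the 1-based position of a literal substring, or 0
    pos = index(ENVIRON["HAY"], ENVIRON["NEEDLE"])
    if (pos > 0) print "yes"
    else print "no"
  }'
}
```

Note that index() is a substring test, so "y" would also be found inside "xyz"; when whole entries must be compared, an array keyed on the value is the safer check.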
    #4  
Old 07-28-2012
rangarasan's Avatar
awk

Hi,
The input file contains characters that are special in regular expressions, such as [ and ]. I have not tested the code below, so give it a try.

Code:
awk '{$0=gensub(/([\]\[\(\)\{\}])/,"\\\1","g",$0);t=$0;r=$1"";sub(r,"",t);if(a[$1]!~t){a[$1]=a[$1]""t;}else{if(!a[$1]){a[$1]=t;}}}END{for(i in a){print i,a[i];}}' file1

You have to escape the special characters before using them in a regex. You could also use the quotemeta function in Perl and then pass the output lines to awk.
Cheers,
Ranga:-)

Last edited by rangarasan; 07-28-2012 at 07:40 AM.. Reason: add perl func name
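
The output requested for the second file merges duplicates column by column (`2 y,z h,s f,k`), while the one-liners in this thread appear to append the rest of each line as a single string. A rough sketch of the column-wise merge follows; it uses array lookups rather than regex matching, so no escaping is needed. It assumes whitespace-separated fields, and merge_columns is a hypothetical name.

```shell
# merge_columns [FILE...]: hypothetical helper name, not from the thread.
# For each column-1 key, collects the distinct values seen in every later
# column and joins them with commas, keeping first-seen order.
merge_columns() {
  awk '
    NF == 0 { next }                            # skip blank lines
    {
      key = $1
      if (!(key in cols)) { order[++n] = key; cols[key] = 0 }
      if (NF > cols[key]) cols[key] = NF        # widest row seen for this key
      for (j = 2; j <= NF; j++) {
        if ((key, j, $j) in seen) continue      # value already recorded
        seen[key, j, $j] = 1
        if ((key, j) in cell) cell[key, j] = cell[key, j] "," $j
        else cell[key, j] = $j
      }
    }
    END {
      for (i = 1; i <= n; i++) {
        key = order[i]
        line = key
        for (j = 2; j <= cols[key]; j++) line = line " " cell[key, j]
        print line
      }
    }
  ' "$@"
}
```

It could be run as `merge_columns saradrugbankdrug.txt > saradrugbankdrugnewlist.txt`. The real data seems to contain fields with embedded spaces (for example "Not Available"), so if the file is tab-delimited the field separator would have to be set accordingly, for instance with awk's -F option.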
    #5  
Old 07-28-2012
Banned
 
Request to check

Thank you very much
I want to write many!!
    #6  
Old 07-28-2012
rangarasan's Avatar
Please use code tags to wrap your post so that future users will benefit :-)
Cheers,
Ranga:-)