Removing all the duplicates

08-16-2011

Registered User

324, 12

Join Date: Jul 2011

Last Activity: 27 September 2015, 12:56 AM EDT

Posts: 324

Thanks Given: 80

Thanked 12 Times in 12 Posts

I want to remove all the duplicates in a file. I don't want to keep even a single entry for a duplicated key.

For the input data:

Code:

12345|12|34
12345|13|23
3456|12|90
15670|12|13
12345|10|14
3456|12|13

I need the data below in one file:

Code:

15670|12|13

and the data below in another file:

Code:

 
12345|12|34
12345|13|23
12345|10|14
3456|12|90
3456|12|13

I am identifying duplicates based on the first field alone.

If I use sort -t"|" -u -k 1,1, it gives:

Code:

 
12345|10|14
15670|12|13
3456|12|13

But I don't want the single entry either.

Please help me.

Also, if I want to sort on the 10th field, can I use sort -k10 or sort -k 10,10?

What's the difference between those?

Thanks

pandeesh
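
On the sort -k10 versus sort -k 10,10 question above: as far as the POSIX description of sort goes, -k10 defines a key that starts at field 10 and runs to the end of the line, while -k 10,10 restricts the key to field 10 alone. The sketch below shows the same distinction on field 2 of a made-up three-line sample; the data and the choice of field 2 are assumptions for illustration and are not from the thread.

```shell
# Hypothetical sample: field 2 ties on two lines, field 3 differs.
data='x|b|2
y|a|9
z|b|1'

# Key = field 2 through end of line, so the tie on "b" is broken by field 3.
open_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t'|' -k2)

# Key = field 2 only; tied lines fall back to a whole-line comparison,
# so the line starting with "x" sorts before the one starting with "z".
closed_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t'|' -k2,2)

printf '%s\n---\n%s\n' "$open_key" "$closed_key"
```

With the open-ended key the "b" lines come out as z then x; with the closed key they come out as x then z.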


08-16-2011

Registered User

3,733, 1,154

Join Date: Apr 2009

Last Activity: 3 August 2016, 11:03 AM EDT

Posts: 3,733

Thanks Given: 7

Thanked 1,154 Times in 1,124 Posts

Try:

Code:

awk -F"|" '
# a counts each key; b collects the lines sharing it
{ a[$1]++; b[$1] = b[$1] ? b[$1] "\n" $0 : $0 }
END {
  for (i in a) {
    if (a[i] == 1) { print b[i] > "file1" }
    else { print b[i] > "file2" }
  }
}' input

It will create two files: file1 and file2.

bartus11
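
The command above writes its output in whatever order awk walks the array, which is unspecified. If input order matters, one possible variant (a sketch, not what was posted) reads the file twice: the first pass counts keys, the second routes each line. It uses the sample data from the first post and a scratch directory created with mktemp; on SunOS, nawk would presumably be needed in place of awk.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input from the first post.
cat > input <<'EOF'
12345|12|34
12345|13|23
3456|12|90
15670|12|13
12345|10|14
3456|12|13
EOF

# Pass 1 (NR == FNR): count how often each first field occurs.
# Pass 2: route each line by the count of its key, keeping input order.
awk -F'|' '
NR == FNR { count[$1]++; next }
count[$1] == 1 { print > "file1"; next }
{ print > "file2" }
' input input

cat file1
echo ---
cat file2
```

Here file2 keeps the lines in their original order rather than grouped by key, so it differs from the grouping shown in the first post.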


08-16-2011


But it fails with an illegal statement near line 1 and a syntax error at line 1.
I am running it on SunOS.

pandeesh


08-16-2011

Registered User

3,149, 702

Join Date: Apr 2010

Last Activity: 10 July 2019, 11:33 PM EDT

Posts: 3,149

Thanks Given: 46

Thanked 702 Times in 677 Posts

Use nawk instead of awk.

itkamaraj


08-16-2011


Yes, with nawk it works. But I want to make the 10th field the key field, so what do I need to change in that script?
Should I replace $1 with $10?

Thanks

pandeesh


08-16-2011


Quote:

Originally Posted by pandeesh

Yes, with nawk it works. But I want to make the 10th field the key field, so what do I need to change in that script?
Should I replace $1 with $10?

Thanks

Yes.

bartus11


08-16-2011


I changed it to:

Code:

 
awk -F"|" '{a[$10]++;b[$10]=b[$10]?b[$10]"\n"$0:$0}END{for(i in a){if(a[i]==1){print b[i]>"file1"}else{print b[i]>"file2"}}}' input

But it's not giving the correct result.
Is there anything else I need to change?

Thanks

---------- Post updated at 02:11 PM ---------- Previous update was at 02:02 PM ----------

In file1 I am getting the unique records.

But in file2 I am getting all the records.

Is there anything else I need to change in the code below to make the 10th field the key?

Code:

awk -F"|" '{a[$10]++;b[$10]=b[$10]?b[$10]"\n"$0:$0}END{for(i in a){if(a[i]==1){print b[i]>"file1"}else{print b[i]>"file2"}}}' input

I have tried $(10) too.

Please help me. Thanks

pandeesh
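
The thread does not record a resolution. One thing that may be worth checking, offered only as a guess, is whether every line really has at least ten |-separated fields: in awk a field beyond the last one is an empty string, so lines with fewer than ten fields would all share the same empty key and be grouped together. A minimal diagnostic sketch, run here on three lines of the three-field sample from the first post (the variable name nf_report is hypothetical):

```shell
# Three-field sample lines from the first post, used to show an empty $10.
data='12345|12|34
12345|13|23
3456|12|90'

# Tally lines by field count (NF); a tally under 10 means $10 is empty there.
nf_report=$(printf '%s\n' "$data" | awk -F'|' '
{ n[NF]++ }
END { for (f in n) print f " fields: " n[f] " line(s)" }')

printf '%s\n' "$nf_report"
```

If the real file reports field counts below 10, or a single field per line, the delimiter or the key field number is probably not what the script assumes.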
