Removing all the duplicates

 
# 1  
Old 08-16-2011

I want to remove all the duplicates in a file. I don't want to keep even a single entry for a duplicated key.

For the input data:
Code:
12345|12|34
12345|13|23
3456|12|90
15670|12|13
12345|10|14
3456|12|13


I need the data below in one file:

Code:
15670|12|13

and the data below in another file:
Code:
 
12345|12|34
12345|13|23
12345|10|14
3456|12|90
3456|12|13

I am identifying duplicates based on the first field alone.

If I use sort -t"|" -u -k 1,1, it gives:
Code:
12345|10|14
15670|12|13
3456|12|13

But I don't want the single remaining entry either.

Please help me.

Also, if I want to sort on the 10th field, can I use sort -k10 or sort -k 10,10?

What's the difference between them?

Thanks
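
A note on the sort question above: in POSIX sort, -k10 defines a key that starts at field 10 and runs to the end of the line, while -k 10,10 limits the key to field 10 alone. A minimal sketch of the difference, assuming made-up 11-field data and a hypothetical file name sample.txt (neither comes from the original post):

```shell
# Two hypothetical records: field 10 is "K" in both, field 11 differs.
printf '%s\n' \
  'z|b|c|d|e|f|g|h|i|K|1' \
  'a|b|c|d|e|f|g|h|i|K|2' > sample.txt

# -k10: the key is "K|1" versus "K|2", so -u treats the lines as distinct.
LC_ALL=C sort -t'|' -u -k10 sample.txt | wc -l      # counts 2 lines

# -k10,10: the key is "K" for both lines, so -u keeps only one of them.
LC_ALL=C sort -t'|' -u -k10,10 sample.txt | wc -l   # counts 1 line
```

So when only the 10th field should decide ordering or uniqueness, -k 10,10 is usually the form to reach for.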
# 2  
Old 08-16-2011
Try:
Code:
awk -F"|" '{a[$1]++;b[$1]=b[$1]?b[$1]"\n"$0:$0}END{for(i in a){if(a[i]==1){print b[i]>"file1"}else{print b[i]>"file2"}}}' input

It will create two files: file1 and file2.
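
For readers who find the one-liner dense, here is a sketch of a different, two-pass variant of the same idea (not the command posted above): the first pass counts each key, the second routes every line by that count. It assumes the sample data from post #1 saved as input, and it reads the file twice so the original line order is kept. On SunOS, nawk may be needed in place of awk.

```shell
# Sample data from the first post.
cat > input <<'EOF'
12345|12|34
12345|13|23
3456|12|90
15670|12|13
12345|10|14
3456|12|13
EOF

# Remove leftovers so an output file from an earlier run cannot mislead.
rm -f file1 file2

awk -F'|' '
    NR == FNR      { count[$1]++; next }      # pass 1: tally the key field
    count[$1] == 1 { print > "file1"; next }  # pass 2: key occurs once
                   { print > "file2" }        # pass 2: key occurs more than once
' input input
```

With the sample data, file1 ends up holding the single 15670 record and file2 the five records whose first field repeats. To key on another field, $1 would be replaced in all three places.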
# 3  
Old 08-16-2011
But it reports an illegal statement near line 1 and a syntax error at line 1.
I am running it on SunOS.
# 4  
Old 08-16-2011
Use nawk instead of awk.
# 5  
Old 08-16-2011
Yes, with nawk it works. But I want to use the 10th field as the key field, so what do I need to change in that script?
Shall I replace $1 with $10?

Thanks
# 6  
Old 08-16-2011
Quote:
Originally Posted by pandeesh
Yes, with nawk it works. But I want to use the 10th field as the key field, so what do I need to change in that script?
Shall I replace $1 with $10?

Thanks
Yes.
# 7  
Old 08-16-2011
I have changed it like this:

Code:
 
awk -F"|" '{a[$10]++;b[$10]=b[$10]?b[$10]"\n"$0:$0}END{for(i in a){if(a[i]==1){print b[i]>"file1"}else{print b[i]>"file2"}}}' input

But it is not giving the correct result.
Is there anything else I need to change?

Thanks

---------- Post updated at 02:11 PM ---------- Previous update was at 02:02 PM ----------

In file1 I am getting the unique records.

But in file2 I am getting all the records.

In the code below, is there anything else I need to change to make the 10th field the key?
Code:
awk -F"|" '{a[$10]++;b[$10]=b[$10]?b[$10]"\n"$0:$0}END{for(i in a){if(a[i]==1){print b[i]>"file1"}else{print b[i]>"file2"}}}' input

I have also tried $(10).

Please help me. Thanks.
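
The thread ends without a diagnosis, so the following is only one thing worth checking, not a confirmed cause. If the lines have fewer than ten "|"-separated fields, $10 is empty on every line, all records share the same empty key, and every record lands in file2. A file1 left over from an earlier run would also stay in place, because awk only creates or truncates an output file when it first writes to it. A minimal sketch for checking the field count, assuming the 3-field sample from post #1 rather than the poster's real data:

```shell
# Sample data from the first post (3 fields per line).
cat > input <<'EOF'
12345|12|34
12345|13|23
3456|12|90
15670|12|13
12345|10|14
3456|12|13
EOF

# Report each distinct field count and how many lines have it.
awk -F'|' '{ n[NF]++ } END { for (k in n) print k " fields: " n[k] " lines" }' input
# prints: 3 fields: 6 lines
```

If the reported count is below 10, the key field number or the -F separator does not match the real data. On SunOS, nawk may be needed in place of awk.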