To remove duplicates from pipe delimited file


 
# 8  
Old 10-20-2013
One more approach

Code:
$ awk -F'|' '{if (a[$1 FS $2]) next}a[$1 FS $2]=$0' file


Last edited by Akshay Hegde; 10-20-2013 at 10:24 AM..
# 9  
Old 10-20-2013
Quote:
Originally Posted by Akshay Hegde
One more approach

Code:
$ awk -F'|' '{if (a[$1 FS $2]) next}a[$1 FS $2]=$0' file

You misunderstood the problem. A correct solution must make more than a single pass over the data.

The simplest AWK solution would make two passes. The first determines key frequency. The second decrements each key's value and prints a record only when that value becomes zero.
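A minimal sketch of the two-pass idea described above, assuming (as in the earlier posts) that the key is the first two fields. The sample input and the names `count`, `tmp` and `result` are illustrative and not from the thread:

```shell
# Hypothetical sample input; the first and third lines share the key 123|asdf.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
123|asdf|aaa|bbb
431|yui|qwer|opws
123|asdf|pol|njio
EOF

# Pass 1 (NR == FNR): count how often each key occurs.
# Pass 2: decrement the key's count and print the record only when it
# reaches zero, i.e. at the last occurrence of that key.
result=$(awk -F'|' '
    NR == FNR { count[$1 FS $2]++; next }
    (--count[$1 FS $2]) == 0
' "$tmp" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

The file is named twice on the command line so that awk reads it twice. Records are printed during the second pass, so they come out in input order.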

Regards,
Alister

Last edited by alister; 10-20-2013 at 11:51 AM..
# 10  
Old 10-20-2013
Thanks, Alister. This will work as the user requested in #1:

Code:
$ awk -F'|' '{a[$1 FS $2]=$0}END{for (i in a) print a[i]}' file
431|yui|qwer|opws
123|asdf|pol|njio

# 11  
Old 10-20-2013
Quote:
Originally Posted by Akshay Hegde
Thanks, Alister. This will work as the user requested in #1:

Code:
$ awk -F'|' '{a[$1 FS $2]=$0}END{for (i in a) print a[i]}' file
431|yui|qwer|opws
123|asdf|pol|njio

Only if ginkrf doesn't care about the output line order being different from the input line order...

Code:
for (index in array) ...

is allowed to visit the elements in an unspecified order (unrelated to the order in which elements were added to the array and unrelated to the collation sequence or numeric sequence of the indices), so the output order may look random.
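One possible way around the ordering issue while still reading the file only once is to remember the line number of each key's last occurrence and walk those numbers in ascending order in the END block. This is only a sketch: the sample input and the names `pos`, `rec`, `byline`, `tmp` and `result` are illustrative and not from the thread, and the key is assumed to be the first two fields.

```shell
# Hypothetical sample input; the first and third lines share the key 123|asdf.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
123|asdf|aaa|bbb
431|yui|qwer|opws
123|asdf|pol|njio
EOF

result=$(awk -F'|' '
    {
        k = $1 FS $2
        pos[k] = NR      # line number of the latest record with this key
        rec[k] = $0      # the latest record itself
    }
    END {
        # The for-in order does not matter here: it is only used to
        # build a lookup table indexed by line number.
        for (k in pos) byline[pos[k]] = k
        # Walking the line numbers in ascending order restores input order.
        for (i = 1; i <= NR; i++)
            if (i in byline) print rec[byline[i]]
    }
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Memory use grows with the number of distinct keys, since one record per key is held until the end of input.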
# 12  
Old 10-20-2013
That's true, Don.
# 13  
Old 10-21-2013
With two passes, and using minimal memory:
Code:
awk -F'|' '{k=$1 FS $2} NR==FNR {A[k]=NR; next} A[k]==FNR' file file


Last edited by MadeInGermany; 10-21-2013 at 05:15 AM.. Reason: simplified