Removing duplicate lines based on first column with pipe delimiter


 
# 1  
Old 09-28-2015

Hi,

I have tried to remove duplicate lines based on the first column, with a pipe delimiter, but I am not getting all of the unique lines.


Command :
Code:
sort -t'|' -nuk1 file.txt

Input :
Code:
38376KZ|09/25/15|1.057
38376KZ|09/25/15|1.057
02006YB|09/25/15|0.859
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415
02006JAB|09/25/15|0.214

Output:
Code:
38376KZ|09/25/15|1.057
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415

But the output should be :
Code:
38376KZ|09/25/15|1.057
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415
02006JAB|09/25/15|0.214

Can you please help me understand why this is not working?

Last edited by Franklin52; 09-28-2015 at 08:45 AM.. Reason: Please use code tags
# 2  
Old 09-28-2015
Please use code tags as required by forum rules!

I'm getting
Code:
02006YB|09/25/15|0.859
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415
38376KZ|09/25/15|1.057

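The fourth line in your desired output is a duplicate of the third input line because of the -n option: -n compares only the leading numeric part of the key, so 02006YB and 02006JAB both evaluate to 2006 and -u keeps just one of them.
A possible reason for the missing line is a <CR> (carriage return) character acting as a line terminator somewhere in the file; printed to a terminal, it moves the cursor back to the start of the line, so one line overwrites another.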
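To illustrate the effect of -n, here is a small sketch using the sample data from post #1. It assumes a sort that supports -t, -u and -k (GNU and POSIX sort both do) and that mktemp is available; the scratch file and the variable names are only illustrative. The first command repeats the original numeric sort, the second restricts the key to field 1 (-k1,1) and compares it as a string.

```shell
# Sample data from post #1, written to a scratch file (illustrative path).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
38376KZ|09/25/15|1.057
38376KZ|09/25/15|1.057
02006YB|09/25/15|0.859
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415
02006JAB|09/25/15|0.214
EOF

# Original approach: -n reads only the leading digits of the key,
# so 02006YB and 02006JAB compare as equal and -u drops one of them.
numeric_unique=$(LC_ALL=C sort -t'|' -n -u -k1 "$tmp")

# String comparison on field 1 only: -k1,1 stops the key at the first
# delimiter, and without -n the two 02006... keys are distinct.
unique_by_key=$(LC_ALL=C sort -t'|' -u -k1,1 "$tmp")

printf 'numeric key:\n%s\n\n' "$numeric_unique"
printf 'string key:\n%s\n' "$unique_by_key"
rm -f "$tmp"
```

On the sample data the numeric sort yields four lines and the string sort five. Note that the result is sorted by key, so the original line order is not preserved.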
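A quick way to check for stray carriage returns is sketched below. The two-line input is made up for the demonstration (one line ends in CRLF, one in LF), and the variable names are illustrative; it assumes mktemp is available. The sketch counts the lines that contain a CR and then strips every CR with tr before sorting.

```shell
# Illustrative input: the first line ends in CR+LF, the second in LF only.
tmp=$(mktemp)
printf '02006YB|09/25/15|0.859\r\n12593PS|09/25/15|2.803\n' > "$tmp"

# Hold a literal carriage return in a variable so grep can search for it.
cr=$(printf '\r')

# Number of lines containing a CR (0 would mean the file is clean).
cr_lines=$(grep -c "$cr" "$tmp")

# Delete every CR, then de-duplicate on the first pipe-delimited field.
cleaned=$(tr -d '\r' < "$tmp" | LC_ALL=C sort -t'|' -u -k1,1)

printf 'lines with CR: %s\n' "$cr_lines"
printf '%s\n' "$cleaned"
rm -f "$tmp"
```

If the count is non-zero for your real file, running it through tr -d '\r' (or a tool such as dos2unix, if installed) before sorting should rule this cause out.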
This User Gave Thanks to RudiC For This Post:
# 3  
Old 09-28-2015
Hello parithi06,

Please use code tags for commands, code, and input in your posts, as per the forum rules. As for your requirement: you told us you want to remove duplicates based on the 1st column. If that is the case, the output must also contain the line 02006YB|09/25/15|0.859. If it is not, please state the requirement more clearly. The following may help.
Code:
awk -F'|' 'FNR==NR{A[$1]=$0;next} ($1 in A){print $0;delete A[$1]}' Input_file Input_file

Output will be as follows.
Code:
38376KZ|09/25/15|1.057
02006YB|09/25/15|0.859
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415
02006JAB|09/25/15|0.214
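
The same result can usually be had in a single pass with the common awk "seen" idiom, shown here as a sketch rather than a replacement for the command above. The array name seen, the scratch file and the variable first_per_key are illustrative, and mktemp is assumed to be available. It keeps the first line for each value of field 1 and preserves input order.

```shell
# Sample data from post #1, written to a scratch file (illustrative path).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
38376KZ|09/25/15|1.057
38376KZ|09/25/15|1.057
02006YB|09/25/15|0.859
12593PS|09/25/15|2.803
14041NL|09/25/15|1.415
02006JAB|09/25/15|0.214
EOF

# seen[$1]++ evaluates to 0 the first time a key appears, so the negated
# expression is true only for the first line carrying that key; awk's
# default action then prints the line.
first_per_key=$(awk -F'|' '!seen[$1]++' "$tmp")

printf '%s\n' "$first_per_key"
rm -f "$tmp"
```

Because the file is read only once, this also works on piped input, which the two-pass form does not.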

Thanks,
R. Singh
This User Gave Thanks to RavinderSingh13 For This Post: