Help with removing duplicate entries with awk or Perl


 
# 1  
Old 10-29-2012

Hi,
I have a file which looks like this:
chr1 11127067 11132181 89 chr1 11128023 11128311 chr1 11130990 11131025
chr1 11127067 11132181 89 chr1 11128023 11128311 chr1 11131583 11131618
chr1 11127067 11132181 89 chr1 11131908 11132010 chr1 11130990 11131025
chr1 11127067 11132181 89 chr1 11131908 11132010 chr1 11131583 11131618
chr1 11127067 11132181 89 chr1 11130992 11131108 chr1 11130990 11131025
chr1 11127067 11132181 89 chr1 11130992 11131108 chr1 11131583 11131618

The expected output should be like this:
chr1 11127067 11132181 89 chr1 11128023 11128311 chr1 11130990 11131025

I want the duplicate entries to be removed from all the columns.

Last edited by Amit Pande; 10-29-2012 at 11:35 AM..
# 2  
Old 10-29-2012
Please use code tags to preserve formatting in the data samples. Your input is barely readable.
Is that one line or multiple lines?
What's the expected output?

Last edited by elixir_sinari; 10-29-2012 at 11:35 AM.. Reason: eek...typo!
# 3  
Old 10-29-2012
The duplicate entries in all the columns should be removed. Sorry for the bad post.
# 4  
Old 10-29-2012
Please see the post below and use CODE tags:

https://www.unix.com/how-post-unix-li...code-tags.html
# 5  
Old 10-29-2012
Like this?
Code:
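# Split the data so each chr1 group is on its own line, print each
# distinct group once, then join the surviving groups back together.
sed 's/chr1/\
&/g' file | awk 'NF {                 # NF skips the empty line sed creates
    a = ""
    for (i = 1; i <= NF; i++)         # rebuild the group with uniform spacing
        a = a " " $i
    if (!(a in exists)) {             # first time this group is seen
        print
        exists[a]                     # referencing the element creates the key
    }
}' | paste -sd\\0 -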


Last edited by elixir_sinari; 10-29-2012 at 11:59 AM..
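
For comparison, here is a sketch of the same idea in a single awk call. It assumes the data is whitespace-separated and that every group starts with the literal token chr1; both assumptions come from the sample, not from anything the poster confirmed. The function name dedup_groups is hypothetical. Unlike the pipeline above, it prints each distinct group on its own line rather than joining them.

```shell
# dedup_groups: hypothetical helper, prints each distinct chr1 group once.
dedup_groups() {
    awk '
    # Print the group collected so far if it has not been seen, then reset.
    function emit() {
        if (group != "" && !(group in seen)) {
            seen[group]
            print group
        }
        group = ""
    }
    {
        for (i = 1; i <= NF; i++) {
            if ($i == "chr1") emit()            # a new group starts here
            group = (group == "" ? $i : group " " $i)
        }
    }
    END { emit() }                              # flush the last group
    ' "$@"
}
```

It can be called as `dedup_groups file` or fed from a pipe. Whether group-level deduplication is what the poster wants is still an open question in this thread.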
# 6  
Old 10-29-2012
Thanks but this doesn't work.
All the duplicate entries in all the columns should be removed.
# 7  
Old 10-29-2012
Amit,

Can you post your data with code tags? Otherwise, we can only offer solutions based on assumptions.
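
One such assumption can be sketched as follows. Suppose the input holds one ten-field record per line (the sample splits evenly into six such records) and that a "duplicate" is any later line whose first four columns repeat an earlier line. Neither point was confirmed by the poster, and the function name first_per_region is hypothetical. Under those assumptions, keeping the first line per key reproduces the single expected output line from post #1.

```shell
# first_per_region: hypothetical helper, keeps the first line for each
# distinct combination of columns 1-4 (assumed to be the duplicate key).
first_per_region() {
    awk '!seen[$1, $2, $3, $4]++' "$@"
}
```

Usage would be `first_per_region file > file.uniq`. If the key should instead be the whole line, `awk '!seen[$0]++'` is the usual form, but on the sample data that would keep all six lines, since no two are identical.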