Remove duplicates, tried several programs not working


 
# 1  
Old 11-28-2012

Hi all

I tried to remove duplicates using all of the code mentioned below, but it is not working; it just seems to stay busy. Please check it.

Code:
sort -u filename.txt

Code:
perl -lne 'chomp; if(!defined $x{$_}){$x{$_}=1;print}' inputfile

The attached files are:

log file (input)
log2file (output)
# 2  
Old 11-28-2012
Code:
awk '!arr[$0]++' logfile.txt

# 3  
Old 11-28-2012
I already tried this; there are still duplicates in my output.

AGTR1 is repeated, and DRD1 is repeated?

---------- Post updated at 06:43 AM ---------- Previous update was at 06:41 AM ----------

My output is:

Code:
bash-3.2$ awk '!arr[$0]++' logfile.txt
DRD1
PLA2G1B
GRIN2A
TACR1
AGTR1
CNR1
PDE3A
SSTR2
IMPDH1
FYN
ADCY1
CHRNA7
SSTR4
NPY2R
TEP1
IL6
CCKAR
JAK3
GRM5
ERBB4
GRM3
LTB4R
ITGB1
GSK3B
BCL2L1
NRP1
CCR3
MYC
RAF1
PDGFRA
BCL2
HTR7
ITGB3
LEP
NFE2L2
THRB
AGT
CELSR2
GRIK2
GABRG3
FHIT 
PDE3A 
PTK2 
PLA2G1B 
DRD1 
MAN2A1 
GUCY1A2 
GSK3B 
PIM1 
ITGB3 
BCL2L1 
CCKAR 
BCAT1 
NRP1 
THRB 
HSP90AB1 
CMA1 
AGTR1 
CANT1 
TACR1 
ACTA1 
GRIK2 
CNR1

# 4  
Old 11-28-2012
I see some control characters, which are the root cause of your problem:
Code:
awk '!arr[$0]++' logfile.txt | grep DRD1
DRD1
DRD1Â

Code:
awk '!arr[$0]++' logfile.txt | grep AGTR1
AGTR1
AGTR1Â

You have to get rid of them.
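
One way to see exactly which bytes are hiding at the end of those lines is sed's `l` command, which prints each line in an unambiguous form. This is only a minimal sketch: `show_hidden` is a made-up helper name, and the file name logfile.txt is taken from the posts above.

```shell
# show_hidden is a hypothetical helper name, not a standard command.
# sed's "l" command prints each line unambiguously: non-printable bytes
# appear as backslash escapes and "$" marks the true end of the line.
# LC_ALL=C makes sed treat the input as plain bytes.
show_hidden() {
    LC_ALL=C sed -n 'l' "$1"
}

# Example: list only the DRD1 lines with their hidden bytes visible.
# show_hidden logfile.txt | grep DRD1
```

In that output a trailing space shows up as a blank just before the `$`, and a non-ASCII byte such as 0xC2 shows up as an octal escape like `\302`, so two lines that look identical on screen can be told apart.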
# 5  
Old 11-28-2012
This is the first time I have heard of these control characters. How do I get rid of them? Any ideas?
# 6  
Old 11-28-2012
Try using sed:
Code:
sed 's/[^a-zA-Z0-9]//g' logfile.txt > new_logfile.txt

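
If duplicates survive that sed command, one possible reason (an assumption, not something confirmed in this thread) is that in some locales the ranges a-z and A-Z also match accented letters such as Â, so those characters are not removed. Forcing the C locale and deleting bytes with tr avoids that. The sketch below uses a made-up function name, `clean_dedupe`, and assumes every entry consists only of ASCII letters and digits, which holds for the list posted above but would mangle a name containing a hyphen or underscore.

```shell
# clean_dedupe is a hypothetical helper name.
# Step 1: in the C locale, delete every byte that is not an ASCII letter,
#         an ASCII digit or a newline (this also removes spaces and
#         carriage returns).
# Step 2: print each line only the first time it is seen, keeping the
#         original order.
clean_dedupe() {
    LC_ALL=C tr -cd 'A-Za-z0-9\n' < "$1" | awk '!seen[$0]++'
}

# Example:
# clean_dedupe logfile.txt > new_logfile.txt
```

Cleaning and de-duplicating in one pipeline matters here: running the awk command on the uncleaned file treats "DRD1" and "DRD1" followed by an invisible byte as two different lines.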
# 7  
Old 11-28-2012
Request to check

Hi, it still shows the same problem.

AGTR1 and DRD1 are still repeated, and there may be more repeated entries!

Mani