Removing duplicated lines??


 
# 1  
Old 02-14-2012

Hi guys,

I have a problem: for some reason my database has copied everything 4 times.
My database looks like this:
Code:
>BAC233456
rhjieaheiohjteo
tjtjrj6jkk6k6
j54ju54jh54jh
>ANI124365
afrhtjykulilil
htrjykuk
rtkjryky
ukrykyrk
>BAC233456
rhjieaheiohjteo
tjtjrj6jkk6k6
j54ju54jh54jh
>VIR46363
dgekjklk6rj6j6
>BAC233456
rhjieaheiohjteo
tjtjrj6jkk6k6
j54ju54jh54jh

What I need is for every ">******" record to appear in the file only once, so that it looks like this:

Code:
>BAC233456
rhjieaheiohjteo
tjtjrj6jkk6k6
j54ju54jh54jh
>ANI124365
afrhtjykulilil
htrjykuk
rtkjryky
ukrykyrk
>VIR46363
dgekjklk6rj6j6

Sorry if it looks messy.
I tried the sort command, but for some reason it failed. Are there any other ways?

Thank you for any help.

Last edited by Franklin52; 02-15-2012 at 07:12 AM.. Reason: Please use code tags for code and data samples, thank you
# 2  
Old 02-14-2012
I tried copying the lines mentioned here in a text file and used sort. It worked.

You can use uniq to remove duplicate lines.
# 3  
Old 02-14-2012
Code:
awk '!a[$0]++' RS=">" ORS=">" yourfile
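A note on the one-liner above: it treats everything between two ">" characters as one record and keeps a record only the first time its full text is seen. Because ORS is written after every record that is printed, the output also ends with a trailing ">". The sketch below is a different, line-based variant rather than the posted method: it assumes that two records with the same ">" header line are duplicates, and the function name dedup_records and the shortened sample data are made up for illustration.

```shell
# Keep only the first record for each ">" header line.
# Assumption: records with identical header lines are duplicates.
dedup_records() {
    # On a header line, decide whether this record is new (keep=1) or
    # already seen (keep=0); every line is printed only while keep is set.
    # Lines that appear before the first header are dropped.
    awk '/^>/ { keep = !seen[$0]++ } keep'
}

# Shortened, hypothetical sample in the same shape as the original data.
dedup_records <<'EOF'
>BAC233456
rhjieaheiohjteo
>ANI124365
afrhtjykulilil
>BAC233456
rhjieaheiohjteo
EOF
```

For a real file this would be run as `dedup_records < yourfile`. The order of the records is preserved and no extra ">" is appended at the end.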

# 4  
Old 02-16-2012
Thank you, both of them worked great and I actually learned something.
# 5  
Old 02-16-2012
@dinesh: I don't see how sort could work, wouldn't that mess up the order of the lines?
# 6  
Old 02-16-2012
Quote:
Originally Posted by dinesh1178
I tried copying the lines mentioned here in a text file and used sort. It worked.

You can use uniq to remove duplicate lines.
This may lead to erroneous behaviour: for example, if a record consists of a single line that already exists in a multi-line record, that single-line record will not be kept, even though it is unique from a record point of view.

Sorting and filtering with uniq would not fit the requirements (as far as I understand them).
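A small sketch of that failure case, using made-up data: record ">B" consists of the single line "x", which also occurs inside record ">A". The helper names sample and line_dedup are hypothetical, and LC_ALL=C is set only so that the sort order is reproducible.

```shell
# Two distinct records; the only data line of >B also appears in >A.
sample() {
    printf '%s\n' '>A' 'x' 'y' '>B' 'x'
}

# Line-based deduplication: sort all lines, then drop adjacent repeats.
line_dedup() {
    LC_ALL=C sort | uniq
}

sample | line_dedup
# Prints four lines: >A, >B, x, y
# The headers are no longer followed by their own data, and the "x"
# that belonged to >B has been merged with the "x" from >A.
```

The result no longer tells which lines belong to which header, so the records cannot be reconstructed from it.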
# 7  
Old 02-22-2012
Quote:
Originally Posted by Scrutinizer
@dinesh: I don't see how sort could work, wouldn't that mess up the order of the lines?
I thought the order of the lines was not important, since the requester had also tried sort. Judging by the reply, it seems sort worked fine for him.
 