deleting dupes in a row


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting deleting dupes in a row
# 1  
Old 09-09-2012
deleting dupes in a row

Hello,
I have a large database in which name homonyms are arranged in a row. Since the database is large and generated by hand, very often dupes creep in. I want to remove the dupes either using an awk or perl script.
An input is given below
Quote:
mohammed=mohd=muhammed=mohammed=mohd=muhammed=md=muhmd
mahendra=mahendera=mahndra=mahendra=mahendera
The expected output is given below:
Quote:
md=mohammed=mohd=muhammed=muhmd
mahendera=mahendra=mahndra
As can be seen all the dupes are cleaned out.
At present I am using a macro which converts row to line, sorts and deleted dupes and restores the row structure. Since the database is huge, the macro takes a very long time.
Many thanks in advance for a speedy solution
# 2  
Old 09-09-2012
Code:
$ awk -F= '{for(i=1;i<=NF;i++){a[$i]++;}for(i in a){b=b"="i}{sub("=","",b);$0=b;b="";delete a}}1' OFS=\= input.txt
md=mohammed=mohd=muhammed=muhmd
mahndra=mahendra=mahendera

---------- Post updated at 10:09 AM ---------- Previous update was at 10:04 AM ----------

Code:
$ perl -F= -lane 'print join "=",keys %{{ map { $_ => 1 } @F}};' input.txt
muhammed=muhmd=md=mohammed=mohd
mahendra=mahendera=mahndra

This User Gave Thanks to itkamaraj For This Post:
# 3  
Old 09-09-2012
Many thanks for both the solutions. I will test them out and get back on the forum

---------- Post updated at 01:53 AM ---------- Previous update was at 01:47 AM ----------

Many thanks. Both solutions work perfectly.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

In php, Moving a new row to another table and deleting old row

Hi, I already succeed moving a new row to another table if the field from new row doesn't have the first word that I categorized (like: IRC blablabla, PTM blablabla, ADM blablabla, BS blablabla). But it can't delete the old row. Please help me with the script. my php script: INSERT INTO... (2 Replies)
Discussion started by: jazzyzha
2 Replies

2. Shell Programming and Scripting

Moving new row and deleting old row to another table

Hi, I want to move a new row to another table if the field from new row doesn't have the first word that I categorized (like: IRC blablabla, PTM blablabla, ADM blablabla, BS blablabla). I already use this script but doesn't work as I expected. CHECK_KEYWORD="$( mysql -uroot -p123456 smsd -N... (7 Replies)
Discussion started by: jazzyzha
7 Replies

3. Shell Programming and Scripting

Deleting row if all column values are a particular string

Hello, I have a very large file for which I would like to remove all rows for which the value of columns 2-5 is zero. For instance I would like this file: contig1, 0, 0, 0, 0 contig2, 1, 3, 5, 0 contig3, 0, 0, 0, 0 contig4, 0, 5, 6, 7 To become this file: contig2, 1, 3, 5,0 ... (17 Replies)
Discussion started by: mouchkam
17 Replies

4. Shell Programming and Scripting

Deleting a row based on fetched value of column

Hi, I have a file which consists of two columns but the first one can be varying in length like 123456789 0abcd 123456789 0abcd 4015 0 0abcd 5000 0abcd I want to go through the file reading each line, count the number of characters in the first column and delete... (2 Replies)
Discussion started by: swasid
2 Replies

5. Shell Programming and Scripting

Script for identifying and deleting dupes in a line

I am compiling a synonym dictionary which has the following structure Headword=Synonym1,Synonym2 and so on, with each synonym separated by a comma. As is usual in such cases manual preparation of synonyms results in repeating the synonym which results in dupes as in the example below:... (3 Replies)
Discussion started by: gimley
3 Replies

6. Shell Programming and Scripting

Insert row without deleting previous data using sed

Hello, I want to add a new row to a file to insert data without deleting the previous data there. Example: file a b c d Output a b newtext c (6 Replies)
Discussion started by: joseamck
6 Replies

7. UNIX for Dummies Questions & Answers

deleting a row if a certain column is below a certain number

How can you delete a row if a certain column is bigger than a certain number? I have the following input: 20080709 20081222 95750 1 0 0.02 94.88 20080709 20081222 95750 2 0 0.89 94.88 20080709 20081222 9575 1 0 0 94.88 20080709 20081222 9575 2 0 0 94.88 20080709 20081222 9587.5 1 0 0... (6 Replies)
Discussion started by: Pep Puigvert
6 Replies

8. UNIX for Dummies Questions & Answers

deleting a row if a certain column is below a certain number

How can you delete a row if a certain column is bigger than a certain number? I have the following input: 20080709 20081222 95750 1 0 0.02 94.88 20080709 20081222 95750 2 0 0.89 94.88 20080709 20081222 9575 1 0 0 94.88 20080709 20081222 9575 2 0 0 94.88 20080709 20081222 9587.5 1 0 0... (1 Reply)
Discussion started by: Pep Puigvert
1 Replies

9. Shell Programming and Scripting

deleting blank line and row containing certain words in single sed command

Hi Is it possible to do the following in a single command /usr/xpg4/bin/sed -e '/rows selected/d' /aemu/CALLAUTO/callauto.txt > /aemu/CALLAUTO/callautonew.txt /usr/xpg4/bin/sed -e '/^$/d' /aemu/CALLAUTO/callautonew.txt > /aemu/CALLAUTO/callauto_new.txt exit (1 Reply)
Discussion started by: aemunathan
1 Replies

10. Shell Programming and Scripting

Deleting all occurences of a duplicate row

Hi, I need to delete all occurences of the repeated lines from a file and retain only the lines that is not repeated elsewhere in the file. As seen below the first two lines are same except that for the string "From BaseLine" and "From SMS".I shouldn't consider the string "From SMS" and "From... (7 Replies)
Discussion started by: ragavhere
7 Replies
Login or Register to Ask a Question