Identify duplicate words in a line using command


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers Identify duplicate words in a line using command
Prev   Next
# 1  
Old 04-27-2007
Error Identify duplicate words in a line using command

Hi,
Let me explain the problem clearly:
Let the entries in my file be:
Code:
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink

Can we detect the lines in which one of the words(separated by field separator) occurs more than once, using a command (or command pipe)?
In this case, the command should detect the lines 2,3,5.

I accomplished it using a perl script (cited below), although i wonder whether this could be done through a command (the difficulty is that the no. of columns is not constant).

Perl program that I used:
Code:
$fname=<STDIN>;
chomp $fname;
open(file,"<$fname");
$found_dups=0;

for $line(<file>)
{
  chomp $line;
  @arr=split(/,/,$line);
  for($i=1;$i<=$#arr;$i++)
  {
     for($j=$i+1;$j<=$#arr;$j++)
     {
        if($arr[$i] eq $arr[$j])
        {
           print "tid $arr[0]\n";
           $found_dups++;
        }
     }
  }
}
print "Found $found_dups duplicates\n";

Thanks,
Srini
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Regex to identify unique words in a dictionary database

Hello, I have a dictionary which I am building for the Open Source Community. The data structure is as under HEADWORD=PARTOFSPEECH=ENGLISH MEANING as shown in the example below अ=m=Prefix signifying negation. अँहँ=ind=Interjection expressing disapprobation. अं=int=An interjection... (2 Replies)
Discussion started by: gimley
2 Replies

2. Shell Programming and Scripting

Command line: add text wrapper around words

I am trying to build a sinkhole for BIND. I created a master zone file for malicious domains and created a separate conf file, but I am stuck. I have a list of known bd domains that is updated nightly. The file simply contains the list of domains, one on each line: Bad.com Bad2.com... (4 Replies)
Discussion started by: uuallan
4 Replies

3. Shell Programming and Scripting

Scripting help to identify words count in lines

Hi everybody, i have this biological situation to fix: > Id.1 ACGTACANNNNNNNNNNNACGTGCNNNNNNNACTGTGGT >Id.2 ACGGGT >Id.3 ACGTNNNNNNNNNNNNACTGGGGG >Id.4 ACGTGCGNNNNNNNNGGTCANNNNNNNNCGTGCAAANNNNN ........ .... These are nucleotidic sequences with some "NNNN..." always of the same... (4 Replies)
Discussion started by: Giorgio C
4 Replies

4. UNIX for Dummies Questions & Answers

help to identify duplicate columns adjacent value

Hi friends, I have a xlsheet like below first column having id ABCfollowed by 7digit numbers and the next column have title against the ids. Titles are unique and duplicateboth, but ids are unique even for duplicate title.Now I need to identify those duplicate title having the highest id for... (9 Replies)
Discussion started by: umapearl
9 Replies

5. Shell Programming and Scripting

how to identify duplicate columns in a row

Hi, How to identify duplicate columns in a row? Input data: may have 30 columns 9211480750 LK 120070417 920091030 9211480893 AZ 120070607 9205323621 O7 120090914 120090914 1420090914 2020090914 2020090914 9211479568 AZ 120070327 320090730 9211479571 MM 120070326 9211480892 MM 120070324... (3 Replies)
Discussion started by: suresh3566
3 Replies

6. Shell Programming and Scripting

How to set mutliple words variable from command line

I'm writing a script (C shell) to search for a pattern in file. For example scriptname pattern file1 file2 filenN I use for loop to loop through arguments argv, and it does the job if all arguments are supplied. However if only one argument is supplied (in that case pattern ) it should ask to... (5 Replies)
Discussion started by: patryk44
5 Replies

7. Shell Programming and Scripting

alias two words command line

Hello, i would like to alias aptitude install for sudo aptitude install, is it possible, and how ? i read the man alias page, but i think i have to use something with \ or { but i don't know exactly what. (3 Replies)
Discussion started by: harlock59
3 Replies

8. Shell Programming and Scripting

remove duplicate words in a line

Hi, Please help! I have a file having duplicate words in some line and I want to remove the duplicate words. The order of the words in the output file doesn't matter. INPUT_FILE pink_kite red_pen ball pink_kite ball yellow_flower white no white no cloud nine_pen pink cloud pink nine_pen... (6 Replies)
Discussion started by: sam_2921
6 Replies

9. UNIX for Dummies Questions & Answers

how to extend words on a command line ?

within a unix window, how do you setup your session to extend a word, by hitting the "esc" key twice. e.g. ls -la scri (esc key, esc key) thankyou (6 Replies)
Discussion started by: venhart
6 Replies

10. UNIX for Dummies Questions & Answers

overlapping words on command line

i tried resize command , but it's not working...... (4 Replies)
Discussion started by: gaurav123
4 Replies
Login or Register to Ask a Question