The UNIX and Linux Forums  
Hello and Welcome from United States to the UNIX and Linux Forums! Thank You for Visiting and Joining Our Global Community.

Go Back   The UNIX and Linux Forums > Top Forums > UNIX for Dummies Questions & Answers
.
google unix.com




View Single Post in the UNIX and Linux Forums - Click on the Thread or Permalink to View Entire Thread -->
  #1 (permalink)  
Old 04-27-2007
srinivasan_85 srinivasan_85 is offline
Registered User
  
 

Join Date: Jan 2007
Posts: 28
Exclamation Identify duplicate words in a line using command

Hi,
Let me explain the problem clearly:
Let the entries in my file be:
Code:
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
Can we detect the lines in which one of the words(separated by field separator) occurs more than once, using a command (or command pipe)?
In this case, the command should detect the lines 2,3,5.

I accomplished it using a perl script (cited below), although i wonder whether this could be done through a command (the difficulty is that the no. of columns is not constant).

Perl program that I used:
Code:
$fname=<STDIN>;
chomp $fname;
open(file,"<$fname");
$found_dups=0;

for $line(<file>)
{
  chomp $line;
  @arr=split(/,/,$line);
  for($i=1;$i<=$#arr;$i++)
  {
     for($j=$i+1;$j<=$#arr;$j++)
     {
        if($arr[$i] eq $arr[$j])
        {
           print "tid $arr[0]\n";
           $found_dups++;
        }
     }
  }
}
print "Found $found_dups duplicates\n";
Thanks,
Srini