Hi,
Let me explain the problem clearly.
Suppose my file contains these entries:
Code:
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
Can we detect the lines in which one of the words (separated by the field separator, here a comma) occurs more than once, using a single command (or a command pipeline)?
In this case, the command should detect lines 2, 3 and 5.
I accomplished it using a Perl script (shown below), but I wonder whether this could be done with a command instead. The difficulty is that the number of columns is not constant.
Perl program that I used:
Code:
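# Read the name of the input file from standard input.
my $fname = <STDIN>;
chomp $fname;
open(my $fh, '<', $fname) or die "Cannot open $fname: $!";
my $found_dups = 0;
while (my $line = <$fh>) {
    chomp $line;
    my @arr = split(/,/, $line);
    # Compare every pair of fields, starting from the first field (index 0).
    for (my $i = 0; $i <= $#arr; $i++) {
        for (my $j = $i + 1; $j <= $#arr; $j++) {
            if ($arr[$i] eq $arr[$j]) {
                # $. holds the current line number of the file being read.
                print "line $.: $arr[$i]\n";
                $found_dups++;
            }
        }
    }
}
close($fh);
print "Found $found_dups duplicates\n";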
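For comparison, the kind of command I have in mind might look roughly like the sketch below. It assumes a POSIX awk is available and that fields are plain comma-separated words with no quoting; find_dup_lines is only a hypothetical name for the wrapper function. Because awk sets NF for each line, the varying number of columns should not matter.

```shell
# Sketch: print the line number and content of every line in which
# some comma-separated field occurs more than once.
# find_dup_lines is a hypothetical helper name; $1 is the input file.
find_dup_lines() {
    awk -F, '{
        split("", seen)              # clear the per-line lookup table
        for (i = 1; i <= NF; i++) {
            if ($i in seen) {        # this field already appeared on the line
                print NR ": " $0
                next                 # report each line only once
            }
            seen[$i] = 1
        }
    }' "$1"
}
```

It would be called as find_dup_lines myfile. Note that a line with two or more empty fields (for example a,,b,) would also be reported, since the empty string counts as a repeated value.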
Thanks,
Srini