The UNIX and Linux Forums  

#1  04-27-2007
Identify duplicate words in a line using a command

Hi,
Let me explain the problem clearly:
Let the entries in my file be:
Code:
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
Can we detect the lines in which one of the words (separated by the field separator) occurs more than once, using a command (or command pipe)?
In this case, the command should detect lines 2, 3 and 5.

I accomplished it using a Perl script (shown below), but I wonder whether this could be done with a single command (the difficulty is that the number of columns is not constant).

Perl program that I used:
Code:
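use strict;
use warnings;

# Read the name of the input file from standard input.
my $fname = <STDIN>;
chomp $fname;
open(my $fh, '<', $fname) or die "Cannot open $fname: $!\n";
my $found_dups = 0;

while (my $line = <$fh>)
{
  chomp $line;
  my @arr = split(/,/, $line);
  # Compare every pair of fields, including the first one.
  for (my $i = 0; $i < $#arr; $i++)
  {
     for (my $j = $i + 1; $j <= $#arr; $j++)
     {
        if ($arr[$i] eq $arr[$j])
        {
           # Printed once per matching pair of fields.
           print "$line\n";
           $found_dups++;
        }
     }
  }
}
close($fh);
print "Found $found_dups duplicates\n";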
Thanks,
Srini

#2  04-27-2007
If you have Python, here's a neater alternative:
Code:
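#!/usr/bin/env python3
# Print each line that contains a repeated field, "No change" otherwise.
with open("file") as f:
    for line in f:
        fields = line.strip().split(",")
        # A set drops repeats, so equal lengths mean no duplicates.
        if len(fields) == len(set(fields)):
            print("No change")
        else:
            print(",".join(fields))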
output:
Code:
# ./test.py
No change
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
No change
orange,maroon,pink,violet,orange,pink

#3  04-27-2007  awk
Code:
awk -F, '{
  for (I=1;I<NF;I++)
  {
    for (J=I+1;J<=NF;J++)
    {
      if ($I == $J ) { print $I": " $0 }
    }
  }
}' << ENDOFFILE
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
ENDOFFILE
output:
Code:
apple: apple,mango,orange,apple,grape
windows: unix,windows,solaris,windows,linux
orange: orange,maroon,pink,violet,orange,pink
pink: orange,maroon,pink,violet,orange,pink
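
The nested loop above prints once per matching pair of fields, so a word that appears three times on a line would be reported three times. Here is a minimal Python sketch, assuming it is enough to report each repeated word once per line. The function name `repeated_words` and the `sample` list are illustrative and not from the thread:

```python
from collections import Counter


def repeated_words(line, sep=","):
    """Return the words that occur more than once in line, in first-seen order."""
    counts = Counter(line.split(sep))
    # Counter keeps insertion order, so words come out in the order first seen.
    return [word for word, n in counts.items() if n > 1]


sample = [
    "lion,tiger,bear",
    "apple,mango,orange,apple,grape",
    "unix,windows,solaris,windows,linux",
    "red,blue,green,yellow",
    "orange,maroon,pink,violet,orange,pink",
]

for line in sample:
    for word in repeated_words(line):
        print(f"{word}: {line}")
```

For the sample data this prints the same four lines as the awk run.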

#4  04-27-2007

Hi,
Thanks for the suggestions. I understand that the job can be done by different variations of scripts, but what I am after is "a single command/command pipe" that can do the job. If there were only a fixed number of entries in each line, I could compare them manually on the command line using awk/perl. But since I don't know the number of entries in each line, the task is cumbersome.
I would be grateful for a command-pipe version of these scripts.

Thanks
Srini
Reply With Quote
  #5 (permalink)  
Old 04-30-2007
kahuna's Avatar
Registered User
 

Join Date: Apr 2007
Posts: 147
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Reddit! Stumble this Post!Spurl this Post!
Srini, I'm not sure I understand your reluctance to use the scripts posted. Having said that, you could try the script below. It is not very efficient but is short.
Code:
perl -nle 'print if /(^|,)([^,]+)(,|,.*,)\2(,|$)/;' <file
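
For anyone trying to decipher the one-liner, here is a rough Python sketch of the same pattern, spread over several lines with a comment per group. It assumes Python's `re` treats this pattern the way Perl does. The names `DUP_FIELD`, `has_duplicate_field` and `sample` are illustrative and not from the thread:

```python
import re

# Same pattern as the perl one-liner, laid out with comments via re.VERBOSE.
DUP_FIELD = re.compile(r"""
    (^|,)        # group 1: a field starts at the line start or after a comma
    ([^,]+)      # group 2: the whole field (one or more non-comma characters)
    (,|,.*,)     # group 3: one comma, or a comma, other fields, and a comma
    \2           # backreference: the exact text captured by group 2
    (,|$)        # group 4: the repeat ends at a comma or at the end of the line
""", re.VERBOSE)


def has_duplicate_field(line):
    """Return True if some comma-separated field occurs more than once."""
    return DUP_FIELD.search(line) is not None


sample = [
    "lion,tiger,bear",
    "apple,mango,orange,apple,grape",
    "unix,windows,solaris,windows,linux",
    "red,blue,green,yellow",
    "orange,maroon,pink,violet,orange,pink",
]

for number, line in enumerate(sample, start=1):
    if has_duplicate_field(line):
        print(number, line)
```

Because group 2 must be followed by a comma and the repeat must end at a comma or the end of the line, only whole fields are compared. Empty fields are ignored, since `[^,]+` needs at least one character.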

#6  04-30-2007
Again with Perl, but much simpler:

Code:
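#! /opt/third-party/bin/perl

open(FILE, "<", "file") || die "Unable to open file <$!> \n";

# Test defined() so that a last line without a newline is not skipped.
while(defined($var=<FILE>)) {
  chomp($var);
  @arr = split(/,/, $var);
  %fileHash = ();   # start each line with an empty set of seen words
  foreach(@arr) {
    if( exists $fileHash{$_} ) {
      # Word already seen on this line: print the line once and stop.
      print $var . "\n";
      last;
    }
    else {
      $fileHash{$_} = 1;
    }
  }
}

close(FILE);

exit 0;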

#7  04-30-2007  Ygor
Try...
Code:
$ grep -En '(^|,)([^,]+)(,.*)?,\2($|,)' file
2:apple,mango,orange,apple,grape
3:unix,windows,solaris,windows,linux
5:orange,maroon,pink,violet,orange,pink
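
One detail worth checking in patterns of this kind is what sits between the captured field and the backreference. If it is a bare `.*`, group 2 can capture only the beginning of a field, so a line such as `pink,pin` is reported even though no field repeats. Here is a small Python sketch of the difference, assuming Python's `re` agrees with grep on whether these patterns match. The names `LOOSE` and `STRICT` and the test lines are illustrative:

```python
import re

# Bare ".*" after the captured field: group 2 may stop in the middle of a field.
LOOSE = re.compile(r"(^|,)([^,]+).*,\2($|,)")
# A comma must follow the captured field, so group 2 is always a whole field.
STRICT = re.compile(r"(^|,)([^,]+)(,.*)?,\2($|,)")

for line in ["pink,pin", "pin,pink", "orange,maroon,pink,violet,orange,pink"]:
    # Columns: the line, whether LOOSE matches, whether STRICT matches.
    print(line, bool(LOOSE.search(line)), bool(STRICT.search(line)))
```

Only the first test line separates the two patterns: `LOOSE` matches `pink,pin` and `STRICT` does not.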

#8  04-30-2007

Hi,
I think I am mistaken.
Quote:
Originally Posted by kahuna
Srini, I'm not sure I understand your reluctance to use the scripts posted.
I did have such a requirement when I posted this question, but it was curiosity that drove me to ask you all about a "command line" version of what my Perl program does. I am not dismissing the scripts posted; I only wanted to expand my knowledge of one-liners (I have a fascination with them).

I'm sorry if my request came across as arrogant.

Thanks,
Srini

#9  04-30-2007
Thanks

And the one-liners (perl/grep) you gave work great (though I'm still trying to decipher the logic behind them).
Thanks for your help.

Thanks
Srini