UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
Hi,
Let me explain the problem clearly. Let the entries in my file be:
Code:
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink

In this case, the command should detect lines 2, 3 and 5 (the ones containing a repeated word). I accomplished it using a Perl script (cited below), although I wonder whether this could be done through a command (the difficulty is that the number of columns is not constant).
Perl program that I used:
Code:
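$fname = <STDIN>;                 # read the input file name from standard input
chomp $fname;
open(file, "<$fname") or die "Cannot open $fname: $!\n";
$found_dups = 0;
for $line (<file>)
{
    chomp $line;
    @arr = split(/,/, $line);
    # The scan starts at index 1, so the first field is treated as an ID
    # ("tid") and is never compared. For the sample data above, start at 0.
    for ($i = 1; $i <= $#arr; $i++)
    {
        for ($j = $i + 1; $j <= $#arr; $j++)
        {
            if ($arr[$i] eq $arr[$j])
            {
                print "tid $arr[0]\n";
                $found_dups++;
            }
        }
    }
}
close(file);
print "Found $found_dups duplicates\n";

Srini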
If you have Python, here's a neater alternative:
Code:
#!/usr/bin/python
for line in open("file"):
    line = line.strip().split(",")
    # A set drops repeated words, so equal lengths mean no duplicates.
    if len(line) == len(set(line)):
        print("No change")
    else:
        print(','.join(line))
Code:
# ./test.py
No change
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
No change
orange,maroon,pink,violet,orange,pink
awk -F, '{
    for (I=1;I<NF;I++) {
        for (J=I+1;J<=NF;J++) {
            if ($I == $J ) { print $I": " $0 }
        }
    }
}' << ENDOFFILE
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
ENDOFFILE
apple: apple,mango,orange,apple,grape
windows: unix,windows,solaris,windows,linux
orange: orange,maroon,pink,violet,orange,pink
pink: orange,maroon,pink,violet,orange,pink
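The nested loops in the awk command above print a line once for every duplicated pair, which is why the last sample line shows up twice. If each offending line should be reported only once, one possible variation is to count the fields of a line in an array and stop at the first repeat. This is only a sketch: the scratch file, the `seen` array and the `dups` variable are names chosen for illustration, not part of the original answer.

```shell
# Sample data from the thread, written to a scratch file (an assumption;
# any comma-separated file would do).
data=$(mktemp)
cat > "$data" << 'EOF'
lion,tiger,bear
apple,mango,orange,apple,grape
unix,windows,solaris,windows,linux
red,blue,green,yellow
orange,maroon,pink,violet,orange,pink
EOF

# For each line, count every field in seen[]. On the first field that was
# already counted, print the line number and the line, then skip to the
# next input line so the line is reported only once.
dups=$(awk -F, '{
  split("", seen)                 # portable way to empty the array
  for (i = 1; i <= NF; i++)
    if (seen[$i]++) { print NR ": " $0; next }
}' "$data")

printf '%s\n' "$dups"
rm -f "$data"
```

Because the array is keyed by the field value, the number of columns per line does not matter, and the whole thing still runs as a single awk command.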
Hi,
Thanks for the suggestions. I understand that the job can be done by different variations of scripts, but what I am looking for is a single command or command pipe that can do the job. If there were only a specific number of entries in each line, I could compare them manually on the command line using awk/perl. But since I don't know the number of entries in each line, the task is cumbersome. I would be grateful for a command-pipe version of these scripts.

Thanks,
Srini
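Again with Perl, but much simpler:
Code:
#! /opt/third-party/bin/perl
open(FILE, "<", "file") || die "Unable to open file <$!> \n";
# Read line by line; chomp inside the loop so a final line without a
# trailing newline is still processed.
while (defined($var = <FILE>)) {
    chomp($var);
    @arr = split(/,/, $var);
    foreach (@arr) {
        if (exists $fileHash{$_}) {
            # Word already seen on this line: print the line once and stop.
            print $var . "\n";
            last;
        }
        else {
            $fileHash{$_} = $i++;
        }
    }
    %fileHash = ();    # reset the per-line hash
}
close(FILE);
exit 0;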
Hi,
I think I am mistaken. Quote:
I'm sorry if my deed was arrogant though.

Thanks,
Srini