Shell Programming and Scripting: Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here.
use python or awk to match names 'with error tolerance'
I am facing what I think is a very challenging problem, and I have no idea how to deal with it.

Suppose I have two csv files:

A.csv
Toyota Camry,1998,blue
Honda Civic,1999,blue

B.csv
Toyota Inc. Camry, 2000km
Honda Corp Civic,1500km

I want to generate C.csv:
Toyota Camry,1998,blue,2000km
Honda Civic,1999,blue,1500km

The worst part of the task is that the match needs error tolerance to deal with variations in the company name:
1. extra spaces
2. extra dots
3. phrases such as Inc, Corp.

Is this mission impossible?
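Code:
#!/usr/bin/perl
use strict;
use warnings;

my (%hash, @order);

# Index each a.csv line by its first field (the name).
open my $fa, '<', 'a.csv' or die "a.csv: $!";
while (<$fa>) {
    chomp;
    my ($name) = split /,/, $_;
    next unless defined $name;
    push @order, $name;
    $hash{$name} = $_;
}
close $fa;

# Normalize each b.csv name, then append the rest of the line to the matching a.csv line.
open my $fb, '<', 'b.csv' or die "b.csv: $!";
while (<$fb>) {
    chomp;
    my ($name, $rest) = split /,/, $_, 2;
    next unless defined $rest;
    $name =~ s/\b(?:Inc|Corp)\b\.?//gi;   # drop company suffixes
    $name =~ s/\.//g;                     # drop stray dots
    $name =~ s/\s+/ /g;                   # collapse repeated spaces
    $name =~ s/^ | $//g;                  # trim leading/trailing space
    $rest =~ s/^\s+//;
    $hash{$name} .= ",$rest" if exists $hash{$name};
}
close $fb;

# Print in a.csv order; "keys %hash" would return an arbitrary order.
print "$hash{$_}\n" for @order;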
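
Since the thread title asks about Python, here is a rough Python sketch of the same idea: reduce every name to a comparison key (lowercase, dots removed, company suffixes dropped, spaces collapsed) and join the two files on that key. The `SUFFIXES` list and the `normalize` and `merge` helpers are my own assumptions for illustration, not something from the posts above, and the sample data is passed in as strings rather than read from A.csv and B.csv.

```python
import csv
import io

# Assumed list of company suffixes to ignore; extend it for your data.
SUFFIXES = {"inc", "corp"}


def normalize(name):
    """Build a comparison key: lowercase, no dots, no suffixes, single spaces."""
    words = name.lower().replace(".", " ").split()
    return " ".join(w for w in words if w not in SUFFIXES)


def merge(a_text, b_text):
    """Append the extra fields of each B row to the A row with the same key."""
    extra = {}
    for row in csv.reader(io.StringIO(b_text)):
        if row:
            extra[normalize(row[0])] = [f.strip() for f in row[1:]]
    merged = []
    for row in csv.reader(io.StringIO(a_text)):
        if row:
            fields = [f.strip() for f in row]
            # Rows of A without a match in B are kept unchanged.
            merged.append(fields + extra.get(normalize(fields[0]), []))
    return merged


A = "Toyota Camry,1998,blue\nHonda Civic,1999,blue\n"
B = "Toyota  Inc. Camry, 2000km\nHonda Corp Civic,1500km\n"

for fields in merge(A, B):
    print(",".join(fields))
```

Keeping the A.csv spelling in the output and using the normalized key only for lookup means the tolerance rules never leak into the result file.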
Thanks a lot for the reply, but is it possible to use a manual translation table?

Suppose the files are now:

A.csv
Toyota Camry,1998,blue
Honda Civic,1999,blue
Acura Inf,2000,yellow

B.csv
Toyota Inc. Camry, 2000km
Honda Corp Civic,1500km
HondaUSA Inf, 2000, 2300km

I want to generate C.csv:
Toyota Camry,1998,blue,2000km
Honda Civic,1999,blue,1500km
HondaUSA Inf,2000,yellow,2300km

How can I set up a translation table that says, for example, that Acura translates to HondaUSA?
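
One possible way to handle this in Python is a small hand-maintained dictionary that is applied to the brand word of each A.csv name before the names are compared. This is only a sketch under several assumptions of mine: the `ALIASES` table, the helper names, the rule that the brand is the first word of the name, and the rule that the last field of a B.csv row is the mileage (needed because the HondaUSA row carries an extra field).

```python
# Hand-maintained translation table (assumed format):
# brand as written in A.csv -> brand as written in B.csv.
ALIASES = {"Acura": "HondaUSA"}

# Assumed list of company suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp"}


def translate(name, aliases=ALIASES):
    """Replace the leading brand word if the table lists it."""
    brand, _, model = name.strip().partition(" ")
    return (aliases.get(brand, brand) + " " + model).strip()


def key(name):
    """Comparison key: lowercase, no dots, no suffixes, single spaces."""
    words = name.lower().replace(".", " ").split()
    return " ".join(w for w in words if w not in SUFFIXES)


def merge(a_rows, b_rows):
    """Translate A names, then append the mileage from the matching B row."""
    # Assumption: the mileage is always the last field of a B row.
    mileage = {key(r[0]): r[-1].strip() for r in b_rows}
    out = []
    for r in a_rows:
        name = translate(r[0])
        row = [name] + [f.strip() for f in r[1:]]
        km = mileage.get(key(name))
        if km is not None:
            row.append(km)
        out.append(row)
    return out


A_ROWS = [
    ["Toyota Camry", "1998", "blue"],
    ["Honda Civic", "1999", "blue"],
    ["Acura Inf", "2000", "yellow"],
]
B_ROWS = [
    ["Toyota Inc. Camry", " 2000km"],
    ["Honda Corp Civic", "1500km"],
    ["HondaUSA Inf", " 2000", " 2300km"],
]

for row in merge(A_ROWS, B_ROWS):
    print(",".join(row))
```

The table itself still has to be written by hand. Listing the A.csv names that found no partner in B.csv after normalization is probably the quickest way to see which entries it needs.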