Hi, here is a brief introduction to the soundex Python module (English sound comparison):
Code:
import soundex
a = "neu yorkk"
b = "new york city"
print(soundex.sound_similar(a, b))
Output:
Code:
1
Suppose I want to merge two files, mergeleft.csv and mergeright.csv.
mergeleft.csv:
Code:
gamewin,zip,name
90,10007,neu york yanke
20,10007,new york met
44,10007,manhatten policemens
24,10007,manhatten policemen
64,20005,dc metros
34,20005,dc eagles
mergeright.csv:
Code:
color,zip,name
blue,10007,new york yankee
yellow,10007,new yorkk mets
red,10007,manhattan's policeman
white,10007,manhattan's policeman
red,20003,philly dog
blue,20005,dc metro
green,20005,dc eagle
Both files are sorted first by zip, then by name.
I want to merge the two files by comparing zip codes first. When the zip codes match (for example, 10007), each name in mergeleft.csv should be compared with each name in mergeright.csv within the same zip code, using soundex.sound_similar().
For example,
in mergeleft.csv the zip code of "neu york yanke" is 10007,
so "neu york yanke" would be compared with each name under zip code 10007 in mergeright.csv, namely:
Code:
blue,10007,new york yankee
yellow,10007,new yorkk mets
red,10007,manhattan's policeman
white,10007,manhattan's policeman
If soundex.sound_similar() returns 1, merge the two lines. If there is more than one possible match within the same zip code group, keep every match and set duplicate_flag to 1, so that no data is lost.
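To make the idea concrete, here is a rough sketch of the merge I have in mind. It is only an outline under some assumptions: the function names (similar, load_csv, fuzzy_merge) and the output column names are made up for illustration, and similar() is a difflib-based stand-in, not a phonetic comparison, so that the sketch runs without the soundex module. In practice soundex.sound_similar would be passed in as the match argument. Left rows with no match are simply dropped here, which may or may not be what is wanted.

```python
import csv
from collections import defaultdict
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Stand-in for soundex.sound_similar(): returns 1 or 0.

    Assumption: this uses difflib's similarity ratio, which is a plain
    string comparison and NOT a phonetic (Soundex) comparison.
    """
    return 1 if SequenceMatcher(None, a, b).ratio() >= threshold else 0


def load_csv(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def fuzzy_merge(left_rows, right_rows, match=similar):
    """Merge rows that share a zip and whose names match.

    Every matching pair is kept. duplicate_flag is 1 when a left row
    matched more than one right row in the same zip group, else 0.
    """
    # Group the right-hand rows by zip so each left row is only
    # compared against names in its own zip code.
    by_zip = defaultdict(list)
    for row in right_rows:
        by_zip[row["zip"]].append(row)

    merged = []
    for left in left_rows:
        candidates = by_zip.get(left["zip"], [])
        hits = [r for r in candidates if match(left["name"], r["name"]) == 1]
        flag = 1 if len(hits) > 1 else 0
        for right in hits:
            merged.append({
                "gamewin": left["gamewin"],
                "zip": left["zip"],
                "name_left": left["name"],
                "name_right": right["name"],
                "color": right["color"],
                "duplicate_flag": flag,
            })
    return merged
```

With the real module this would be called roughly as fuzzy_merge(load_csv("mergeleft.csv"), load_csv("mergeright.csv"), match=soundex.sound_similar), and the result could be written back out with csv.DictWriter.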