Hi, a brief introduction to the soundex Python module (English sound comparison):
output:
Suppose I want to merge two files, called mergeleft.csv and mergeright.csv.
mergeleft.csv:
mergeright.csv:
These files are sorted first by zip, then by name.
I want to merge the two files by first comparing zip codes. If the zip codes match (for example, both are 10007), then compare each name in mergeleft with each name in mergeright within the same zip code using soundex.sound_similar().
For example,
in mergeleft.csv the zip code of "neu york yanke" is 10007,
therefore "neu york yanke" would be compared with each name under zip code 10007 in mergeright.csv, namely:
If the soundex.sound_similar() output is 1, then merge the two lines. If there is more than one possible match within the same zip code group, keep every match (so no data is lost) and set duplicate_flag to 1.
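The matching rule described above could be sketched roughly as follows. This is only an illustration under several assumptions: the column names `zip` and `name`, the output columns `left_name` and `right_name`, and all sample rows except "neu york yanke" are made up, and `sound_similar()` here is a hand-written stand-in based on classic American Soundex applied to the whole name. It is not the soundex module's own function, whose exact behaviour may differ.

```python
import csv
import io
from collections import defaultdict

# Classic American Soundex letter groups.
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(text):
    """Return the 4-character Soundex code of text (non-letters are dropped)."""
    letters = [c for c in text.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate two equal codes
        code = _CODES.get(c, "")  # vowels and Y have no code
        if code and code != prev:
            out.append(code)
        prev = code
    return "".join(out)[:4].ljust(4, "0")


def sound_similar(a, b):
    """Stand-in (assumption): 1 if both names share a Soundex code, else 0."""
    return 1 if soundex(a) == soundex(b) else 0


def merge_by_zip(left_rows, right_rows):
    """Pair left and right rows that share a zip code and sound alike."""
    by_zip = defaultdict(list)
    for row in right_rows:
        by_zip[row["zip"]].append(row)
    merged = []
    for left in left_rows:
        matches = [r for r in by_zip.get(left["zip"], [])
                   if sound_similar(left["name"], r["name"]) == 1]
        flag = 1 if len(matches) > 1 else 0
        if not matches:
            # Keep unmatched left rows so nothing is dropped from mergeleft.
            merged.append({"zip": left["zip"], "left_name": left["name"],
                           "right_name": "", "duplicate_flag": 0})
        for r in matches:
            merged.append({"zip": left["zip"], "left_name": left["name"],
                           "right_name": r["name"], "duplicate_flag": flag})
    return merged


# Hypothetical sample data; only "neu york yanke" / 10007 comes from the post.
LEFT_CSV = "zip,name\n10007,neu york yanke\n10012,bluejays\n"
RIGHT_CSV = ("zip,name\n10007,new york yankees\n10007,new york yankee\n"
             "10007,boston red sox\n10012,blue jays\n")

left_rows = list(csv.DictReader(io.StringIO(LEFT_CSV)))
right_rows = list(csv.DictReader(io.StringIO(RIGHT_CSV)))
merged = merge_by_zip(left_rows, right_rows)

for row in merged:
    print(row)
```

Grouping mergeright by zip in a dictionary means the files do not strictly need to be pre-sorted. Rows of mergeright that match nothing are not written out in this sketch; if they must be kept too, they would need to be appended separately.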
desired output.csv:
Last edited by zaxxon; 10-28-2009 at 08:27 AM..
Reason: use code tags
So I have a perl script that prompts the user to enter either q or Q to exit the program or c to continue said program. If the user inputs anything other than those three keys they will be prompted again and again for an appropriate input. My script works for the most part except for one small... (6 Replies)
Dear Gents,
Please, I need your help... I need a small script :) to do the following.
I have thousands of files in a folder, produced daily.
I first need to merge all the .txt files (0009.txt, 0010.txt, 0011.txt) and to output a summary of all the information in 2 separate files in csv... (14 Replies)
Hi,
My requirement is: there is a directory location like:
:camp/current/
In this location there can be different flat files generated in a single day, all with the same header but different data, differentiated by timestamp, so I need to verify how many files are generated... (10 Replies)
Hello Ya'all:
I hope Zaxxon is still around. I read a posting about compiling/updating the kernel from source. I'm doing a very specific upgrade, and am wondering if there is anything different or if there's an easy way to do this: I am using kernel version 2.6.18-92, and have done some... (1 Reply)
Hi all,
So I have a script that reads a file called FILEA.txt and in that file there are several columns. The ones that are most important are the $name $start and $stop. So currently the script takes values between the start and stop (inside) by using a program called fastamd. But what I... (4 Replies)
Hi,
I am facing a problem merging two files using awk.
The problem is as stated below:
file1:
A|B|C|D|E|F|G|H|I|1
M|N|O|P|Q|R|S|T|U|2
AA|BB|CC|DD|EE|FF|GG|HH|II|1
....
....
....
file2 :
1|Mn|op|qr (2 Replies)
Ok so I have a file which contains 2 columns/fields and I have another file with 2 columns. The files look like:
file1:
1 33
5 345
18 2
45 1
78 31
file2:
1 c1d2t0
2 c1d3t0
3 c1d4t0
4 c1d4t0
5 c2d1t0
6 c2d1t0
7 c2d1t0
8 c2d1t0
9 c2d1t0
10 c2d1t0 (11 Replies)
I am comparing two files which are identical except for the timestamp which is incorporated within the otherwise same 372 bytes. I am using the command:
cmp -s $Todays_file $Yesterdays_file -i 372
When I run the command without the -i 372 it shows the difference i.e. the timestamp.... (5 Replies)
I used %H%M for hours and minutes within a date variable, to latch the date/time onto the end of a file. The script it was in is now under SCCS control, and %H% is a predefined parameter for SCCS, so it gets replaced with a date that has a "/" character in it.
Is there a way to tell SCCS to ignore anything... (0 Replies)
Having a slight problem. Any clues would help. I can't seem to get any output when I run a simple echo script.
grex.cyberspace.org% chmod a+x test
grex.cyberspace.org% ls -l test
-rwxrwx--x 1 gordybh cohorts 20 Dec 13 20:22 test
grex.cyberspace.org% cat test
#!/bin/sh
echo test... (2 Replies)