01-05-2013
You want a string similarity algorithm.
Here is a good article explaining one approach (its examples are in Java):
How to Strike a Match
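As a rough sketch of the idea in that article, written in Python rather than the article's Java: split each string into adjacent letter pairs, count the pairs the two strings share, and divide twice that count by the total number of pairs. The function names `letter_pairs` and `pair_similarity` are made up for this illustration, and returning 0 when neither string has any pairs is my own assumption.

```python
def letter_pairs(text):
    """Return the adjacent character pairs of each whitespace-separated word."""
    pairs = []
    for word in text.upper().split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def pair_similarity(a, b):
    """Score in the range 0..1: 2 * shared pairs / total number of pairs."""
    pairs_a = letter_pairs(a)
    pairs_b = letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 0.0  # assumption: nothing to compare counts as no similarity
    shared = 0
    for pair in pairs_a:
        if pair in pairs_b:
            pairs_b.remove(pair)  # a pair in b may be matched only once
            shared += 1
    return 2.0 * shared / total


print(round(pair_similarity("France", "French") * 100))  # prints 40
```

"France" and "French" share the pairs FR and NC out of ten pairs in total, which gives 40 percent.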
Levenshtein distance may be the most likely candidate for you:
Levenshtein distance - Wikipedia, the free encyclopedia
Here is the Perl module WordNet::Similarity:
WordNet::Similarity - search.cpan.org
You have to download this module and part of its parent module, too. It comes with examples. You will have to work out your percentage calculation from the results of a module like this one, or roll your own (see the first article above). I would recommend doing some reading (above) before messing with this. Similarity algorithms can do interesting and sometimes confusing things. IMO.
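If you go the roll-your-own route with Levenshtein distance, a minimal Python sketch could look like the following. Dividing the distance by the length of the longer string is only one possible way to get a percentage, and it is my assumption here, not something the modules above prescribe. The name `percent_similar` is invented for the example.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def percent_similar(a, b):
    """Turn the edit distance into a 0..100 score (one possible normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


print(levenshtein("kitten", "sitting"))                # prints 3
print(round(percent_similar("kitten", "sitting"), 1))  # prints 57.1
```

Other normalizations (for example dividing by the sum of both lengths) give different percentages for the same pair, so pick one and stick with it.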
This User Gave Thanks to jim mcnamara For This Post:
LEARN ABOUT CENTOS
string::similarity
Similarity(3) User Contributed Perl Documentation Similarity(3)
NAME
String::Similarity - calculate the similarity of two strings
SYNOPSIS
use String::Similarity;
$similarity = similarity $string1, $string2;
$similarity = similarity $string1, $string2, $limit;
DESCRIPTION
$factor = similarity $string1, $string2, [$limit]
The "similarity"-function calculates the similarity index of its two arguments. A value of 0 means that the strings are entirely
different. A value of 1 means that the strings are identical. Everything else lies between 0 and 1 and describes the amount of
similarity between the strings.
It roughly works by looking at the smallest number of edits to change one string into the other.
You can add an optional argument $limit (default 0) that gives the minimum similarity the two strings must satisfy. "similarity" stops
analyzing the string as soon as the result drops below the given limit, in which case the result will be invalid but lower than the
given $limit. You can use this to speed up the common case of searching for the most similar string from a set by specifying the maximum
similarity found so far.
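
The search pattern described above (pass the best score found so far as the limit) can be sketched in Python. This is not String::Similarity itself: Python's standard difflib.SequenceMatcher is swapped in as the scoring function, its algorithm differs from the one used by this module, and the wrapper names `similarity` and `most_similar` are invented for the illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b, limit=0.0):
    """Score in 0..1; gives up early when the score cannot reach `limit`."""
    matcher = SequenceMatcher(None, a, b)
    # real_quick_ratio() and quick_ratio() are cheap upper bounds on ratio().
    if matcher.real_quick_ratio() < limit or matcher.quick_ratio() < limit:
        return 0.0  # not the true score, only a value below the limit
    return matcher.ratio()


def most_similar(target, candidates):
    """Return (best candidate, its score), pruning with the best score so far."""
    best, best_score = None, 0.0
    for candidate in candidates:
        score = similarity(target, candidate, best_score)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


print(most_similar("apple", ["orange", "apple", "maple"]))  # prints ('apple', 1.0)
```

Once "apple" has scored 1.0, "maple" is rejected by the cheap bound alone and the full comparison is skipped.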
SEE ALSO
The basic algorithm is described in:
"An O(ND) Difference Algorithm and its Variations", Eugene Myers,
Algorithmica Vol. 1 No. 2, 1986, pp. 251-266;
see especially section 4.2, which describes the variation used below.
The basic algorithm was independently discovered as described in:
"Algorithms for Approximate String Matching", E. Ukkonen,
Information and Control Vol. 64, 1985, pp. 100-118.
AUTHOR
Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/
(the underlying fstrcmp function was taken from gnu diffutils and
modified by Peter Miller <pmiller@agso.gov.au> and Marc Lehmann
<schmorp@schmorp.de>).
perl v5.16.3 2008-11-04 Similarity(3)