Full Discussion: Diffing words - percentages
Post 302751995 by jim mcnamara on Saturday 5th of January 2013 08:18:11 AM
You want similarity algorithms.

Here is a good article explaining one approach (it talks about Java):
How to Strike a Match
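
As a rough illustration of the letter-pair idea that article describes, here is a minimal Python sketch. It assumes the usual formulation: split each string into adjacent letter pairs per word, then score 2 * shared pairs / total pairs. The function names `letter_pairs` and `pair_similarity` are made up for this example, and the handling of empty input is my own assumption, not something the article specifies.

```python
def letter_pairs(text):
    """Return the adjacent letter pairs of each whitespace-separated word."""
    pairs = []
    for word in text.upper().split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def pair_similarity(a, b):
    """Score in [0, 1]: 2 * shared pairs / total number of pairs."""
    pairs_a = letter_pairs(a)
    pairs_b = letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 0.0  # assumption: nothing to compare counts as no similarity
    shared = 0
    for pair in pairs_a:
        if pair in pairs_b:
            shared += 1
            pairs_b.remove(pair)  # each pair in b may be matched only once
    return 2.0 * shared / total


if __name__ == "__main__":
    # FR RA AN NC CE vs. FR RE EN NC CH share two pairs: 2 * 2 / 10
    print(round(100 * pair_similarity("France", "French")))  # prints 40
```

Multiplying the score by 100 gives the percentage. Note that single-letter words contribute no pairs at all, which is one of the ways a measure like this can surprise you.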

Levenshtein distance may be the most likely candidate for you:
Levenshtein distance - Wikipedia, the free encyclopedia
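
To make that concrete, here is a small Python sketch of the standard dynamic-programming Levenshtein distance, plus one possible way to turn it into a percentage. The normalisation (dividing by the length of the longer word) is an assumption on my part; other normalisations are just as defensible and will give different numbers. The name `percent_similar` is hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def percent_similar(a, b):
    """Assumed normalisation: 100 * (1 - distance / length of the longer word)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty words are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))  # prints 3
    print(round(percent_similar("kitten", "sitting"), 1))
```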

Here is the Perl module WordNet::Similarity:
WordNet::Similarity - search.cpan.org

You have to download this module and part of the parent module, too. It gives examples. You will have to work out your percentage calculation using results from a module like this one, or roll your own (see the first article above). I would recommend doing some reading (above) before messing with this. Similarity algorithms can do interesting and sometimes confusing things, IMO.

Similarity(3)						User Contributed Perl Documentation					     Similarity(3)

NAME
       String::Similarity - calculate the similarity of two strings

SYNOPSIS
        use String::Similarity;

        $similarity = similarity $string1, $string2;
        $similarity = similarity $string1, $string2, $limit;

DESCRIPTION
       $factor = similarity $string1, $string2, [$limit]
           The "similarity"-function calculates the similarity index of its
           two arguments. A value of 0 means that the strings are entirely
           different. A value of 1 means that the strings are identical.
           Everything else lies between 0 and 1 and describes the amount of
           similarity between the strings.

           It roughly works by looking at the smallest number of edits to
           change one string into the other.

           You can add an optional argument $limit (default 0) that gives
           the minimum similarity the two strings must satisfy. "similarity"
           stops analyzing the string as soon as the result drops below the
           given limit, in which case the result will be invalid but lower
           than the given $limit. You can use this to speed up the common
           case of searching for the most similar string from a set by
           specifying the maximum similarity found so far.

SEE ALSO
       The basic algorithm is described in: "An O(ND) Difference Algorithm
       and its Variations", Eugene Myers, Algorithmica Vol. 1 No. 2, 1986,
       pp. 251-266; see especially section 4.2, which describes the
       variation used below.

       The basic algorithm was independently discovered as described in:
       "Algorithms for Approximate String Matching", E. Ukkonen, Information
       and Control Vol. 64, 1985, pp. 100-118.

AUTHOR
       Marc Lehmann <schmorp@schmorp.de>
       http://home.schmorp.de/

       (the underlying fstrcmp function was taken from gnu diffutils and
       modified by Peter Miller <pmiller@agso.gov.au> and Marc Lehmann
       <schmorp@schmorp.de>).

perl v5.16.3                      2008-11-04                     Similarity(3)
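
The "most similar string from a set" pattern that the $limit argument is meant for can be sketched in Python as follows. This is only an analogy: it uses the standard library's `difflib.SequenceMatcher`, not String::Similarity, so the scores will not match the Perl module's. Since `difflib` has no limit argument, the sketch uses `real_quick_ratio()` and `quick_ratio()`, which are documented as cheap upper bounds on `ratio()`, to skip candidates that cannot beat the best score found so far. The function name `most_similar` is hypothetical.

```python
from difflib import SequenceMatcher


def most_similar(target, candidates):
    """Return (best_candidate, best_score); best_candidate is None if nothing matches."""
    best, best_score = None, 0.0
    matcher = SequenceMatcher()
    matcher.set_seq2(target)  # seq2 is cached, so set the fixed string there
    for candidate in candidates:
        matcher.set_seq1(candidate)
        # Both calls return upper bounds on ratio(); if even the bound
        # cannot beat the best score so far, skip the expensive ratio().
        if (matcher.real_quick_ratio() <= best_score
                or matcher.quick_ratio() <= best_score):
            continue
        score = matcher.ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


if __name__ == "__main__":
    word, score = most_similar("apple", ["ape", "apply", "orange"])
    print(word, round(100 * score))  # prints: apply 80
```

The further down the list you get, the higher the best score tends to be and the more candidates are rejected early, which is the same speed-up the man page describes for passing the maximum similarity found so far as $limit.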