Best match query


 
Thread Tools Search this Thread
Top Forums Programming Best match query
# 1  
Old 09-10-2016
Best match query

Hi Forum,

I have the following table, around 60k records .. table name is "destinations"

Code:
BG,United Kingdom Mobile,004474572,0.20441,O 
BG,United Kingdom Mobile,004479,0.20441,O 
BG,United Kingdom Mobile,004479932,0.2222,O 
BG,United Kingdom Mobile,00447993,0.20,O 
BG,United Kingdom Mobile,004477,0.20441,O 
BG,United Kingdom Mobile,00447620,0.20441,O 
BG,United Kingdom Mobile,0044758,0.20441,O 
BG,United Kingdom Mobile,004475897,0.20441,O 
. 
. 
.


select * from destinations where prefix like '0044799326542%'
will not get any results since 0044799326542 does not exist,

what Im trying to achieve is,

if 0044799326542 is not found, try 004479932654,
select * from destinations where prefix like '004479932654%' -- no match

if no match, try 00447993265
select * from destinations where prefix like '00447993265%' -- no match


if no match, try 0044799326
select * from destinations where prefix like '0044799326%' -- no match


if no match, try 004479932
select * from destinations where prefix like '004479932%' -- match
BG,United Kingdom Mobile,004479932,0.2222,O

how to achieve this without having to go through all iterations ?

Thanks

Last edited by Scrutinizer; 09-10-2016 at 04:17 PM.. Reason: quote tags -> code tags
# 2  
Old 09-10-2016
Hi,

maybe you create a function which figures out the count of equal numbers from the left. Let it for example be called: "get_equal_numbers(col_number,my_number)" See here fore documentation: CREATE FUNCTION examples

Then you may create a query like this:

Code:
SELECT col_number,rate,get_equal_numbers(col_number,my_number) AS equals 
FROM destinations
WHERE equals >= 1 
ORDER BY equals DESC

I'm not sure if this performes well.

The choice I normally would take is too read the whole table into an array and use a custom search function, which does the same as above.

Both solutions are not that efficient - especially when you have a lot of lookups to accomplish.

For speed and improving search effiency, I probably would build a hash like this(example in ruby-code):

Code:
#!/usr/bin/ruby

def get_routing_target(number)
   # number is a string
   routing_table = [
     "001" => "data1",
     "0012 => "data2",
     "00129" => "data3" ]
   for current_length in number.length .. 1
        if routing_table[number[0,current_length]] then
            return routing_table[number[0,current_length]
        end
   end
end

The last idea does not allow to have multiple routing targets for the same number. So this needs a bit adjustment to enable that("data" may be implemented as an array of multiple routing targets).

In general I think this may be a performance critical issue, which may cause very high load - depending on the number of requests that will be processed - on your server(DB or application server) if not designed well. So maybe it is better to have a program running as daemon, keeping the complete routing table in memory and reloading from time to time instead of reading it for every request.

UPDATE-1:

I tried it and created some test data(100k Records, with index on the number) and a mysql function:

Code:
DROP FUNCTION IF EXISTS equal_chars_from_left;
DELIMITER $$
CREATE FUNCTION equal_chars_from_left(wanted TEXT, current_field TEXT)
  RETURNS INT
BEGIN
  DECLARE current_length INT;
  SET current_length = LENGTH(wanted);

  count_loop: LOOP
    IF SUBSTRING(current_field,1,current_length) = SUBSTRING(wanted,1,current_length) THEN
        LEAVE count_loop;
    END IF;
    SET current_length = current_length - 1;

    IF current_length = 0 THEN
      LEAVE count_loop;
    END IF;
    
  END LOOP;
  RETURN current_length;
END;
$$
DELIMITER ;

The performance is as bad as expected.

Code:
mysql> select destination,rate,equal_chars_from_left("1212313",destination) as equals from test order by equals desc limit 10;
+----------------+------+--------+
| destination    | rate | equals |
+----------------+------+--------+
| 1212313893101  | 0.97 |      7 |
| 12123105591    | 0.09 |      6 |
| 12123          | 0.98 |      5 |
| 12123235501917 | 0.07 |      5 |
| 121283339945   | 0.19 |      4 |
| 121243075      | 0.53 |      4 |
| 121251         | 0.69 |      4 |
| 1212626040     | 0.23 |      4 |
| 1212914454156  | 0.72 |      4 |
| 121251042914   | 0.46 |      4 |
+----------------+------+--------+
10 rows in set (6,35 sec)

UPDATE 2:

This is a bit faster:

Code:
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '1212313%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '121231%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '12123%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '1212%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '121%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '12%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '1%' 
ORDER BY equals DESC LIMIT 10;

Code:
+----------------+--------+
| destination    | equals |
+----------------+--------+
| 1212313893101  |      7 |
| 12123105591    |      6 |
| 12123          |      5 |
| 12123235501917 |      5 |
| 12120247005    |      4 |
| 121283339945   |      4 |
| 12125241424986 |      4 |
| 1212426494209  |      4 |
| 1212116831     |      4 |
| 121284268      |      4 |
+----------------+--------+
10 rows in set (2,10 sec)


Last edited by stomp; 09-10-2016 at 05:36 PM..
This User Gave Thanks to stomp For This Post:
# 3  
Old 09-10-2016
Big thankss....
# 4  
Old 09-10-2016
Hmm. UNION is unnecessarily complicated. A simple OR does it too.

Here 's some PHP-code which takes the part of the MySQL-function, and is a lot faster:

PHP Code:
$wanted=1212313;

function 
equal_chars_from_left($wanted,$current) {

        
$len_w=strlen($wanted);
        
$len_c=strlen($current);
        
$max_len=($len_c<$len_w)?$len_c:$len_w;
        for(
$charcount=$max_len;$charcount>0;$charcount--) {
                if(
substr($wanted,0,$charcount)==substr($current,0,$charcount)) {
                        return 
$charcount;
                }
        }
        return 
0;
}

function 
sort_by_equals($a$b)
{
        return (
                (
$a[1]==$b[1])  ? 
                                
: ( ( $a[1] > $b[1] ) ? -)
               );
}

/*
        SELECT destination FROM test
        WHERE (
                destination LIKE '1%'           OR
                destination LIKE '12%'          OR
                destination LIKE '121%'         OR
                destination LIKE '1212%'        OR
                destination LIKE '12123%'       OR
                destination LIKE '121231%'      OR
                destination LIKE '1212313%' 
         );

*/

function create_query ($wanted) {
        
$query "
        SELECT destination FROM test
        WHERE (
        "
;
        
$len strlen($wanted);
        for(
$i=1;$i<=$len;$i++) {
                
$or $i<$len "OR" "";
                
$query.="destination LIKE '".substr($wanted,0,$i)."%' $or\n";
        }
        
$query.=" );\n";
        return 
$query;
}


$mysqli = new mysqli("localhost""root""blub123""test");
$res    $mysqli->query(create_query($wanted));

$data = [];
while (
$row $res->fetch_assoc()) {
        
array_push($data,[ $row["destination"], equal_chars_from_left($wanted,$row["destination"])]);
}
usort($data,"sort_by_equals");

for(
$i=0;$i<10;$i++) { printf("%-15s %2d\n"$data[$i][0],$data[$i][1]); } 
Code:
1212313893101    7
12123105591      6
12123            5
12123235501917   5
121216           4
121251042914     4
12129164         4
1212914454156    4
121291002        4
121284268        4

real    0m0.269s
user    0m0.208s
sys     0m0.016s

UPDATE

PHP-Variant b) is even faster

PHP Code:
<?PHP

$wanted
=$argv[1];

function 
equal_chars_from_left($wanted,$current) {

        
$len_w=strlen($wanted);
        
$len_c=strlen($current);
        
$max_len=($len_c<$len_w)?$len_c:$len_w;
        for(
$charcount=$max_len;$charcount>0;$charcount--) {
                if(
substr($wanted,0,$charcount)==substr($current,0,$charcount)) {
                        return 
$charcount;
                }
        }
        return 
0;
}

function 
create_query ($wanted) {
        
$query "
        SELECT destination FROM test
        WHERE (
        "
;
        
$len strlen($wanted);
        for(
$i=1;$i<=$len;$i++) {
                
$or $i<$len "OR" "";
                
$query.="destination LIKE '".substr($wanted,0,$i)."%' $or\n";
        }
        
$query.=" );\n";
        return 
$query;
}

$mysqli = new mysqli("localhost""root""blub123""test");
$res    $mysqli->query(create_query($wanted));

$data = [];
for(
$i=1;$i<=20;$i++) { $data["$i"]=[]; }
while (
$row $res->fetch_assoc()) {
        
$eq equal_chars_from_left($wanted,$row["destination"]);
        if(
$eq>0) {
        
array_push($data["$eq"],$row["destination"]);
        }
}

for(
$z=0,$i=20;$i>0;$i--) {
        for(
$j=0;$j<count($data[$i]);$j++,$z++) {
                if(
$z>9) break(2);
                
printf("%-20s %2d\n",$data[$i][$j],$i);
        }
}
?>
Code:
1212313893101         7
12123105591           6
12123                 5
12123235501917        5
12120247005           4
1212116831            4
1212120056008         4
121216                4
1212232               4
121242                4

real    0m0.132s
user    0m0.120s
sys     0m0.008s


Last edited by stomp; 09-11-2016 at 12:15 PM..
# 5  
Old 09-11-2016
I tried the below precedure... but keep getting errors...


Code:
delimiter //

CREATE FUNCTION BestMatch (msisdn INT)
  RETURNS FLOAT(9,3)

BEGIN

    DECLARE var INT(6);
    DECLARE s FLOAT(9,3);

    DECLARE msisdn_substr3 VARCHAR(16);
    DECLARE msisdn_substr4 VARCHAR(16);
    DECLARE msisdn_substr5 VARCHAR(16);
    DECLARE msisdn_substr6 VARCHAR(16);
    DECLARE msisdn_substr7 VARCHAR(16);
    DECLARE msisdn_substr8 VARCHAR(16);
    DECLARE msisdn_substr9 VARCHAR(16);
    DECLARE msisdn_substr10 VARCHAR(16);
    DECLARE msisdn_substr11 VARCHAR(16);
    DECLARE msisdn_substr12 VARCHAR(16);
    DECLARE msisdn_substr13 VARCHAR(16);
    DECLARE msisdn_substr14 VARCHAR(16);

SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr8');

IF var = 1 THEN
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr8');
END IF ;

IF var > 1 THEN
    SET msisdn_substr9 = SUBSTRING ( msisdn ,1 , 9 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr9');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr9');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr8');
    end if;

    SET msisdn_substr10 = SUBSTRING ( msisdn ,1 , 10 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr10');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr10');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr9');
    end if;

    SET msisdn_substr11 = SUBSTRING ( msisdn ,1 , 11 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr11');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr11');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr10');
    end if;

    SET msisdn_substr12 = SUBSTRING ( msisdn ,1 , 12 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr12');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr12');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr11');
    end if;

    SET msisdn_substr13 = SUBSTRING ( msisdn ,1 , 13 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr13');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr13');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr12');
    end if;

    SET msisdn_substr14 = SUBSTRING ( msisdn ,1 , 14 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr14');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr14');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr13');
    end if;

END IF;

IF var = 0 THEN
    SET msisdn_substr7 = SUBSTRING ( msisdn ,1 , 7 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr7');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr10');
    end if;

    SET msisdn_substr6 = SUBSTRING ( msisdn ,1 , 6 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr6');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr6');
    end if;

    SET msisdn_substr5 = SUBSTRING ( msisdn ,1 , 5 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr5');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr5');
    end if;

    SET msisdn_substr4 = SUBSTRING ( msisdn ,1 , 4 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr4');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr4');
    end if;

    SET msisdn_substr3 = SUBSTRING ( msisdn ,1 , 3 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr3');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr3');
    end if;

END IF;

    RETURN s;

  END //

DELIMITER ;


can you help ?

one IF condition is met, I want to set the variable s and then leave procedure..
how can i exit the procedure once s varable is set ? IF condition is met
Thanks


Moderator's Comments:
Mod Comment Please use CODE tags correctly as required by forum rules!

Last edited by yahyaaa; 09-11-2016 at 07:57 AM.. Reason: corrected CODE tags.
# 6  
Old 09-11-2016
Some comments:
  • If you try to write MySQL-Functions, I suggest starting from a very minimal function and extend bit by bit and checking if the results are as expected at every step. So you can catch up for single syntax/programming errors which is easier than to be faced with all of them at once
  • The mysql function syntax seems to me a bit limited in comparison with every real programming language, so I myself would never write anything but really needed snippets(if at all) as mysql functions.
  • In did not review your code in whole. But as you saw from my small 20 Line MySQL-Function: It already used 6 Seconds to complete, saying which huge amount of computing resources it needed. E. g. it was a very inefficient way to get it done. When I look at your code, you perform a lot of queries which should be all be very fast though. But I just wanted to note that some queries with big data sets can very quickly create huge loads of computing resources needed to get it done.
  • In Backup MySQL-Functions are not saved by default with the mysqldump command. You need to keep that in mind and
    1. change your backup procedure, to have them included -or-
    2. tell your system administrator to have that done in case you are not responsible for backup -or-
    3. save your functions as textfiles outside the mysql database to be sure to have them backupped up


...and since my previous snippets where a whole lot more than you wanted, this search can be done very quick here.

UPDATE

Mmh. I detected that my solutions were not exactly what you wanted. But it should help nevertheless.

Last edited by stomp; 09-11-2016 at 12:54 PM..
This User Gave Thanks to stomp For This Post:
# 7  
Old 09-18-2016
The solution as a mysql function.

GOOD Very fast execution
BAD Debugging is far more difficult
BAD Programming Environment is inferior and limited compared to every other Programming language(including bash)
CAUTION Be sure to have your backups included user defined functions

Code:
DELIMITER $$

DROP FUNCTION IF EXISTS get_prefix;
DROP FUNCTION IF EXISTS get_rate;

/*

Usage from the os shell:

 echo "select get_prefix('$YOUR_NUMBER'),get_rate(get_prefix('$YOUR_NUMBER'));"  \
      | mysql -N -uMYSQL_USER -pMYSQL_PASSWORD DATABASE_NAME 

*/

CREATE FUNCTION get_prefix(wanted TEXT)
  RETURNS TEXT
BEGIN
  DECLARE current_length   INT;
  DECLARE prefix           TEXT;
  SET current_length       = LENGTH(wanted);

  count_loop: LOOP

    SET prefix = (
    SELECT destination
    FROM test 
    WHERE destination = SUBSTRING(wanted,1,current_length) 
    LIMIT 1);

    IF LENGTH(prefix) > 0 THEN LEAVE count_loop; END IF;
    SET current_length = current_length - 1;

    IF current_length = 0 THEN LEAVE count_loop; END IF;
  END LOOP;
  RETURN prefix;
    
END;

CREATE FUNCTION get_rate(prefix TEXT)
  RETURNS TEXT
BEGIN
  RETURN (SELECT rate FROM test WHERE destination = prefix LIMIT 1);
END;
$$
DELIMITER ;

The solution without a mysql function in bash:

GOOD
Easy to write, Easy to Debug
GOOD Full Featured Bash Programming Environent
BAD A little slower than the mysql solution(~50-100ms here)

Code:
#!/bin/bash

# Usage: thisprog.sh <YOURPHONENUMBER>

target=$1 

function mysql_exec { 
    mysql -uroot -pblub123 --skip-column-names test "${@}"; }

for((c=${#target};c>0;c--)) ; do
    prefix=$(mysql_exec -e \
        "SELECT destination FROM test WHERE destination = '${target:0:$c}' LIMIT 1")
    [ -n "$prefix" ] && break
done

rate=$(mysql_exec -e "SELECT rate FROM test WHERE destination = '$prefix' LIMIT 1")
echo "$target $prefix $rate"


Last edited by stomp; 10-03-2016 at 09:29 AM.. Reason: Added Advantages/Disadvantages to every script
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Data match 2 files based on first 2 columns matching only and join if match

Hi, i have 2 files , the data i need to match is in masterfile and i need to pull out column 3 from master if column 1 and 2 match and output entire row to new file I have tried with join and awk and i keep getting blank outputs or same file is there an easier way than what i am... (4 Replies)
Discussion started by: axis88
4 Replies

2. Shell Programming and Scripting

awk to print match or non-match and select fields/patterns for non-matches

In the awk below I am trying to output those lines that Match between file1 and file2, those Missing in file1, and those missing in file2. Using each $1,$2,$4,$5 value as a key to match on, that is if those 4 fields are found in both files the match, but if those 4 fields are not found then missing... (0 Replies)
Discussion started by: cmccabe
0 Replies

3. Shell Programming and Scripting

awk to update file based on partial match in field1 and exact match in field2

I am trying to create a cronjob that will run on startup that will look at a list.txt file to see if there is a later version of a database using database.txt as the source. The matching lines are written to output. $1 in database.txt will be in list.txt as a partial match. $2 of database.txt... (2 Replies)
Discussion started by: cmccabe
2 Replies

4. Shell Programming and Scripting

Request: How to Parse dynamic SQL query to pad extra columns to match the fixed number of columns

Hello All, I have a requirement in which i will be given a sql query as input in a file with dynamic number of columns. For example some times i will get 5 columns, some times 8 columns etc up to 20 columns. So my requirement is to generate a output query which will have 20 columns all the... (7 Replies)
Discussion started by: vikas_trl
7 Replies

5. Shell Programming and Scripting

Match pattern1 in file, match pattern2, substitute value1 in line

not getting anywhere with this an xml file contains multiple clients set up with same tags, different values. I need to parse the file for client foo, and change the value of tag "64bit" from false to true. cat clients.xml <Client type"FIX"> <ClientName>foo</ClientName>... (3 Replies)
Discussion started by: jack.bauer
3 Replies

6. Shell Programming and Scripting

Shell Script to execute Oracle query taking input from a file to form query

Hi, I need to query Oracle database for 100 users. I have these 100 users in a file. I need a shell script which would read this User file (one user at a time) & query database. For instance: USER CITY --------- ---------- A CITY_A B CITY_B C ... (2 Replies)
Discussion started by: DevendraG
2 Replies

7. Programming

sql query to match condition from other table (time sensitive)

I know little SQL and could really use a hand here. I need to get the fields last_name, first_name and email from the table users and only if the entery with the same pkey in the table addresses, in the city column is Seattle. So if the city in the addresses table is Seattle, go to the table... (2 Replies)
Discussion started by: computethis
2 Replies

8. Shell Programming and Scripting

Query Oracle tables and return values to shell script that calls the query

Hi, I have a requirement as below which needs to be done viz UNIX shell script (1) I have to connect to an Oracle database (2) Exexute "SELECT field_status from table 1" query on one of the tables. (3) Based on the result that I get from point (2), I have to update another table in the... (6 Replies)
Discussion started by: balaeswari
6 Replies

9. UNIX and Linux Applications

mysql query all entries which 'dont' match

I am trying to query a list of hosts and extract all entries which 'dont' match. SELECT LOGS.host, GOODLIST.host FROM LOGS,db.GOODLIST WHERE (LOGS.host <> GOODLIST.host)When I use this query, it is very very slow. Matching the host with the GOODLIST.host works great and fast but when I use <>... (1 Reply)
Discussion started by: hazno
1 Replies

10. Shell Programming and Scripting

add the output of a query to a variable to be used in another query

I would like to use the result of a query in another query. How do I redirect/add the output to another variable? $result = odbc_exec($connect, $query); while ($row = odbc_fetch_array($result)) { echo $row,"\n"; } odbc_close($connect); ?> This will output hostnames: host1... (0 Replies)
Discussion started by: hazno
0 Replies
Login or Register to Ask a Question