Unix/Linux Go Back    


Programming Post questions about C, C++, Java, SQL, and other programming languages here.

Best match query

Programming


Reply    
 
Thread Tools Search this Thread Display Modes
    #1  
Old Unix and Linux 09-10-2016
yahyaaa's Unix or Linux Image
yahyaaa yahyaaa is offline
Registered User
 
Join Date: May 2008
Last Activity: 2 October 2016, 7:44 AM EDT
Location: Jordan
Posts: 112
Thanks: 6
Thanked 0 Times in 0 Posts
Best match query

Hi Forum,

I have the following table, around 60k records .. table name is "destinations"


Code:
BG,United Kingdom Mobile,004474572,0.20441,O 
BG,United Kingdom Mobile,004479,0.20441,O 
BG,United Kingdom Mobile,004479932,0.2222,O 
BG,United Kingdom Mobile,00447993,0.20,O 
BG,United Kingdom Mobile,004477,0.20441,O 
BG,United Kingdom Mobile,00447620,0.20441,O 
BG,United Kingdom Mobile,0044758,0.20441,O 
BG,United Kingdom Mobile,004475897,0.20441,O 
. 
. 
.


select * from destinations where prefix like '0044799326542%'
will not get any results since 0044799326542 does not exist,

what Im trying to achieve is,

if 0044799326542 is not found, try 004479932654,
select * from destinations where prefix like '004479932654%' -- no match

if no match, try 00447993265
select * from destinations where prefix like '00447993265%' -- no match


if no match, try 0044799326
select * from destinations where prefix like '0044799326%' -- no match


if no match, try 004479932
select * from destinations where prefix like '004479932%' -- match
BG,United Kingdom Mobile,004479932,0.2222,O

how to achieve this without having to go through all iterations ?

Thanks

Last edited by Scrutinizer; 09-10-2016 at 04:17 PM.. Reason: quote tags -> code tags
Sponsored Links
    #2  
Old Unix and Linux 09-10-2016
stomp stomp is offline
Registered User
 
Join Date: May 2016
Last Activity: 24 February 2017, 10:44 AM EST
Posts: 178
Thanks: 9
Thanked 63 Times in 58 Posts
Hi,

maybe you create a function which figures out the count of equal numbers from the left. Let it for example be called: "get_equal_numbers(col_number,my_number)" See here fore documentation: CREATE FUNCTION examples

Then you may create a query like this:


Code:
SELECT col_number,rate,get_equal_numbers(col_number,my_number) AS equals 
FROM destinations
WHERE equals >= 1 
ORDER BY equals DESC

I'm not sure if this performes well.

The choice I normally would take is too read the whole table into an array and use a custom search function, which does the same as above.

Both solutions are not that efficient - especially when you have a lot of lookups to accomplish.

For speed and improving search effiency, I probably would build a hash like this(example in ruby-code):


Code:
#!/usr/bin/ruby

def get_routing_target(number)
   # number is a string
   routing_table = [
     "001" => "data1",
     "0012 => "data2",
     "00129" => "data3" ]
   for current_length in number.length .. 1
        if routing_table[number[0,current_length]] then
            return routing_table[number[0,current_length]
        end
   end
end

The last idea does not allow to have multiple routing targets for the same number. So this needs a bit adjustment to enable that("data" may be implemented as an array of multiple routing targets).

In general I think this may be a performance critical issue, which may cause very high load - depending on the number of requests that will be processed - on your server(DB or application server) if not designed well. So maybe it is better to have a program running as daemon, keeping the complete routing table in memory and reloading from time to time instead of reading it for every request.

UPDATE-1:

I tried it and created some test data(100k Records, with index on the number) and a mysql function:


Code:
DROP FUNCTION IF EXISTS equal_chars_from_left;
DELIMITER $$
CREATE FUNCTION equal_chars_from_left(wanted TEXT, current_field TEXT)
  RETURNS INT
BEGIN
  DECLARE current_length INT;
  SET current_length = LENGTH(wanted);

  count_loop: LOOP
    IF SUBSTRING(current_field,1,current_length) = SUBSTRING(wanted,1,current_length) THEN
        LEAVE count_loop;
    END IF;
    SET current_length = current_length - 1;

    IF current_length = 0 THEN
      LEAVE count_loop;
    END IF;
    
  END LOOP;
  RETURN current_length;
END;
$$
DELIMITER ;

The performance is as bad as expected.


Code:
mysql> select destination,rate,equal_chars_from_left("1212313",destination) as equals from test order by equals desc limit 10;
+----------------+------+--------+
| destination    | rate | equals |
+----------------+------+--------+
| 1212313893101  | 0.97 |      7 |
| 12123105591    | 0.09 |      6 |
| 12123          | 0.98 |      5 |
| 12123235501917 | 0.07 |      5 |
| 121283339945   | 0.19 |      4 |
| 121243075      | 0.53 |      4 |
| 121251         | 0.69 |      4 |
| 1212626040     | 0.23 |      4 |
| 1212914454156  | 0.72 |      4 |
| 121251042914   | 0.46 |      4 |
+----------------+------+--------+
10 rows in set (6,35 sec)

UPDATE 2:

This is a bit faster:


Code:
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '1212313%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '121231%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '12123%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '1212%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '121%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '12%'  UNION
SELECT destination,equal_chars_from_left('1212313',destination) as equals from test where destination like '1%' 
ORDER BY equals DESC LIMIT 10;


Code:
+----------------+--------+
| destination    | equals |
+----------------+--------+
| 1212313893101  |      7 |
| 12123105591    |      6 |
| 12123          |      5 |
| 12123235501917 |      5 |
| 12120247005    |      4 |
| 121283339945   |      4 |
| 12125241424986 |      4 |
| 1212426494209  |      4 |
| 1212116831     |      4 |
| 121284268      |      4 |
+----------------+--------+
10 rows in set (2,10 sec)


Last edited by stomp; 09-10-2016 at 05:36 PM..
The Following User Says Thank You to stomp For This Useful Post:
yahyaaa (09-10-2016)
Sponsored Links
    #3  
Old Unix and Linux 09-10-2016
yahyaaa's Unix or Linux Image
yahyaaa yahyaaa is offline
Registered User
 
Join Date: May 2008
Last Activity: 2 October 2016, 7:44 AM EDT
Location: Jordan
Posts: 112
Thanks: 6
Thanked 0 Times in 0 Posts
Big thankss....
    #4  
Old Unix and Linux 09-10-2016
stomp stomp is offline
Registered User
 
Join Date: May 2016
Last Activity: 24 February 2017, 10:44 AM EST
Posts: 178
Thanks: 9
Thanked 63 Times in 58 Posts
Hmm. UNION is unnecessarily complicated. A simple OR does it too.

Here 's some PHP-code which takes the part of the MySQL-function, and is a lot faster:

PHP Code:
$wanted=1212313;

function 
equal_chars_from_left($wanted,$current) {

        
$len_w=strlen($wanted);
        
$len_c=strlen($current);
        
$max_len=($len_c<$len_w)?$len_c:$len_w;
        for(
$charcount=$max_len;$charcount>0;$charcount--) {
                if(
substr($wanted,0,$charcount)==substr($current,0,$charcount)) {
                        return 
$charcount;
                }
        }
        return 
0;
}

function 
sort_by_equals($a$b)
{
        return (
                (
$a[1]==$b[1])  ? 
                                
: ( ( $a[1] > $b[1] ) ? -)
               );
}

/*
        SELECT destination FROM test
        WHERE (
                destination LIKE '1%'           OR
                destination LIKE '12%'          OR
                destination LIKE '121%'         OR
                destination LIKE '1212%'        OR
                destination LIKE '12123%'       OR
                destination LIKE '121231%'      OR
                destination LIKE '1212313%' 
         );

*/

function create_query ($wanted) {
        
$query "
        SELECT destination FROM test
        WHERE (
        "
;
        
$len strlen($wanted);
        for(
$i=1;$i<=$len;$i++) {
                
$or $i<$len "OR" "";
                
$query.="destination LIKE '".substr($wanted,0,$i)."%' $or\n";
        }
        
$query.=" );\n";
        return 
$query;
}


$mysqli = new mysqli("localhost""root""blub123""test");
$res    $mysqli->query(create_query($wanted));

$data = [];
while (
$row $res->fetch_assoc()) {
        
array_push($data,[ $row["destination"], equal_chars_from_left($wanted,$row["destination"])]);
}
usort($data,"sort_by_equals");

for(
$i=0;$i<10;$i++) { printf("%-15s %2d\n"$data[$i][0],$data[$i][1]); } 

Code:
1212313893101    7
12123105591      6
12123            5
12123235501917   5
121216           4
121251042914     4
12129164         4
1212914454156    4
121291002        4
121284268        4

real    0m0.269s
user    0m0.208s
sys     0m0.016s

UPDATE

PHP-Variant b) is even faster

PHP Code:
<?PHP

$wanted
=$argv[1];

function 
equal_chars_from_left($wanted,$current) {

        
$len_w=strlen($wanted);
        
$len_c=strlen($current);
        
$max_len=($len_c<$len_w)?$len_c:$len_w;
        for(
$charcount=$max_len;$charcount>0;$charcount--) {
                if(
substr($wanted,0,$charcount)==substr($current,0,$charcount)) {
                        return 
$charcount;
                }
        }
        return 
0;
}

function 
create_query ($wanted) {
        
$query "
        SELECT destination FROM test
        WHERE (
        "
;
        
$len strlen($wanted);
        for(
$i=1;$i<=$len;$i++) {
                
$or $i<$len "OR" "";
                
$query.="destination LIKE '".substr($wanted,0,$i)."%' $or\n";
        }
        
$query.=" );\n";
        return 
$query;
}

$mysqli = new mysqli("localhost""root""blub123""test");
$res    $mysqli->query(create_query($wanted));

$data = [];
for(
$i=1;$i<=20;$i++) { $data["$i"]=[]; }
while (
$row $res->fetch_assoc()) {
        
$eq equal_chars_from_left($wanted,$row["destination"]);
        if(
$eq>0) {
        
array_push($data["$eq"],$row["destination"]);
        }
}

for(
$z=0,$i=20;$i>0;$i--) {
        for(
$j=0;$j<count($data[$i]);$j++,$z++) {
                if(
$z>9) break(2);
                
printf("%-20s %2d\n",$data[$i][$j],$i);
        }
}
?>

Code:
1212313893101         7
12123105591           6
12123                 5
12123235501917        5
12120247005           4
1212116831            4
1212120056008         4
121216                4
1212232               4
121242                4

real    0m0.132s
user    0m0.120s
sys     0m0.008s


Last edited by stomp; 09-11-2016 at 12:15 PM..
Sponsored Links
    #5  
Old Unix and Linux 09-11-2016
yahyaaa's Unix or Linux Image
yahyaaa yahyaaa is offline
Registered User
 
Join Date: May 2008
Last Activity: 2 October 2016, 7:44 AM EDT
Location: Jordan
Posts: 112
Thanks: 6
Thanked 0 Times in 0 Posts
I tried the below precedure... but keep getting errors...



Code:
delimiter //

CREATE FUNCTION BestMatch (msisdn INT)
  RETURNS FLOAT(9,3)

BEGIN

    DECLARE var INT(6);
    DECLARE s FLOAT(9,3);

    DECLARE msisdn_substr3 VARCHAR(16);
    DECLARE msisdn_substr4 VARCHAR(16);
    DECLARE msisdn_substr5 VARCHAR(16);
    DECLARE msisdn_substr6 VARCHAR(16);
    DECLARE msisdn_substr7 VARCHAR(16);
    DECLARE msisdn_substr8 VARCHAR(16);
    DECLARE msisdn_substr9 VARCHAR(16);
    DECLARE msisdn_substr10 VARCHAR(16);
    DECLARE msisdn_substr11 VARCHAR(16);
    DECLARE msisdn_substr12 VARCHAR(16);
    DECLARE msisdn_substr13 VARCHAR(16);
    DECLARE msisdn_substr14 VARCHAR(16);

SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr8');

IF var = 1 THEN
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr8');
END IF ;

IF var > 1 THEN
    SET msisdn_substr9 = SUBSTRING ( msisdn ,1 , 9 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr9');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr9');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr8');
    end if;

    SET msisdn_substr10 = SUBSTRING ( msisdn ,1 , 10 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr10');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr10');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr9');
    end if;

    SET msisdn_substr11 = SUBSTRING ( msisdn ,1 , 11 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr11');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr11');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr10');
    end if;

    SET msisdn_substr12 = SUBSTRING ( msisdn ,1 , 12 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr12');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr12');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr11');
    end if;

    SET msisdn_substr13 = SUBSTRING ( msisdn ,1 , 13 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr13');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr13');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr12');
    end if;

    SET msisdn_substr14 = SUBSTRING ( msisdn ,1 , 14 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr14');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr14');
    end if;

    if var = 0 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr13');
    end if;

END IF;

IF var = 0 THEN
    SET msisdn_substr7 = SUBSTRING ( msisdn ,1 , 7 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr7');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr10');
    end if;

    SET msisdn_substr6 = SUBSTRING ( msisdn ,1 , 6 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr6');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr6');
    end if;

    SET msisdn_substr5 = SUBSTRING ( msisdn ,1 , 5 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr5');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr5');
    end if;

    SET msisdn_substr4 = SUBSTRING ( msisdn ,1 , 4 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr4');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr4');
    end if;

    SET msisdn_substr3 = SUBSTRING ( msisdn ,1 , 3 )  ;
    SET var = (SELECT count(*) FROM rates WHERE prefix = 'msisdn_substr3');

    if var = 1 then
    SET s = (SELECT price FROM rates WHERE prefix = 'msisdn_substr3');
    end if;

END IF;

    RETURN s;

  END //

DELIMITER ;


can you help ?

one IF condition is met, I want to set the variable s and then leave procedure..
how can i exit the procedure once s varable is set ? IF condition is met
Thanks


Moderator's Comments:
Best match query Please use CODE tags correctly as required by forum rules!

Last edited by yahyaaa; 09-11-2016 at 07:57 AM.. Reason: corrected CODE tags.
Sponsored Links
    #6  
Old Unix and Linux 09-11-2016
stomp stomp is offline
Registered User
 
Join Date: May 2016
Last Activity: 24 February 2017, 10:44 AM EST
Posts: 178
Thanks: 9
Thanked 63 Times in 58 Posts
Some comments:
  • If you try to write MySQL-Functions, I suggest starting from a very minimal function and extend bit by bit and checking if the results are as expected at every step. So you can catch up for single syntax/programming errors which is easier than to be faced with all of them at once
  • The mysql function syntax seems to me a bit limited in comparison with every real programming language, so I myself would never write anything but really needed snippets(if at all) as mysql functions.
  • In did not review your code in whole. But as you saw from my small 20 Line MySQL-Function: It already used 6 Seconds to complete, saying which huge amount of computing resources it needed. E. g. it was a very inefficient way to get it done. When I look at your code, you perform a lot of queries which should be all be very fast though. But I just wanted to note that some queries with big data sets can very quickly create huge loads of computing resources needed to get it done.
  • In Backup MySQL-Functions are not saved by default with the mysqldump command. You need to keep that in mind and
    1. change your backup procedure, to have them included -or-
    2. tell your system administrator to have that done in case you are not responsible for backup -or-
    3. save your functions as textfiles outside the mysql database to be sure to have them backupped up


...and since my previous snippets where a whole lot more than you wanted, this search can be done very quick here.

UPDATE

Mmh. I detected that my solutions were not exactly what you wanted. But it should help nevertheless.

Last edited by stomp; 09-11-2016 at 12:54 PM..
The Following User Says Thank You to stomp For This Useful Post:
yahyaaa (09-12-2016)
Sponsored Links
    #7  
Old Unix and Linux 09-18-2016
stomp stomp is offline
Registered User
 
Join Date: May 2016
Last Activity: 24 February 2017, 10:44 AM EST
Posts: 178
Thanks: 9
Thanked 63 Times in 58 Posts
The solution as a mysql function.

GOOD Very fast execution
BAD Debugging is far more difficult
BAD Programming Environment is inferior and limited compared to every other Programming language(including bash)
CAUTION Be sure to have your backups included user defined functions


Code:
DELIMITER $$

DROP FUNCTION IF EXISTS get_prefix;
DROP FUNCTION IF EXISTS get_rate;

/*

Usage from the os shell:

 echo "select get_prefix('$YOUR_NUMBER'),get_rate(get_prefix('$YOUR_NUMBER'));"  \
      | mysql -N -uMYSQL_USER -pMYSQL_PASSWORD DATABASE_NAME 

*/

CREATE FUNCTION get_prefix(wanted TEXT)
  RETURNS TEXT
BEGIN
  DECLARE current_length   INT;
  DECLARE prefix           TEXT;
  SET current_length       = LENGTH(wanted);

  count_loop: LOOP

    SET prefix = (
    SELECT destination
    FROM test 
    WHERE destination = SUBSTRING(wanted,1,current_length) 
    LIMIT 1);

    IF LENGTH(prefix) > 0 THEN LEAVE count_loop; END IF;
    SET current_length = current_length - 1;

    IF current_length = 0 THEN LEAVE count_loop; END IF;
  END LOOP;
  RETURN prefix;
    
END;

CREATE FUNCTION get_rate(prefix TEXT)
  RETURNS TEXT
BEGIN
  RETURN (SELECT rate FROM test WHERE destination = prefix LIMIT 1);
END;
$$
DELIMITER ;

The solution without a mysql function in bash:

GOOD
Easy to write, Easy to Debug
GOOD Full Featured Bash Programming Environent
BAD A little slower than the mysql solution(~50-100ms here)


Code:
#!/bin/bash

# Usage: thisprog.sh <YOURPHONENUMBER>

target=$1 

function mysql_exec { 
    mysql -uroot -pblub123 --skip-column-names test "${@}"; }

for((c=${#target};c>0;c--)) ; do
    prefix=$(mysql_exec -e \
        "SELECT destination FROM test WHERE destination = '${target:0:$c}' LIMIT 1")
    [ -n "$prefix" ] && break
done

rate=$(mysql_exec -e "SELECT rate FROM test WHERE destination = '$prefix' LIMIT 1")
echo "$target $prefix $rate"


Last edited by stomp; 10-03-2016 at 09:29 AM.. Reason: Added Advantages/Disadvantages to every script
Sponsored Links
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Linux More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
Request: How to Parse dynamic SQL query to pad extra columns to match the fixed number of columns vikas_trl Shell Programming and Scripting 7 01-22-2015 04:18 PM
Shell Script to execute Oracle query taking input from a file to form query DevendraG Shell Programming and Scripting 2 11-10-2011 06:56 AM
sql query to match condition from other table (time sensitive) computethis Programming 2 08-10-2011 02:35 AM
Query Oracle tables and return values to shell script that calls the query balaeswari Shell Programming and Scripting 6 02-04-2010 09:35 AM
mysql query all entries which 'dont' match hazno UNIX and Linux Applications 1 04-08-2009 12:47 AM



All times are GMT -4. The time now is 12:31 AM.