"SQL: copying data up"

Post #303028291 by durden_tyler on Thursday 3rd of January 2019 02:38:43 PM

Full Discussion: SQL: copying data up
Old 2 Weeks Ago
First of all - thank you very much for providing the CREATE TABLE and INSERT statements. Those really help!
If you have MySQL version < 8.0.2, then you could find the previous level by joining the table with itself.

Let's say for a given date in events table, I have a set of records S that includes that date and all previous dates and having in_event = 1.
Then the maximum date and its level value in the set S is the one I want.
The following query fetches that information.For each date in events table, it fetches the current or previous maximum date with in_event=1.

Code:
mysql>
mysql> select e3.e1_date as date,
    ->        e3.e2_date as prev_date,
    ->        e.level as prev_level
    ->   from events e
    ->        inner join (
    ->                      select e1.date as e1_date,
    ->                             max(e2.date) as e2_date
    ->                        from events e1
    ->                             inner join events e2 on (e2.date <= e1.date and e2.in_event=1)
    ->                       group by e1.date
    ->                   ) e3
    ->        on (e.date = e3.e2_date)
    -> ;
+------------+------------+------------+
| date       | prev_date  | prev_level |
+------------+------------+------------+
| 2018-09-04 | 2018-09-03 |       1571 |
| 2018-09-03 | 2018-09-03 |       1571 |
| 2018-08-31 | 2018-08-21 |       1559 |
| 2018-08-30 | 2018-08-21 |       1559 |
| 2018-08-29 | 2018-08-21 |       1559 |
| 2018-08-28 | 2018-08-21 |       1559 |
| 2018-08-24 | 2018-08-21 |       1559 |
| 2018-08-23 | 2018-08-21 |       1559 |
| 2018-08-22 | 2018-08-21 |       1559 |
| 2018-08-21 | 2018-08-21 |       1559 |
| 2018-08-20 | 2018-08-07 |     1736.2 |
| 2018-08-17 | 2018-08-07 |     1736.2 |
| 2018-08-16 | 2018-08-07 |     1736.2 |
| 2018-08-15 | 2018-08-07 |     1736.2 |
| 2018-08-14 | 2018-08-07 |     1736.2 |
| 2018-08-13 | 2018-08-07 |     1736.2 |
| 2018-08-10 | 2018-08-07 |     1736.2 |
| 2018-08-09 | 2018-08-07 |     1736.2 |
| 2018-08-08 | 2018-08-07 |     1736.2 |
| 2018-08-07 | 2018-08-07 |     1736.2 |
| 2018-08-06 | 2018-07-23 |     1604.6 |
| 2018-08-03 | 2018-07-23 |     1604.6 |
| 2018-08-02 | 2018-07-23 |     1604.6 |
| 2018-08-01 | 2018-07-23 |     1604.6 |
| 2018-07-31 | 2018-07-23 |     1604.6 |
| 2018-07-30 | 2018-07-23 |     1604.6 |
| 2018-07-27 | 2018-07-23 |     1604.6 |
| 2018-07-26 | 2018-07-23 |     1604.6 |
| 2018-07-25 | 2018-07-23 |     1604.6 |
| 2018-07-24 | 2018-07-23 |     1604.6 |
| 2018-07-23 | 2018-07-23 |     1604.6 |
| 2018-07-20 | 2018-07-17 |     1666.2 |
| 2018-07-19 | 2018-07-17 |     1666.2 |
| 2018-07-18 | 2018-07-17 |     1666.2 |
| 2018-07-17 | 2018-07-17 |     1666.2 |
+------------+------------+------------+
35 rows in set (0.00 sec)

mysql>
mysql>

Then all you have to do is - plug in this statement in MySQL's UPDATE..JOIN syntax to update the events table.


Code:
mysql>
mysql> update events x
    ->   join (
    ->             select e3.e1_date as date,
    ->                    e3.e2_date as prev_date,
    ->                    e.level as prev_level
    ->               from events e
    ->                    inner join (
    ->                                  select e1.date as e1_date,
    ->                                         max(e2.date) as e2_date
    ->                                    from events e1
    ->                                         inner join events e2
    ->                                         on (e2.date <= e1.date and e2.in_event = 1)
    ->                                   group by e1.date
    ->                               ) e3
    ->                    on (e.date = e3.e2_date)
    ->        ) y
    ->     on (x.date = y.date)
    ->    set x.delta = y.prev_level
    ->  where 1 = 1
    -> ;
Query OK, 35 rows affected, 25 warnings (0.03 sec)
Rows matched: 35  Changed: 35  Warnings: 25

mysql>
mysql>
mysql> select * from events;
+------------+-----------+--------+----------+---------+
| date       | sortorder | level  | in_event | delta   |
+------------+-----------+--------+----------+---------+
| 2018-09-04 |         2 | 1547.8 |        0 | 1571.00 |
| 2018-09-03 |         3 |   1571 |        1 | 1571.00 |
| 2018-08-31 |         4 | 1541.6 |        0 | 1559.00 |
| 2018-08-30 |         5 | 1575.4 |        0 | 1559.00 |
| 2018-08-29 |         6 | 1596.6 |        0 | 1559.00 |
| 2018-08-28 |         7 | 1626.4 |        0 | 1559.00 |
| 2018-08-24 |         8 |   1575 |        0 | 1559.00 |
| 2018-08-23 |         9 |   1536 |        0 | 1559.00 |
| 2018-08-22 |        10 |   1579 |        0 | 1559.00 |
| 2018-08-21 |        11 |   1559 |        1 | 1559.00 |
| 2018-08-20 |        12 |   1554 |        0 | 1736.20 |
| 2018-08-17 |        13 |   1534 |        0 | 1736.20 |
| 2018-08-16 |        14 |   1543 |        0 | 1736.20 |
| 2018-08-15 |        15 | 1542.2 |        0 | 1736.20 |
| 2018-08-14 |        16 | 1646.8 |        0 | 1736.20 |
| 2018-08-13 |        17 | 1680.4 |        0 | 1736.20 |
| 2018-08-10 |        18 | 1667.6 |        0 | 1736.20 |
| 2018-08-09 |        19 | 1711.6 |        0 | 1736.20 |
| 2018-08-08 |        20 | 1726.6 |        0 | 1736.20 |
| 2018-08-07 |        21 | 1736.2 |        1 | 1736.20 |
| 2018-08-06 |        22 |   1678 |        0 | 1604.60 |
| 2018-08-03 |        23 | 1692.8 |        0 | 1604.60 |
| 2018-08-02 |        24 | 1655.4 |        0 | 1604.60 |
| 2018-08-01 |        25 | 1717.4 |        0 | 1604.60 |
| 2018-07-31 |        26 | 1732.8 |        0 | 1604.60 |
| 2018-07-30 |        27 |   1678 |        0 | 1604.60 |
| 2018-07-27 |        28 | 1691.4 |        0 | 1604.60 |
| 2018-07-26 |        29 | 1687.4 |        0 | 1604.60 |
| 2018-07-25 |        30 |   1700 |        0 | 1604.60 |
| 2018-07-24 |        31 |   1695 |        0 | 1604.60 |
| 2018-07-23 |        32 | 1604.6 |        1 | 1604.60 |
| 2018-07-20 |        33 | 1599.8 |        0 | 1666.20 |
| 2018-07-19 |        34 |   1622 |        0 | 1666.20 |
| 2018-07-18 |        35 |   1691 |        0 | 1666.20 |
| 2018-07-17 |        36 | 1666.2 |        1 | 1666.20 |
+------------+-----------+--------+----------+---------+
35 rows in set (0.00 sec)

mysql>
mysql>

Unfortunately, to get all this information, we join and scan the events table 3 times and then one more time to update it.
That is highly inefficient. So the SQL standard came up with "analytic" or "window" functions to reduce the unnecessary table scans.
If you have MySQL version 8.0.2 or higher, then it supports window functions and you could use them to solve your problem.


I am not sure if you know about them, so I'll give a high level view to show how those could be used.
Let's say we divide the data into "batches".
For example:
1) Dates from '2018-09-03' to '2018-09-04' are in one batch.
2) Dates from '2018-08-21' to '2018-08-31' are in another batch.
3) Dates from '2018-08-07' to '2018-08-20' are in yet another batch. And so on...
Then the first date in each batch (the least date) is the one we are looking for.
We identify it and pick up its level and set that as the delta for all records in that batch.


Now to generate "batches", we only need generate a running total. The SUM() analytic function could do that for us.
(Note that this is not the SUM aggregate function that has the GROUP BY clause to go with it. But it is analogous to it.
All aggregate functions like MAX, MIN, COUNT etc. have their corresponding analytic versions.)


Code:
mysql>
mysql> -- MySQL version
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.13    |
+-----------+
1 row in set (0.00 sec)

mysql>
mysql>
mysql> select date, sortorder, level, in_event,
    ->        sum(in_event) over (order by date) as running_total
    ->   from events
    -> ;
+------------+-----------+--------+----------+---------------+
| date       | sortorder | level  | in_event | running_total |
+------------+-----------+--------+----------+---------------+
| 2018-07-17 |        36 | 1666.2 |        1 |             1 |
| 2018-07-18 |        35 |   1691 |        0 |             1 |
| 2018-07-19 |        34 |   1622 |        0 |             1 |
| 2018-07-20 |        33 | 1599.8 |        0 |             1 |
| 2018-07-23 |        32 | 1604.6 |        1 |             2 |
| 2018-07-24 |        31 |   1695 |        0 |             2 |
| 2018-07-25 |        30 |   1700 |        0 |             2 |
| 2018-07-26 |        29 | 1687.4 |        0 |             2 |
| 2018-07-27 |        28 | 1691.4 |        0 |             2 |
| 2018-07-30 |        27 |   1678 |        0 |             2 |
| 2018-07-31 |        26 | 1732.8 |        0 |             2 |
| 2018-08-01 |        25 | 1717.4 |        0 |             2 |
| 2018-08-02 |        24 | 1655.4 |        0 |             2 |
| 2018-08-03 |        23 | 1692.8 |        0 |             2 |
| 2018-08-06 |        22 |   1678 |        0 |             2 |
| 2018-08-07 |        21 | 1736.2 |        1 |             3 |
| 2018-08-08 |        20 | 1726.6 |        0 |             3 |
| 2018-08-09 |        19 | 1711.6 |        0 |             3 |
| 2018-08-10 |        18 | 1667.6 |        0 |             3 |
| 2018-08-13 |        17 | 1680.4 |        0 |             3 |
| 2018-08-14 |        16 | 1646.8 |        0 |             3 |
| 2018-08-15 |        15 | 1542.2 |        0 |             3 |
| 2018-08-16 |        14 |   1543 |        0 |             3 |
| 2018-08-17 |        13 |   1534 |        0 |             3 |
| 2018-08-20 |        12 |   1554 |        0 |             3 |
| 2018-08-21 |        11 |   1559 |        1 |             4 |
| 2018-08-22 |        10 |   1579 |        0 |             4 |
| 2018-08-23 |         9 |   1536 |        0 |             4 |
| 2018-08-24 |         8 |   1575 |        0 |             4 |
| 2018-08-28 |         7 | 1626.4 |        0 |             4 |
| 2018-08-29 |         6 | 1596.6 |        0 |             4 |
| 2018-08-30 |         5 | 1575.4 |        0 |             4 |
| 2018-08-31 |         4 | 1541.6 |        0 |             4 |
| 2018-09-03 |         3 |   1571 |        1 |             5 |
| 2018-09-04 |         2 | 1547.8 |        0 |             5 |
+------------+-----------+--------+----------+---------------+
35 rows in set (0.00 sec)

mysql>
mysql>

Once I have the batches, I find the first date in each batch using the "FIRST_VALUE()" analytic function.
The data has to be divided into batches which is specified by the "partition by" clause.


Code:
mysql>
mysql> select e.date, e.sortorder, e.level, e.in_event,
    ->        first_value(e.level) over (partition by e.batch
    ->                                   order by e.date rows unbounded preceding) as fv
    ->   from (
    ->             select date, sortorder, level, in_event,
    ->                    sum(in_event) over (order by date) as batch
    ->               from events
    ->        ) e
    -> ;
+------------+-----------+--------+----------+--------------------+
| date       | sortorder | level  | in_event | fv                 |
+------------+-----------+--------+----------+--------------------+
| 2018-07-17 |        36 | 1666.2 |        1 |  1666.199951171875 |
| 2018-07-18 |        35 |   1691 |        0 |  1666.199951171875 |
| 2018-07-19 |        34 |   1622 |        0 |  1666.199951171875 |
| 2018-07-20 |        33 | 1599.8 |        0 |  1666.199951171875 |
| 2018-07-23 |        32 | 1604.6 |        1 | 1604.5999755859375 |
| 2018-07-24 |        31 |   1695 |        0 | 1604.5999755859375 |
| 2018-07-25 |        30 |   1700 |        0 | 1604.5999755859375 |
| 2018-07-26 |        29 | 1687.4 |        0 | 1604.5999755859375 |
| 2018-07-27 |        28 | 1691.4 |        0 | 1604.5999755859375 |
| 2018-07-30 |        27 |   1678 |        0 | 1604.5999755859375 |
| 2018-07-31 |        26 | 1732.8 |        0 | 1604.5999755859375 |
| 2018-08-01 |        25 | 1717.4 |        0 | 1604.5999755859375 |
| 2018-08-02 |        24 | 1655.4 |        0 | 1604.5999755859375 |
| 2018-08-03 |        23 | 1692.8 |        0 | 1604.5999755859375 |
| 2018-08-06 |        22 |   1678 |        0 | 1604.5999755859375 |
| 2018-08-07 |        21 | 1736.2 |        1 |  1736.199951171875 |
| 2018-08-08 |        20 | 1726.6 |        0 |  1736.199951171875 |
| 2018-08-09 |        19 | 1711.6 |        0 |  1736.199951171875 |
| 2018-08-10 |        18 | 1667.6 |        0 |  1736.199951171875 |
| 2018-08-13 |        17 | 1680.4 |        0 |  1736.199951171875 |
| 2018-08-14 |        16 | 1646.8 |        0 |  1736.199951171875 |
| 2018-08-15 |        15 | 1542.2 |        0 |  1736.199951171875 |
| 2018-08-16 |        14 |   1543 |        0 |  1736.199951171875 |
| 2018-08-17 |        13 |   1534 |        0 |  1736.199951171875 |
| 2018-08-20 |        12 |   1554 |        0 |  1736.199951171875 |
| 2018-08-21 |        11 |   1559 |        1 |               1559 |
| 2018-08-22 |        10 |   1579 |        0 |               1559 |
| 2018-08-23 |         9 |   1536 |        0 |               1559 |
| 2018-08-24 |         8 |   1575 |        0 |               1559 |
| 2018-08-28 |         7 | 1626.4 |        0 |               1559 |
| 2018-08-29 |         6 | 1596.6 |        0 |               1559 |
| 2018-08-30 |         5 | 1575.4 |        0 |               1559 |
| 2018-08-31 |         4 | 1541.6 |        0 |               1559 |
| 2018-09-03 |         3 |   1571 |        1 |               1571 |
| 2018-09-04 |         2 | 1547.8 |        0 |               1571 |
+------------+-----------+--------+----------+--------------------+
35 rows in set (0.00 sec)

mysql>
mysql>

Now that I have all the necessary information, I plug in this query in the UPDATE..JOIN syntax to update the table.


Code:
mysql>
mysql> update events x
    ->   join (
    ->           select e.date,
    ->                  first_value(e.level) over (partition by e.batch
    ->                                             order by e.date rows unbounded preceding) as fv
    ->             from (
    ->                       select date, sortorder, level, in_event,
    ->                              sum(in_event) over (order by date) as batch
    ->                         from events
    ->                  ) e
    ->        ) y
    ->     on (x.date = y.date)
    ->    set x.delta = y.fv
    ->  where 1 = 1
    -> ;
Query OK, 35 rows affected, 25 warnings (0.04 sec)
Rows matched: 35  Changed: 35  Warnings: 25

mysql>
mysql>
mysql> select * from events;
+------------+-----------+--------+----------+---------+
| date       | sortorder | level  | in_event | delta   |
+------------+-----------+--------+----------+---------+
| 2018-09-04 |         2 | 1547.8 |        0 | 1571.00 |
| 2018-09-03 |         3 |   1571 |        1 | 1571.00 |
| 2018-08-31 |         4 | 1541.6 |        0 | 1559.00 |
| 2018-08-30 |         5 | 1575.4 |        0 | 1559.00 |
| 2018-08-29 |         6 | 1596.6 |        0 | 1559.00 |
| 2018-08-28 |         7 | 1626.4 |        0 | 1559.00 |
| 2018-08-24 |         8 |   1575 |        0 | 1559.00 |
| 2018-08-23 |         9 |   1536 |        0 | 1559.00 |
| 2018-08-22 |        10 |   1579 |        0 | 1559.00 |
| 2018-08-21 |        11 |   1559 |        1 | 1559.00 |
| 2018-08-20 |        12 |   1554 |        0 | 1736.20 |
| 2018-08-17 |        13 |   1534 |        0 | 1736.20 |
| 2018-08-16 |        14 |   1543 |        0 | 1736.20 |
| 2018-08-15 |        15 | 1542.2 |        0 | 1736.20 |
| 2018-08-14 |        16 | 1646.8 |        0 | 1736.20 |
| 2018-08-13 |        17 | 1680.4 |        0 | 1736.20 |
| 2018-08-10 |        18 | 1667.6 |        0 | 1736.20 |
| 2018-08-09 |        19 | 1711.6 |        0 | 1736.20 |
| 2018-08-08 |        20 | 1726.6 |        0 | 1736.20 |
| 2018-08-07 |        21 | 1736.2 |        1 | 1736.20 |
| 2018-08-06 |        22 |   1678 |        0 | 1604.60 |
| 2018-08-03 |        23 | 1692.8 |        0 | 1604.60 |
| 2018-08-02 |        24 | 1655.4 |        0 | 1604.60 |
| 2018-08-01 |        25 | 1717.4 |        0 | 1604.60 |
| 2018-07-31 |        26 | 1732.8 |        0 | 1604.60 |
| 2018-07-30 |        27 |   1678 |        0 | 1604.60 |
| 2018-07-27 |        28 | 1691.4 |        0 | 1604.60 |
| 2018-07-26 |        29 | 1687.4 |        0 | 1604.60 |
| 2018-07-25 |        30 |   1700 |        0 | 1604.60 |
| 2018-07-24 |        31 |   1695 |        0 | 1604.60 |
| 2018-07-23 |        32 | 1604.6 |        1 | 1604.60 |
| 2018-07-20 |        33 | 1599.8 |        0 | 1666.20 |
| 2018-07-19 |        34 |   1622 |        0 | 1666.20 |
| 2018-07-18 |        35 |   1691 |        0 | 1666.20 |
| 2018-07-17 |        36 | 1666.2 |        1 | 1666.20 |
+------------+-----------+--------+----------+---------+
35 rows in set (0.00 sec)

mysql>
mysql>

One last observation: in your workaround solution, if you change "order by date desc" to "order by date" or "order by date asc", then it copies the data in the desired direction.



Hope that helps!
This User Gave Thanks to durden_tyler For This Post:
Neo (1 Week Ago)