关联文章:

  • MySQL对derived table的优化处理与运用约束


一、Bug描述

搭档遇到一个有意思的句子,说一条SQL在MySQL8.0.25版别运转出的成果显着与给定的where条件不符,而在8.0.26版别上是正常的,句子上加了一个无关的用户变量后在8.0.25版别上成果才是正确的,想不通这是怎么回事,这么有意思的事情天然引起了我的兴趣,借此机会深化了解了一下MySQL关于derived table的优化。为了方便演示作用,让小伙伴们关注到现象的实质,我将句子进行了简化处理。

下面是模仿的表结构与数据。

create table t1(id int,c1 varchar(100));
insert into t1 values(1,'gg张三');
insert into t1 values(2,'bb李四');
insert into t1 values(3,'cc王五');
insert into t1 values(4,'dd刘麻子');
insert into t1 values(1,'gg张三');
insert into t1 values(2,'bb李四');

SQL句子:

SELECT temp.type
  FROM (SELECT  SUBSTRING(t.type, 3) type
          FROM (SELECT distinct t1.c1 type
                  FROM test.t1
                 ORDER BY t1.c1) t) temp
 WHERE temp.type='张三'
 ORDER BY temp.type DESC;

在MySQL8.0.25版别的运转成果如下:

mysql> SELECT temp.type
    ->   FROM (SELECT  SUBSTRING(t.type, 3) type
    ->           FROM (SELECT distinct t1.c1 type
    ->                   FROM test.t1
    ->                  ORDER BY t1.c1) t) temp
    ->  WHERE temp.type='张三'
    ->  ORDER BY temp.type DESC;
+--------+
| type   |
+--------+
| 李四   |
+--------+
1 rows in set (0.01 sec)

在MySQL8.0.26版别的运转成果如下:

mysql> SELECT temp.type
    ->   FROM (SELECT  SUBSTRING(t.type, 3) type
    ->           FROM (SELECT distinct t1.c1 type
    ->                   FROM test.t1
    ->                  ORDER BY t1.c1) t) temp
    ->  WHERE temp.type='张三'
    ->  ORDER BY temp.type DESC;
+--------+
| type   |
+--------+
| 张三   |
+--------+
1 rows in set (0.00 sec)

很显着,这个句子在8.0.25版别运转出的成果与咱们给定where条件不符,咱们要查询关于“张三”的记录,成果返回的成果是”李四“的,很显着的一个bug,可是到8.0.26版别这个问题得到了修正。

怀着对各版别对此句子执行情况的好奇,我先是往前追溯,查看了8.0.24,8.0.23,8.0.22,8.0.21,5.7.39版别上做了测验,发现在8.0.24,8.0.23,8.0.22版别成果与8.0.25相同,都是错误成果,而在8.0.21版别上运转成果是正确的,5.7版别上成果也是正确的的。往后追溯,8.0.26,8.0.32版别也都是正确的,因而判别此问题在8.0.22~8.0.25版别上存在此问题。

这个句子最大的特点便是运用了派生表(derived table),MySQL在8.0.22版别上引进了一个关于派生表的优化器开关derived_condition_pushdown, 默认设置为on。咱们看一下关于这个特性在官方文档中的描述:

MySQL 8.0.22 and later supports derived condition pushdown for eligible subqueries. For a query such as SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant, it is possible in many cases to push the outer WHERE condition down to the derived table, in this case resulting in SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant) AS dt.

在8.0.26版别中修正的bug中发现一个与此问题相关的bug。描述如下:

When a condition is pushed down to a materialized derived table, a clone of the derived table expression replaces the column (from the outer query block) in the condition. When the cloned item included a FULLTEXT function, it was added to the outer query block instead of the derived table query block, which led to problems. To fix this, we now use the derived query block to clone such items. (Bug #32820437)

看到这里咱们能够确定,便是8.0.22版别时这个新特性的引进,导致了此问题的发生,庆幸的是这个问题在8.0.26版别中已得到解决

文章开头说的问题句子跟这个bug的描述是符合的,派生表temp外部的过滤条件 temp.type=’张三' 其实是substring(t.type,3)='张三',应该便是对应bug描述中的”the cloned item included a FULLTEXT function“不管substring函数是不是fulltext函数,总归这个问题伴随着这个bug的修正也修正了。这种句子结构下,许多函数都有这个问题,比如trim,replace等。MySQL内部怎么处理得到的错误成果咱们就不去深究了,可是怎么躲避这个bug咱们需要了解一下。

二、bug躲避

升级到8.0.26及以上的版别问题天然就解决了,假如不想升级也是有许多方法来躲避此问题的。这个bug的发生主要是因为新特性derived_condition_pushdown的引进,封闭此特性,在这几个版别中就不会呈现这个问题。

mysql> set optimizer_switch="derived_condition_pushdown=off";
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT temp.type
    ->   FROM (SELECT  SUBSTRING(t.type, 3) type
    ->           FROM (SELECT distinct t1.c1 type
    ->                   FROM test.t1
    ->                  ORDER BY t1.c1) t) temp
    ->  WHERE temp.type='张三'
    ->  ORDER BY temp.type DESC;
+--------+
| type   |
+--------+
| 张三   |
+--------+
1 rows in set (0.00 sec)

优化器开关里还有一个派生表相关的开关,便是derived_merge,是否进行派生表兼并。封闭这个derived_merge,成果也是正确的。

mysql> set optimizer_switch="derived_merge=off,derived_condition_pushdown=on";
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT temp.type
    ->   FROM (SELECT  SUBSTRING(t.type, 3) type
    ->           FROM (SELECT distinct t1.c1 type
    ->                   FROM test.t1
    ->                  ORDER BY t1.c1) t) temp
    ->  WHERE temp.type='张三'
    ->  ORDER BY temp.type DESC;
+--------+
| type   |
+--------+
| 张三   |
+--------+
1 rows in set (0.00 sec)

也便是说当派生表条件下推撞上派生表兼并时,数据库做的处理不对,导致了问题的发生。

因而只需操控不发生兼并,或许不发生条件下推,就能躲避此bug。除了封闭优化器开关,在句子级别咱们还有许多方法来躲避,下面列举几个。

1.运用NO_MERGE的hint来阻挠derived table兼并。

SELECT  /*+ NO_MERGE(temp) */ temp.type
  FROM (SELECT substring(t.type,3) type
          FROM (SELECT distinct t1.c1 type
                  FROM test.t1
                 ORDER BY t1.c1) t
) temp
 WHERE temp.type='张三'
 ORDER BY temp.type DESC;

2.运用NO_DERIVED_CONDITION_PUSHDOWN的hint阻挠条件下推。

SELECT  temp.type
  FROM (SELECT /*+ NO_DERIVED_CONDITION_PUSHDOWN(t) */ substring(t.type,3) type
          FROM (SELECT  distinct t1.c1 type
                  FROM test.t1
                 ORDER BY t1.c1) t
) temp
 WHERE temp.type='张三'
 ORDER BY temp.type DESC;

3.运用limit子句,能一起阻挠兼并与条件下推。

例如:

SELECT temp.type
  FROM (SELECT substring(t.type,3) type
          FROM (SELECT distinct t1.c1 type
                  FROM test.t1
                 ORDER BY t1.c1) t limit 100000000000) temp
 WHERE temp.type='张三'
 ORDER BY temp.type DESC;

4.分配用户变量,阻挠derived table 兼并。

例如:

SELECT temp.type
  FROM (SELECT (@i:=0) as num, substring(t.type,3) type
          FROM (SELECT distinct t1.c1 type
                  FROM test.t1
                 ORDER BY t1.c1) t) temp
 WHERE temp.type='张三'
 ORDER BY temp.type DESC;

这种方法便是前文说到的,为什么加了一个与业务逻辑无关的用户变量,成果就正确的原因。

5.运用union all来阻挠derived table 兼并

SELECT temp.type
  FROM (SELECT substring(t.type,3) type
          FROM (SELECT distinct t1.c1 type
                  FROM test.t1
                 ORDER BY t1.c1) t
                 union all 
                 select '1') temp
 WHERE temp.type='张三'
 ORDER BY temp.type DESC;

这些方法主要是依据优化器运用hint灵活操控优化器的开关,以及derive_merge与derived_condition_pushdown的运用约束。

三、总结

  1. MySQL8.0.22~MySQL8.0.25 因为优化器新特性derived_condition_pushdown带来的bug,能够经过derived merge与 derived_condition_pushdown 的运用约束以及优化器开关hint来有用躲避bug,当然升级到高版别更好。
  2. 假如想让新特性derived_condition_pushdown发挥作用,就要避开它的运用约束。
  3. 一个新特性的呈现,不可避免会伴随着一些bug的发生,不要对此心存惊骇,只需深化了解它,就能取其长,避其短。

Enjoy GreatSQL :)

关于 GreatSQL

GreatSQL是由万里数据库保护的MySQL分支,专心于提升MGR可靠性及功能,支持InnoDB并行查询特性,是适用于金融级应用的MySQL分支版别。

相关链接: GreatSQL社区 Gitee GitHub Bilibili

GreatSQL社区:

MySQL一个关于derived table的bug描述与规避

社区有奖建议反馈: greatsql.cn/thread-54-1…

社区博客有奖征稿概况: greatsql.cn/thread-100-…

社区2022年度勋章获奖名单: greatsql.cn/thread-184-…

(对文章有疑问或许有独到见解都能够去社区官网提出或共享哦~)

技术交流群:

微信&QQ群:

QQ群:533341697

微信群:添加GreatSQL社区助手(微信号:wanlidbc )好友,待社区助手拉您进群。