运用普通SQL或许更难解决的问题是这样的问题:

哪些电影与给定的电影X有相同的艺人?

像平常相同,咱们在这个比如中运用sakila数据库。用SQL(例如PostgreSQL,具体来说)解决这个问题的或许办法是什么?下面的查询给出了每部电影的艺人概略:

SELECT
  film_id,
  array_agg(actor_id ORDER BY actor_id) actors
FROM film_actor
GROUP BY film_id

它产生的成果是这样的:

|film_id|actors                            |
|-------|----------------------------------|
|1      |{1,10,20,30,40,53,108,162,188,198}|
|2      |{19,85,90,160}                    |
|3      |{2,19,24,64,123}                  |
|4      |{41,81,88,147,162}                |
|5      |{51,59,103,181,200}               |
|6      |{21,23,62,108,137,169,197}        |
|...    |...                               |

请注意,在SQL中,数组的行为就像列表相同,也便是说,它们保持着自己的排序,所以清晰地对数组进行排序对于能够彼此比较艺人是很重要的。现在,咱们想从上面找到所有同享相同艺人集的电影:

WITH t AS (
  -- Previous query
  SELECT
    film_id,
    array_agg(actor_id ORDER BY actor_id) actors
  FROM film_actor
  GROUP BY film_id
)
SELECT 
  array_agg(film_id ORDER BY film_id) AS films,
  actors
FROM t
GROUP BY actors
ORDER BY count(*) DESC, films

成果便是现在:

|films   |actors                            |
|--------|----------------------------------|
|{97,556}|{65}                              |
|{1}     |{1,10,20,30,40,53,108,162,188,198}|
|{2}     |{19,85,90,160}                    |
|{3}     |{2,19,24,64,123}                  |
|{4}     |{41,81,88,147,162}                |
|{5}     |{51,59,103,181,200}               |
|{6}     |{21,23,62,108,137,169,197}        |
|...     |...                               |

所以,咱们可以看到,只要2部电影同享相同的艺人集,这些电影是FILM_ID IN (97, 556) 。(Sakila数据库有点无聊,由于数据集是生成的)。

运用MULTISET比较

虽然上面的内容现已相当酷了,但在这篇文章中,我想展现一下jOOQ 3.15MULTISET 支持的一个鲜为人知的功用,即它们可以彼此比较的现实。

并且正如SQL标准MULTISET 的性质相同,排序是不相关的,所以咱们不需要为这种比较添加任何清晰的ORDER BY 子句。现实上,这并不是100%的不相关。你可以 为投影意图对MULTISET ,所以排序将由jOOQ维护。但是当你在谓词中运用它们时,jOOQ将覆盖你的ORDER BY 子句。

运用jOOQ,咱们可以写:

ctx.select(FILM.FILM_ID, FILM.TITLE)
   .from(FILM)
   .where(
       multiset(
           select(FILM_ACTOR.ACTOR_ID)
           .from(FILM_ACTOR)
           .where(FILM_ACTOR.FILM_ID.eq(FILM.FILM_ID))
       ).eq(multiset(
           select(FILM_ACTOR.ACTOR_ID)
           .from(FILM_ACTOR)
           .where(FILM_ACTOR.FILM_ID.eq(97L))
       ))
   )
   .orderBy(FILM_ID)
   .fetch();

这比前一种形式的查询效率低一些,由于它从两个子查询中访问FILM_ACTOR 表,虽然其中只要一个是相关的。运用默许的JSONB 仿真,可以生成以下查询:

SELECT film.film_id, film.title
FROM film
WHERE (
  SELECT coalesce(
    jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
    jsonb_build_array()
  )
  FROM (
    SELECT film_actor.actor_id AS v0
    FROM film_actor
    WHERE film_actor.film_id = film.film_id
  ) AS t
) = (
  SELECT coalesce(
    jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
    jsonb_build_array()
  )
  FROM (
    SELECT film_actor.actor_id AS v0
    FROM film_actor
    WHERE film_actor.film_id = 97
  ) AS t
)
ORDER BY film.film_id

许诺,对于MULTISET ,不需要ORDER BY 子句,对于jOOQ代码来说,这仍然是实在的。然而,在幕后,jOOQ必须按JSON数组的内容排序,以确保两个MULTISET 的值是相同的,无论其次序如何。

其成果是与前面的成果显示的两个ID相同:

+-------+--------------+
|film_id|title         |
+-------+--------------+
|     97|BRIDE INTRIGUE|
|    556|MALTESE HOPE  |
+-------+--------------+

比较MULTISET_AGG,而不是

如果你喜欢运用连接和GROUP BY 来生成电影的艺人MULTISET ,你也可以用jOOQ来做。这一次,咱们运用的是

  • 隐式连接来简化对FILM.TITLE 的访问,从FILM_ACTOR
  • HAVING 子句中的一个MULTISET 谓词,运用MULTISET_AGG

下面是jOOQ的版别:

ctx.select(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
   .from(FILM_ACTOR)
   .groupBy(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
   .having(multisetAgg(FILM_ACTOR.ACTOR_ID).eq(multiset(
        select(FILM_ACTOR.ACTOR_ID)
        .from(FILM_ACTOR)
        .where(FILM_ACTOR.FILM_ID.eq(97L))
    )))
   .orderBy(FILM_ACTOR.FILM_ID)
   .fetch();

后退,生成的SQL看起来像这样:

SELECT film_actor.film_id, alias_75379701.title
FROM film_actor
  JOIN film AS alias_75379701
    ON film_actor.film_id = alias_75379701.film_id
GROUP BY film_actor.film_id, alias_75379701.title
HAVING jsonb_agg(
  jsonb_build_array(film_actor.actor_id) ORDER BY film_actor.actor_id
) = (
  SELECT coalesce(
    jsonb_agg(jsonb_build_array(v0) ORDER BY t.v0),
    jsonb_build_array()
  )
  FROM (
    SELECT film_actor.actor_id AS v0
    FROM film_actor
    WHERE film_actor.film_id = 97
  ) AS t
)
ORDER BY film_actor.film_id

请注意隐式连接是如何自动打开的,而HAVING 谓词再次运用了通常的JSONB 仿真,用于MULTISETMULTISET_AGG

代替办法

在上面的比如中,咱们比较了投射单列的MULTISET 表达式,换句话说,Result<Record1<Long>> 嵌套调集类型。jOOQ将始终确保你的查询类型查看和生成的SQL是正确的。

运用MULTISET 的一个代替办法是运用ARRAY_AGGARRAY (现在你必须再次清晰ORDER BY )。用jOOQ:

ctx.select(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
   .from(FILM_ACTOR)
   .groupBy(FILM_ACTOR.FILM_ID, FILM_ACTOR.film().TITLE)
   .having(arrayAgg(FILM_ACTOR.ACTOR_ID)
        .orderBy(FILM_ACTOR.ACTOR_ID).eq(array(
            select(FILM_ACTOR.ACTOR_ID)
            .from(FILM_ACTOR)
            .where(FILM_ACTOR.FILM_ID.eq(97L))
            .orderBy(FILM_ACTOR.ACTOR_ID)
    )))
    .orderBy(FILM_ACTOR.FILM_ID)
    .fetch();

运用SQL:

SELECT film_actor.film_id, film.title
FROM film_actor
  JOIN film
    ON film_actor.film_id = film.film_id
GROUP BY film_actor.film_id, film.title
HAVING array_agg(film_actor.actor_id ORDER BY film_actor.actor_id) = 
  ARRAY (
    SELECT film_actor.actor_id
    FROM film_actor
    WHERE film_actor.film_id = 97
    ORDER BY film_actor.actor_id
  )
ORDER BY film_actor.film_id