The Best Way to Write a Spring Data Exists Query

Source: jb51    Date: 2022/8/1 19:09:34

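Introduction

In this article, I'm going to show you the best way to write a Spring Data Exists query that is efficient from an SQL perspective.

While doing consulting, I've encountered several commonly used options that developers rely on without knowing there are better alternatives.

Domain Model

Let's assume we have the following Post entity:

The slug property is a business key, meaning it has a unique constraint, which is why we can annotate it with the Hibernate @NaturalId annotation.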

@Entity
@Table(
    name = "post",
    uniqueConstraints = @UniqueConstraint(
        name = "UK_POST_SLUG",
        columnNames = "slug"
    )
)
public class Post {
    @Id
    private Long id;
    private String title;
    @NaturalId
    private String slug;
    public Long getId() {
        return id;
    }
    public Post setId(Long id) {
        this.id = id;
        return this;
    }
    public String getTitle() {
        return title;
    }
    public Post setTitle(String title) {
        this.title = title;
        return this;
    }
    public Post setSlug(String slug) {
        this.slug = slug;
        return this;
    }
}

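How NOT to write an Exists query with Spring Data

First, let's start with several approaches that, while popular, are best avoided.

Emulating existence with a findBy query

Spring Data provides a way to derive queries from method names, so you can write a findBy query to emulate an existence check, like this: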

@Repository
public interface PostRepository 
        extends JpaRepository<Post, Long> {
    Optional<Post> findBySlug(String slug);
}

Since the findBySlug method is meant for fetching a Post entity, I've seen cases where this method was used for existence checks, as in the following example:

assertTrue(
    postRepository.findBySlug(slug).isPresent()
);

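The problem with this approach is that an entire entity is fetched just to check whether there is a record matching the provided filtering criteria: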

SELECT 
    p.id AS id1_0_,
    p.slug AS slug2_0_,
    p.title AS title3_0_
FROM 
    post p
WHERE 
    p.slug = 'high-performance-java-persistence'

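Using a findBy query to fetch an entity just to check its existence is a waste of resources. Not only can you not benefit from a covering query if you have an index on the slug property, but you also have to send the entity result set over the network to the JDBC driver, only to silently discard it.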
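
To make the covering-query point more concrete, here is a toy in-memory model. It is only a sketch and not how a database is actually implemented. The class, field, and method names (CoveringIndexSketch, slugIndex, tableRows, tableReads, findTitleBySlug) are made up for illustration. The unique index is modeled as a sorted map from slug to row id, so an existence check can be answered by the index alone, while fetching the row also requires a read of the table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Toy model of a table with a unique index on slug (all names are hypothetical).
class CoveringIndexSketch {

    // "Table": row id -> title, standing in for the full row payload.
    private final Map<Long, String> tableRows = new HashMap<>();

    // "Unique index": slug -> row id, kept sorted by key.
    private final TreeMap<String, Long> slugIndex = new TreeMap<>();

    // Counts how many times the table itself had to be read.
    int tableReads = 0;

    void insert(long id, String slug, String title) {
        tableRows.put(id, title);
        slugIndex.put(slug, id);
    }

    // Mirrors a findBy-style lookup: index lookup plus a table read for the row.
    Optional<String> findTitleBySlug(String slug) {
        Long id = slugIndex.get(slug);
        if (id == null) {
            return Optional.empty();
        }
        tableReads++;
        return Optional.ofNullable(tableRows.get(id));
    }

    // Mirrors an existence check: the index alone answers the question.
    boolean existsBySlug(String slug) {
        return slugIndex.containsKey(slug);
    }

    public static void main(String[] args) {
        CoveringIndexSketch table = new CoveringIndexSketch();
        table.insert(1L, "high-performance-java-persistence",
                "High-Performance Java Persistence");

        boolean exists = table.existsBySlug("high-performance-java-persistence");
        System.out.println("exists=" + exists + ", tableReads=" + table.tableReads);

        boolean present = table
                .findTitleBySlug("high-performance-java-persistence")
                .isPresent();
        System.out.println("present=" + present + ", tableReads=" + table.tableReads);
    }
}
```

In this model, the existence check never touches tableRows, whereas the findBy-style call pays for one extra table read and then carries the payload back to the caller.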

Checking existence using Query By Example

Another very popular, but inefficient, way to check existence is to use the Query By Example feature:

assertTrue(
    postRepository.exists(
        Example.of(
            new Post().setSlug(slug),
            ExampleMatcher.matching()
                .withIgnorePaths(Post_.ID)
                .withMatcher(Post_.SLUG, exact())
        )
    )
);

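The Query By Example feature builds a Post entity that is used as a reference when matching the properties specified by the provided ExampleMatcher.

When executing the above Query By Example method, Spring Data generates the same SQL query that was generated by the previous findBy method: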

SELECT 
    p.id AS id1_0_,
    p.slug AS slug2_0_,
    p.title AS title3_0_
FROM 
    post p
WHERE 
    p.slug = 'high-performance-java-persistence'

While the Query By Example feature might be useful for fetching entities, using it with the generic exists method of the Spring Data JPA Repository is not very efficient.

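How to write an Exists query with Spring Data

There are better ways to write a Spring Data Exists query.

Checking existence with an existsBy query method

Spring Data offers an existsBy query method, which we can define in the PostRepository as follows: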

@Repository
public interface PostRepository 
        extends JpaRepository<Post, Long> {
    boolean existsBySlug(String slug);
}

When calling the existsBySlug method on PostgreSQL or MySQL:

assertTrue(
    postRepository.existsBySlug(slug)
);

Spring Data generates the following SQL query:

SELECT 
    p.id AS col_0_0_
FROM 
    post p
WHERE 
    p.slug = 'high-performance-java-persistence'
LIMIT 1

The PostgreSQL execution plan for this query looks as follows:

Limit  
    (cost=0.28..8.29 rows=1 width=8) 
    (actual time=0.021..0.021 rows=1 loops=1)
  ->  Index Scan using uk_post_slug on post p  
      (cost=0.28..8.29 rows=1 width=8) 
      (actual time=0.020..0.020 rows=1 loops=1)
        Index Cond: ((slug)::text = 'high-performance-java-persistence'::text)
Planning Time: 0.088 ms
Execution Time: 0.033 ms

And the MySQL one looks like this:

-> Limit: 1 row(s)  
   (cost=0.00 rows=1) 
   (actual time=0.001..0.001 rows=1 loops=1)
    -> Rows fetched before execution  
       (cost=0.00 rows=1) 
       (actual time=0.000..0.000 rows=1 loops=1)

So, the query is very fast, and the extra LIMIT operation doesn't affect performance since it's applied to a one-record result set anyway.
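
As an illustration of where such an existence check is typically used, here is a minimal sketch that picks a free slug before a new Post is saved. The UniqueSlugSketch class and the firstFreeSlug helper are hypothetical and not part of the original example. The check is passed in as a Predicate<String>, so the sketch runs without Spring. In an application, a method reference such as postRepository::existsBySlug would fit that parameter, since its signature takes a String and returns a boolean.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper that relies only on a boolean existence check.
class UniqueSlugSketch {

    // Returns base if it is free, otherwise base-2, base-3, and so on.
    static String firstFreeSlug(String base, Predicate<String> slugExists) {
        if (!slugExists.test(base)) {
            return base;
        }
        int suffix = 2;
        while (slugExists.test(base + "-" + suffix)) {
            suffix++;
        }
        return base + "-" + suffix;
    }

    public static void main(String[] args) {
        // An in-memory set stands in for the post table.
        Set<String> taken = new HashSet<>();
        taken.add("spring-data-exists");
        taken.add("spring-data-exists-2");

        // prints "spring-data-exists-3"
        System.out.println(firstFreeSlug("spring-data-exists", taken::contains));
    }
}
```

Because only a boolean is needed at each step, no entity ever has to be loaded. Note that a check followed by an insert is not atomic, so the unique constraint on slug remains the final safeguard against duplicates.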

Checking existence with a COUNT SQL query

Another option to emulate existence is to use a COUNT query:

@Repository
public interface PostRepository 
        extends JpaRepository<Post, Long> {
    @Query(value = """
        select count(p.id) > 0
        from Post p
        where p.slug = :slug
        """
    )
    boolean existsBySlugWithCount(@Param("slug") String slug);
}

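The COUNT query works fine in this particular case since we are matching a UNIQUE column value.

However, in general, for queries whose result sets contain more than one record, you should prefer EXISTS over COUNT, as explained by Lukas Eder in this article.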
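
The reasoning behind preferring EXISTS for multi-row matches can be sketched with a plain-Java analogy. This is a simplified model that scans an in-memory list and says nothing about how a particular database plans the query. The ExistsVsCountSketch class and its methods are hypothetical. A COUNT-style check has to visit every candidate row before comparing the total, while an EXISTS-style check can stop at the first match.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy for COUNT versus EXISTS (all names are hypothetical).
class ExistsVsCountSketch {

    // Number of rows inspected by the most recent call.
    static int rowsVisited;

    // COUNT-style check: every row is inspected before the comparison.
    static boolean existsViaCount(List<String> rows, String value) {
        rowsVisited = 0;
        int count = 0;
        for (String row : rows) {
            rowsVisited++;
            if (row.equals(value)) {
                count++;
            }
        }
        return count > 0;
    }

    // EXISTS-style check: the scan stops at the first matching row.
    static boolean existsViaShortCircuit(List<String> rows, String value) {
        rowsVisited = 0;
        for (String row : rows) {
            rowsVisited++;
            if (row.equals(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A non-unique column, so several rows match the filter.
        List<String> statuses = Arrays.asList(
                "draft", "published", "published", "draft", "published");

        System.out.println(existsViaCount(statuses, "published")
                + " after visiting " + rowsVisited + " rows");
        System.out.println(existsViaShortCircuit(statuses, "published")
                + " after visiting " + rowsVisited + " rows");
    }
}
```

Both methods return the same answer, but the short-circuit version does less work as soon as more than one row matches, which is the situation the advice above is about.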

When calling the existsBySlugWithCount method on PostgreSQL and MySQL:

assertTrue(
    postRepository.existsBySlugWithCount(slug)
);

Spring Data executes the following SQL query:

SELECT 
    count(p.id) > 0 AS col_0_0_
FROM 
    post p
WHERE 
    p.slug = 'high-performance-java-persistence'

And the PostgreSQL execution plan for this query looks as follows:

Aggregate  
  (cost=8.29..8.31 rows=1 width=1) 
  (actual time=0.023..0.024 rows=1 loops=1)
  ->  Index Scan using uk_post_slug on post p  
      (cost=0.28..8.29 rows=1 width=8) 
      (actual time=0.019..0.020 rows=1 loops=1)
        Index Cond: ((slug)::text = 'high-performance-java-persistence'::text)
Planning Time: 0.091 ms
Execution Time: 0.044 ms

And on MySQL:

-> Aggregate: count('1')  
   (actual time=0.002..0.002 rows=1 loops=1)
    -> Rows fetched before execution  
       (cost=0.00 rows=1) 
       (actual time=0.000..0.000 rows=1 loops=1)

Although the COUNT operation adds an extra Aggregate step, that step is very fast since there is only one record to count.

Checking existence with a CASE WHEN EXISTS SQL query

The last option to emulate existence is to use a CASE WHEN EXISTS native SQL query:

@Repository
public interface PostRepository 
        extends JpaRepository<Post, Long> {
    @Query(value = """
        SELECT 
            CASE WHEN EXISTS (
                SELECT 1 
                FROM post 
                WHERE slug = :slug
            ) 
            THEN 'true' 
            ELSE 'false'
            END
        """,
        nativeQuery = true
    )
    boolean existsBySlugWithCase(@Param("slug") String slug);
}

And we can call the existsBySlugWithCase method like this:

assertTrue(
    postRepository.existsBySlugWithCase(slug)
);

The PostgreSQL execution plan for this query looks as follows:

Result  
  (cost=8.29..8.29 rows=1 width=1) 
  (actual time=0.021..0.022 rows=1 loops=1)
  InitPlan 1 (returns $0)
    ->  Index Only Scan using uk_post_slug on post  
          (cost=0.27..8.29 rows=1 width=0) 
          (actual time=0.020..0.020 rows=1 loops=1)
          Index Cond: (slug = 'high-performance-java-persistence'::text)
          Heap Fetches: 1
Planning Time: 0.097 ms
Execution Time: 0.037 ms

And on MySQL:

-> Rows fetched before execution  
   (cost=0.00 rows=1) 
   (actual time=0.000..0.000 rows=1 loops=1)
-> Select #2 (subquery in projection; run only once)
    -> Limit: 1 row(s)  
        (cost=0.00 rows=1) 
        (actual time=0.000..0.001 rows=1 loops=1)
        -> Rows fetched before execution  
           (cost=0.00 rows=1) 
           (actual time=0.000..0.000 rows=1 loops=1)