简介
在这篇文章中,我们将看到为什么Spring Data findAll方法实际上是一个糟糕的反模式。
一张图片胜过1000字。


你会惊讶地发现,有多少次我看到这样的问题:从数据库中获取大量的数据,只是为了在Java中进行过滤,并丢弃绝大部分不需要的记录,以满足特定的业务需求。
反模式
首先,让我们从维基百科给出的反模式定义开始。
在软件工程、项目管理和业务流程中,反模式是对一个反复出现的问题的常见反应,通常是无效的,并有可能产生很大的反作用。
这正是你在Spring DataRepository 基础接口中提供一个findAll 方法时得到的结果,所有的数据访问对象将自动继承这个方法。
Spring Data Repository的层次结构
通常,当你创建一个Spring Data JPARepository 时,你的自定义接口会扩展JpaRepository 。
@Repository
public interface PostRepository
extends JpaRepository<Post, Long> {
}
然而,JpaRepository 提供了一系列的方法,它要么直接声明,要么从底层基础接口继承,如下图所示。


即使JpaReository 没有明确地提供一个findAll 方法,它仍然会从所有Spring Data模块共享的CrudRepository 继承一个。
Spring Data findAll反模式
findAll 方法的问题是,作为一个API设计者,你不应该为你的客户提供一个获取整个数据库表的方法,因为如果你这样做了,有些开发者确实会这样做。
让我们假设我们的系统中有以下Post 和Tag 实体。
List<Tag> tags = List.of(
new Tag()
.setId(1L)
.setName("JDBC"),
new Tag()
.setId(2L)
.setName("JPA"),
new Tag()
.setId(3L)
.setName("Hibernate")
);
tags.forEach(tag -> entityManager.persist(tag));
for (long i = 1; i <= POST_SIZE; i++) {
entityManager.persist(
new Post()
.setId(i)
.setTitle(
String.format(
"High-Performance Java Persistence, Part %d",
i
)
)
.addTag(tags.get((int) i % 3))
);
}
我们想获得所有JDBC或Hibernate帖子的标题,那么我们如何才能做到这一点?
最简单的方法当然是使用一个单一的SQL查询,看起来如下。
public List<String> findPostTitleByTags(List<String> tags) {
return entityManager.createNativeQuery("""
select p.title
from post p
where exists (
select 1
from post_tag pt
join tag t on pt.tag_id = t.id and pt.post_id = p.id
where t.name in (:tags)
)
order by p.id
""")
.setParameter("tags", tags)
.getResultList();
}
但是,并不是每个Java开发者都喜欢用一个简单的SQL查询来解决这个问题。他们中的一些人会不惜一切代价来避免编写任何SQL查询。
而且,由于我们离 "解决 "这个问题只差一个findAll 方法的调用,在有人提供以下解决方案之前,不会花很长时间。
List<String> postTitlesStreamRecords = postRepository.findAll()
.stream()
.filter(
post -> post.getTags().stream()
.map(Tag::getName)
.anyMatch(matchingTags::contains)
)
.sorted(Comparator.comparing(Post::getId))
.map(Post::getTitle)
.collect(Collectors.toList());
当执行findAll 工作方法时,Hibernate将执行以下查询。
Query:["select post0_.id as id1_1_, post0_.title as title2_1_ from post post0_"], Params:[()]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(1)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(2)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(3)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(4)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(5)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(6)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(7)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(8)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(9)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(10)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(11)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(12)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(13)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(14)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(15)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(16)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(17)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(18)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(19)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(20)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(21)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(22)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(23)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(24)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(25)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(26)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(27)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(28)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(29)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(30)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(31)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(32)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(33)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(34)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(35)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(36)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(37)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(38)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(39)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(40)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(41)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(42)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(43)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(44)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(45)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(46)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(47)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(48)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(49)]
Query:["select tags0_.post_id as post_id1_4_0_, tags0_.tag_id as tag_id2_4_0_, tag1_.id as id1_5_1_, tag1_.name as name2_5_1_ from post_tag tags0_ inner join tag tag1_ on tags0_.tag_id=tag1_.id where tags0_.post_id=?"], Params:[(50)]
不太好吧!
你之所以看到这么多的二次查询是由于filter 谓词需要获取tags 集合以匹配提供的标签。
防止Spring Data fetchAll反模式的发生
findAll 方法不应该被默认提供,这就是为什么 HibernateRepository从Hibernate Types项目中,废弃了它。
此外,如果你调用它,它将抛出一个 UnsupportedOperationException.
为了确保findAll 方法不会被错误地调用,只需让你的Repository 类扩展HibernateRepository 接口。
@Repository
public interface PostRepository extends
HibernateRepository<Post>,
JpaRepository<Post, Long>, CustomPostRepository {
}
并且,将com.vladmihalcea.spring.repository 包提供给@EnableJpaRepositories 注解。
@EnableJpaRepositories(
basePackages = {
"com.vladmihalcea.spring.repository",
...
}
)
这样一来,你就会立刻发现,findAll 方法是应该避免的。


然而,如果你确定对于特定的使用情况,有一个findAll 方法是没有问题的,那么你可以简单地覆盖它,如下所示。
@Repository
public interface PostRepository extends HibernateRepository<Post>,
JpaRepository<Post, Long>, CustomPostRepository {
@Query("from Post")
@Override
List<Post> findAll();
}
这样一来,你就会明确说明确实需要findAll 方法,尽管我怀疑在大多数非琐碎的应用中会出现这种情况。
Hibernate隐式多态查询反模式
findAll 反模式让我想起了Hibernate的一个不太知名的 "功能",它允许你用一个查询来选择整个数据库。
//Don't try this at home!
List<Object> objects = entityManager.createQuery("""
select o
from java.lang.Object o
""")
.getResultList();
这个查询之所以有效,是因为Hibernate提供的隐式多态查询功能。你之所以没有看到这种反模式在实践中的应用,是因为Hibernate的开发者认为鼓励一次性获取整个数据库并不是一个好主意,所以他们在文档中省略了这一点。
在内部,Hibernate团队在测试中确实使用了这样的查询。
protected void cleanupTestData() {
doInHibernate(this::sessionFactory, session -> {
session.createQuery(
"delete from java.lang.Object"
).executeUpdate();
});
}
但是,由于这只限于Hibernate的集成测试,所以使用这样一个危险的查询是没有问题的。而且,有趣的是,并不是所有的Hibernate测试都使用它。有些Hibernate集成测试只是在测试之间删除并重新创建DB模式,所以删除所有记录是没有意义的。
结论
Spring Data的findAll方法完全遵循了维基百科对反模式的定义,如果你默认避免它,那就更好了,因为这样一来,你可以防止一些非常低效的数据访问操作,在你的生产系统中拥有足够多的数据之前,这些操作可能会被忽略掉。
尽管如此,仅仅因为你废弃了默认的findAll 方法,这并不意味着你不能明确地声明它。然而,这样一来,你会通知其他开发者,你真的知道自己在做什么,而且你仍然可以享受在你的应用程序内存中加载整个数据库表。