django REST framework 在django程序中不只是帮助提供REST风格的接口;还有接口的系列化输出;权限管理等; 不过在序列化过程中,如果什么都使用默认的方式,可能导致序列化时多次查询数据库,造成性能问题(示例来源):
- model:
from django.db import models
class Category(models.Model):
name = models.CharField(max_length=100)
class Blogpost(models.Model):
title = models.CharField(max_length=100)
categories = models.ManyToManyField(Category)
- url + view:
# urls.py
from rest_framework import routers
from . import views
router = routers.DefaultRouter()
router.register(r'blogposts', views.BlogpostViewSet)
---------------------------------------------------------
# views.py
from rest_framework import viewsets
class BlogpostViewSet(viewsets.ModelViewSet):
queryset = Blogpost.objects.all().order_by('date')
serializer_class = serializers.BlogpostSerializer
等这些做完,调用接口,我们在控制台将sql输出的话,就会看到像这样的东西:
# sql输出设置
LOGGING = {
...
"loggers": {
...
"django.db.backends": {
"handlers": ["console"],
"level": "DEBUG",
"propagate": False,
}
}
}
-- sql 输出结果
SELECT "app_blogpost"."id", "app_blogpost"."title" FROM "app_blogpost";
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 1025;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 193;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 757;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 853;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 1116;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 1126;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 964;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 591;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 1112;
SELECT "app_category"."id", "app_category"."name" FROM "app_category" INNER JOIN "app_blogpost_categories" ON ("app_category"."id" = "app_blogpost_categories"."category_id") WHERE "app_blogpost_categories"."blogpost_id" = 1034;
...
是不是吓了一跳?在序列化时,原生的方式是逐条拿出来,再去查找对应的关联对象;笔者的业务表中有四五个外键关联对象,点完那个接口后好几屏的sql铺面而来;如果数据量大的话,这对数据库绝对是个很大的压力!后面去网上找了下,找到了比较适合的解决方案,下面分享下: 1. 序列化中需要使用到一对多的对象中的字段的:
# views.py
queryset = BusinessTable.objects.all().select_related(
'one_to_one_field_1',
'one_to_one_field_2',
...
)
---------------------
# serializer
class xxxSerializer(serializers.ModelSerializer):
target_field = serializers.SerializerMethodField()
def get_target_field(self, obj):
...
return obj.one_to_one_field_1.target_field
2. 序列化中需要使用到一对多,多对多的对象中的字段的
queryset = BusinessTable.objects.all().prefetch_related(
'many_to_many_field_1',
'many_to_many_field_2',
...
)
---------------------
# serializer
class xxxSerializer(serializers.ModelSerializer):
target_field_set = serializers.SerializerMethodField()
def get_target_field(self, obj):
...
return [it.arget_field for it in obj.many_to_many_field_1.all()]
3. 序列化中需要对多对多的字段聚合的
queryset = BusinessTable.objects.all().annotate(
cnt_xxx=Count("many_to_many_field")
)
---------------------
# serializer
class xxxSerializer(serializers.ModelSerializer):
target_field = serializers.SerializerMethodField()
def get_target_field(self, obj):
return obj.cnt_xxx
4. 总结
queryset = BlessingBlog.objects.all().annotate(
cnt_xxx=Count("xxx") # 需要聚合的新命名字段,因annotate对语句的解释,django其实会在外部语句执行联合查询,再分组,而不是直接走子查询。所以这里其实是取巧,所以要这样用只能加一个,否则要出错,要用在子查询来聚合的可以用RowSql或extra!
).select_related( # 一对一关联表数据
'yyy1',
'yyy2',
).prefetch_related( # 一对多,多对多关联表数据
'xxx1',
'xxx3',
).order_by('zzz1', 'zzz2')
最后,如果使用了annotate,系统给出的查询语句对聚合字段的处理不是通过子查询实现的,是通过联合查询再分组来实现的,这对性能也有很大的影响,而且聚合多个字段的话还要注意去重,如果对性能要求高的,可以考虑自己手写查询语句。