记录一次Django索引优化的场景

106 阅读1分钟

场景:

  1. 抓取了58同城大概30万条的小区数据,在进行查询的时候:House58.objects.get(id_58=query_id)
  2. Model 的定义:
class House58(models.Model):
    id_58 = models.IntegerField(verbose_name=u'id_58', default=0)
    title = models.CharField(max_length=200, verbose_name=u'title', default='', null=True)
  1. 发现查询耗时竟然达到了 1s 多,很不能理解。耗时计算方法,通过在计算语句执行前后的 time.time() 或者 根据 time.clock() 来将获得
time start: 1538630867.454437
house_id: 3249
单次查询耗时:1.1995608806610107
1747
0.993959903717041
4031
0.8833990097045898
14218
0.8818538188934326
3351
0.8805160522460938
383960
0.8870620727539062
10480
0.7526187896728516
16502
0.6564738750457764
882593
0.6645219326019287
17714
0.5870890617370605
384529
0.5545010566711426
966445
0.465717077255249
966467
0.4913499355316162
965821
0.5021908283233643
965846
0.5379748344421387
966338
0.4857180118560791
966432
0.486743688583374
2424578
0.7989497184753418
time end: 1538630879.178473
time cost: 11.72403597831726
  1. 于是查询了常见的优化手段, 先在 id_58 字段上添加了db_index
class House58(models.Model):
    id_58 = models.IntegerField(verbose_name=u'id_58', default=0, db_index=True)
    title = models.CharField(max_length=200, verbose_name=u'title', default='', null=True)
  1. 再次查询
time start: 1538632284.394676
house_id: 3249
单次查询耗时:   0.0076982975006103516
没index的耗时: 1.1995608806610107
1747
0.012246131896972656
4031
0.009319305419921875
14218
0.009611129760742188
3351
0.00624394416809082
383960
0.006884098052978516
10480
0.009610891342163086
16502
0.007793903350830078
882593
0.010465145111083984
17714
0.008291006088256836
384529
0.005499839782714844
966445
0.006849050521850586
966467
0.006381988525390625
965821
0.007551908493041992
965846
0.005231142044067383
966338
0.004698991775512695
966432
0.007067203521728516
2424578
0.00585484504699707
1538632284.610323
总耗时:0.21564698219299316

但是一次查询耗时0.2 ms 仍然很长,可以考虑使用redis 做cache