Comparing face feature vectors in PostgreSQL 14 with the pgvector extension

Install PostgreSQL

Install the PGDG yum repository package
rpm -Uvh https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
Install postgresql14-server with yum
yum install postgresql14-server
Initialize the database cluster
/usr/pgsql-14/bin/postgresql-14-setup initdb

Install the PostgreSQL development package with yum (its headers are needed to build pgvector)

yum install postgresql14-devel.x86_64

Install the pgvector extension

cd /tmp
git clone --branch v0.4.2 https://github.com/pgvector/pgvector.git
cd pgvector
make && make install

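Enable the pgvector extension

Connect to PostgreSQL with a client (the server must be running, for example after systemctl start postgresql-14)

Run CREATE EXTENSION vector; to enable the vector extension in the current database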
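
To confirm that the extension is active in the database you are connected to, one option is to look it up in the standard pg_extension catalog. This is only a quick check; the version it reports depends on the tag that was built (v0.4.2 in the steps above).

```sql
-- List the vector extension and its installed version, if present.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';
```

An empty result means CREATE EXTENSION has not been run in this particular database yet.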

Using pgvector

CREATE TABLE public.tfacerecord (
	f_id varchar NULL,
	f_uuid varchar NULL,
	f_feature vector NULL
);
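f_feature is the column that stores the face feature vector. An ivfflat index can only be built on a column with a declared dimension, so for indexing the column should be declared as vector(N), where N is the length of the feature vector.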
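
As a minimal sketch of how rows go into such a table, the example below uses a hypothetical table named tfacerecord_demo with a dimension of 3. Both the table name and the dimension are assumptions chosen to keep the example readable; real face features are much longer.

```sql
-- Hypothetical demo table: same columns as tfacerecord, but with a
-- declared dimension (3 here only for illustration).
CREATE TABLE tfacerecord_demo (
    f_id      varchar,
    f_uuid    varchar,
    f_feature vector(3)
);

-- Vectors are written as bracketed, comma-separated text literals.
INSERT INTO tfacerecord_demo (f_id, f_uuid, f_feature) VALUES
    ('1', 'uuid-a', '[0.10,0.20,0.30]'),
    ('2', 'uuid-b', '[0.30,0.20,0.10]');

-- A literal of the wrong length is rejected, which protects against
-- mixing feature vectors of different sizes in one column:
-- INSERT INTO tfacerecord_demo (f_feature) VALUES ('[0.1,0.2]');
```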

# Create an index (pick the operator class that matches the distance operator used in queries)
L2 distance (Euclidean distance)
CREATE INDEX ON tfacerecord USING ivfflat (f_feature vector_l2_ops) WITH (lists = 100);
Inner product
CREATE INDEX ON tfacerecord USING ivfflat (f_feature vector_ip_ops) WITH (lists = 100);
Cosine distance
CREATE INDEX ON tfacerecord USING ivfflat (f_feature vector_cosine_ops) WITH (lists = 100);
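
Each operator class corresponds to one pgvector distance operator. The query below is a small illustration on two made-up 3-dimensional vectors showing what each operator returns. Note in particular that <#> returns the negative inner product.

```sql
-- <-> : Euclidean (L2) distance
-- <#> : NEGATIVE inner product (multiply by -1 to get the inner product)
-- <=> : cosine distance (1 - cosine similarity)
SELECT '[1,2,3]'::vector <-> '[3,2,1]'::vector         AS l2_distance,       -- sqrt(8), about 2.83
       ('[1,2,3]'::vector <#> '[3,2,1]'::vector) * -1  AS inner_product,     -- 10
       1 - ('[1,2,3]'::vector <=> '[3,2,1]'::vector)   AS cosine_similarity; -- 10/14, about 0.71
```

For feature vectors that are normalized to unit length, the inner product and the cosine similarity are the same value.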

Test

Query time with the index: 29 ms

Query time without the index: 32 s
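
One way to reproduce this kind of comparison is to run the same nearest-neighbor query with EXPLAIN ANALYZE, once normally and once with index scans discouraged. The sketch below assumes a vector_l2_ops index exists on tfacerecord; the 3-element literal is a placeholder and must be replaced with a feature vector of the real length.

```sql
-- With the index available, the plan should show an Index Scan
-- using the ivfflat index.
EXPLAIN ANALYZE
SELECT f_id
FROM tfacerecord
ORDER BY f_feature <-> '[0.1,0.2,0.3]'::vector   -- placeholder vector
LIMIT 5;

-- The same query with index scans discouraged, for this transaction only.
BEGIN;
SET LOCAL enable_indexscan = off;
EXPLAIN ANALYZE
SELECT f_id
FROM tfacerecord
ORDER BY f_feature <-> '[0.1,0.2,0.3]'::vector   -- placeholder vector
LIMIT 5;
ROLLBACK;
```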

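Notes

To use the ivfflat index, the query must ORDER BY the distance expression (normally together with a LIMIT).
Apply the distance threshold afterwards, to the rows returned by that ORDER BY.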
select * from tfacerecord t where 
(f_feature <#> '[0.015339759,0.016458405,.....,-0.010708194,0.030958941,-0.010478517]'::vector)<-0.5
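Putting the distance calculation directly in the WHERE clause, as in the query above, does not use the index: the whole table is scanned, the distance is computed for every row, and only then are the rows filtered. (The <#> operator returns the negative inner product, which equals the negative cosine similarity only when the vectors are normalized.)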
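
A possible index-friendly rewrite is to take the nearest rows with ORDER BY and LIMIT in a subquery and apply the threshold outside it. This is a sketch rather than the author's exact query: it assumes a vector_ip_ops index on f_feature (the operator class that matches <#>), the LIMIT of 10 and the probes value are arbitrary, and the 3-element literal is a placeholder for a full-length feature vector.

```sql
-- Optional: probe more lists for better recall at some cost in speed
-- (the default is 1).
SET ivfflat.probes = 10;

SELECT *
FROM (
    -- Inner query: ORDER BY + LIMIT lets the planner use the ivfflat index.
    SELECT t.f_id,
           t.f_uuid,
           t.f_feature <#> '[0.1,0.2,0.3]'::vector AS neg_inner_product
    FROM tfacerecord t
    ORDER BY t.f_feature <#> '[0.1,0.2,0.3]'::vector
    LIMIT 10
) AS nearest
-- Outer query: keep only candidates that pass the threshold.
WHERE neg_inner_product < -0.5;
```

Because the filter runs after the LIMIT, the result can contain fewer than 10 rows, and ivfflat is an approximate index, so a very close row can occasionally be missed when probes is low.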