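This post is a personal study note. It walks through using ChatGPT to do code review on GitLab, mainly as a way to get familiar with ChatGPT.
I recently came across ChatGPT-CodeReview, a project that uses ChatGPT for code review. It implements a code review bot: when you open a new pull request on GitHub, the bot reviews the code automatically, and the review comments appear in the PR timeline / file changes. Since I had been wanting to learn about ChatGPT anyway, I followed its example and built a similar feature on GitLab to learn how to use ChatGPT.
Data requests
Node.js projects that talk to ChatGPT usually use the chatgpt library. It relies on fetch for its requests, so it places a requirement on the Node.js version: anything below 18 needs a polyfill. Because access from mainland China requires a proxy, among other reasons, all sorts of unexpected problems can show up while debugging, such as fetch repeatedly failing to reach ChatGPT. In the end I dropped the library and wrote a thin Axios wrapper for the API requests: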
import axios from 'axios';
import type { InternalAxiosRequestConfig, AxiosResponse, AxiosError } from 'axios';
const createRequest = (
  host: string,
  { headers, data, params }: { headers?: Record<string, string>; data?: Record<string, any>; params?: Record<string, any> }
) => {
  const instance = axios.create({
    baseURL: host,
    // timeout: 5000,
  });
  instance.interceptors.request.use(
    function (config: InternalAxiosRequestConfig) {
      // Merge the instance-wide defaults into every request;
      // values passed on the individual call take precedence.
      if (params) {
        config.params = { ...params, ...config.params };
      }
      if (headers) {
        config.headers.set(headers);
      }
      if (data) {
        config.data = { ...data, ...config.data };
      }
      return config;
    },
    function (error: AxiosError) {
      // Pass request setup errors through unchanged
      return Promise.reject(error);
    }
  );
  instance.interceptors.response.use(
    function (response: AxiosResponse) {
      return response;
    },
    function (error: AxiosError) {
      // Any status code outside the 2xx range triggers this handler:
      // log the error, then let the caller handle the rejection
      console.log(error);
      return Promise.reject(error);
    }
  );
  return instance;
};
export default createRequest;
The wrapper only sets default request headers, query parameters, and body fields. It is used below to call both the ChatGPT API and the GitLab API.
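To make the merge order in the request interceptor concrete, here is a minimal sketch with no Axios dependency. `mergeDefaults` is a hypothetical helper, not part of the repository; it only mirrors the `{ ...params, ...config.params }` spread used above, where values supplied on an individual call override the instance-wide defaults.

```typescript
// Hypothetical helper (not in the original code): mirrors the spread order
// used in the request interceptor, where per-request values win.
type Dict = Record<string, unknown>;

function mergeDefaults(defaults: Dict | undefined, perRequest: Dict | undefined): Dict {
  // Later spreads overwrite earlier ones, so perRequest takes precedence.
  return { ...(defaults ?? {}), ...(perRequest ?? {}) };
}

// Instance-wide defaults, similar to what is passed to createRequest(...).
const instanceDefaults: Dict = { model: 'gpt-3.5-turbo', stream: false, max_tokens: 1000 };
// Body passed to one individual post(...) call.
const callBody: Dict = { messages: [{ role: 'user', content: 'Hello!' }], max_tokens: 500 };

const mergedBody = mergeDefaults(instanceDefaults, callBody);
console.log(mergedBody.model, mergedBody.max_tokens); // gpt-3.5-turbo 500
```

This is why the ChatGPT class later only needs to pass `messages` on each call: the model and sampling settings come from the defaults.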
ChatGPT API
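The official documentation covers the ChatGPT API, so I won't repeat it here. If you are not sure whether you can reach the ChatGPT API, a quick curl will tell you; you don't even need to fill in OPENAI_API_KEY: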
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
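If the request gets through (here without an API key), you will see the response below; otherwise the request fails with a timeout error.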
{
  "error": {
    "message": "You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}
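The same check can be expressed in code. This is a minimal sketch, assuming the error body has the shape shown above; `ProbeResult` and `interpretProbe` are hypothetical names used only for illustration. The point is that any HTTP response, even an authentication error, shows the endpoint is reachable, whereas no response at all points to a network problem.

```typescript
// Shape of the error body shown above; only the fields we inspect are typed.
interface OpenAIErrorBody {
  error?: { message?: string; type?: string };
}

type ProbeResult = 'ok' | 'reachable-with-error' | 'unreachable';

// Hypothetical helper: `body` is the parsed JSON response, or null when the
// request timed out or failed before any response arrived.
function interpretProbe(body: OpenAIErrorBody | null): ProbeResult {
  if (body === null) return 'unreachable';
  // An error object (for example a missing API key) still proves the API answered.
  return body.error ? 'reachable-with-error' : 'ok';
}

const missingKeyBody: OpenAIErrorBody = {
  error: { message: "You didn't provide an API key.", type: 'invalid_request_error' },
};

console.log(interpretProbe(missingKeyBody)); // reachable-with-error
console.log(interpretProbe(null)); // unreachable
```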
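The /v1/chat/completions endpoint completes one round of conversation with ChatGPT. We use it here to ask ChatGPT for a code review:
- A ChatGPT class is defined; its constructor builds a request object from the relevant parameters
- The sendMessage method sends the code to review, along with the relevant context, to ChatGPT
- The codeReview method is exposed publicly; it calls sendMessage to carry out the review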
import createRequest from './request';
import { logger } from './utils';
import type { AxiosInstance } from 'axios';
import type { ChatGPTConfig } from './types';
export default class ChatGPT {
  private language: string;
  private request: AxiosInstance;
  constructor(config: ChatGPTConfig) {
    const host = 'https://api.openai.com';
    this.request = createRequest(host, {
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${config.apiKey}`,
      },
      data: {
        model: config.model || 'gpt-3.5-turbo',
        // Note: because of `|| 1`, a configured value of 0 (or a non-numeric one) falls back to 1.
        temperature: +(config.temperature || 0) || 1,
        top_p: +(config.top_p || 0) || 1,
        presence_penalty: 1,
        stream: false,
        max_tokens: 1000,
      },
    });
    this.language = config.language || 'Chinese';
  }
  // Build the review prompt; the patch is appended on its own line.
  private generatePrompt = (patch: string) => {
    const answerLanguage = `Answer me in ${this.language},`;
    return `Below is the gitlab code patch, please help me do a brief code review,${answerLanguage} if any bug risk and improvement suggestion are welcome
${patch}
`;
  };
  // Send one system message plus the user prompt to the chat completions endpoint.
  private sendMessage = async (msg: string) => {
    const currentDate = new Date().toISOString().split('T')[0];
    return this.request.post('/v1/chat/completions', {
      messages: [
        {
          role: 'system',
          content:
            'You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\n' +
            'Knowledge cutoff: 2021-09-01\n' +
            `Current date: ${currentDate}`,
        },
        { role: 'user', content: msg, name: undefined },
      ],
    });
  };
  // Public entry point: returns the review text for one patch.
  public codeReview = async (patch: string) => {
    if (!patch) {
      logger.error('patch is empty');
      return '';
    }
    const prompt = this.generatePrompt(patch);
    const res = await this.sendMessage(prompt);
    const { choices } = res.data;
    if (Array.isArray(choices) && choices.length > 0) {
      return choices[0]?.message?.content;
    }
    return '';
  };
}
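This code is not particularly complicated; it is mostly parameter definitions and a single API request. The patch parameter of the codeReview method is the code snippet to be reviewed. The code borrows heavily from the ChatGPT-CodeReview project mentioned earlier.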
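The response handling in `codeReview` depends on the shape of the chat completions response. Below is a minimal sketch of that shape and of the extraction step, using a hand-written sample object rather than a real API call. `ChatCompletionLike` is a simplified, assumed type that lists only the fields read above, and `extractReview` is a hypothetical name; unlike the original, it falls back to an empty string when `content` is missing.

```typescript
// Simplified view of the response: only the fields that codeReview reads.
interface ChatCompletionLike {
  choices?: Array<{ message?: { role?: string; content?: string } }>;
}

// Return the first choice's text, or '' when there is nothing usable.
function extractReview(data: ChatCompletionLike): string {
  const { choices } = data;
  if (Array.isArray(choices) && choices.length > 0) {
    return choices[0]?.message?.content ?? '';
  }
  return '';
}

// Hand-written sample, not a recorded API response.
const sampleResponse: ChatCompletionLike = {
  choices: [{ message: { role: 'assistant', content: 'Looks good overall.' } }],
};

console.log(extractReview(sampleResponse)); // Looks good overall.
console.log(extractReview({}) === ''); // true
```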
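GitLab API
Two APIs are involved:
- The first fetches the code changes of a Merge Request: /api/v4/projects/${projectId}/merge_requests/${mergeRequestIId}/changes. The changes field of the response contains all the code changes; we just pass them to ChatGPT one file at a time.
- The second writes a comment to the Merge Request: /api/v4/projects/${projectId}/merge_requests/${mergeRequestIId}/discussions. This one needs some extra processing, because we want to attach ChatGPT's reply as a comment on the last line of each file's diff in the Merge Request. Each line of a GitLab diff is in one of three states, marked by its first character: '+' (added), '-' (removed), or ' ' (a space, meaning unchanged). If the last line is '+', pass new_line and new_path to the API; if it is '-', pass old_line and old_path; if it is unchanged, pass all of new_line, new_path, old_line, and old_path. The main diff-handling code is as follows: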
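const parseLastDiff = (gitDiff: string) => {
  // Walk the diff from the bottom up.
  const diffList = gitDiff.split('\n').reverse();
  // Assumes the diff ends with a newline, so diffList[1] is the last real diff line.
  const lastLineFirstChar = diffList?.[1]?.[0];
  // The first hunk header found from the bottom belongs to the last hunk.
  const lastDiff =
    diffList.find((item) => {
      return /^@@ \-\d+,\d+ \+\d+,\d+ @@/g.test(item);
    }) || '';
  // start + count is the line just past the hunk on each side.
  const [lastOldLineCount, lastNewLineCount] = lastDiff
    .replace(/@@ \-(\d+),(\d+) \+(\d+),(\d+) @@.*/g, ($0, $1, $2, $3, $4) => {
      return `${+$1 + +$2},${+$3 + +$4}`;
    })
    .split(',');
  // No usable hunk header: signal failure with -1 on both sides.
  if (!/^\d+$/.test(lastOldLineCount) || !/^\d+$/.test(lastNewLineCount)) {
    return {
      lastOldLine: -1,
      lastNewLine: -1,
    };
  }
  // An added last line has no old-side number; a removed one has no new-side number.
  const lastOldLine = lastLineFirstChar === '+' ? -1 : (parseInt(lastOldLineCount) || 0) - 1;
  const lastNewLine = lastLineFirstChar === '-' ? -1 : (parseInt(lastNewLineCount) || 0) - 1;
  return {
    lastOldLine,
    lastNewLine,
  };
};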
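A worked example may help with the arithmetic in `parseLastDiff`. For a hunk header `@@ -10,7 +10,8 @@`, the old side starts at line 10 and spans 7 lines, so its last line is 10 + 7 - 1 = 16; the new side ends at 10 + 8 - 1 = 17. The sketch below isolates just that step. `lastLinesOfHunk` is a hypothetical name, and like the original it assumes both line counts are present in the header, so a header such as `@@ -1 +1 @@` is not recognized.

```typescript
// Hypothetical helper: compute the last line number on each side of one hunk.
function lastLinesOfHunk(header: string): { lastOldLine: number; lastNewLine: number } | null {
  // Capture start and count for the old (-) and new (+) side.
  const match = /^@@ -(\d+),(\d+) \+(\d+),(\d+) @@/.exec(header);
  if (!match) return null;
  const oldStart = Number(match[1]);
  const oldCount = Number(match[2]);
  const newStart = Number(match[3]);
  const newCount = Number(match[4]);
  // start + count is one past the end, hence the -1.
  return {
    lastOldLine: oldStart + oldCount - 1,
    lastNewLine: newStart + newCount - 1,
  };
}

const hunkResult = lastLinesOfHunk('@@ -10,7 +10,8 @@ function foo() {');
console.log(hunkResult); // { lastOldLine: 16, lastNewLine: 17 }
```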
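The GitLab API requests are implemented as follows:
- A Gitlab class is defined; its getChanges and postComment methods call the two APIs described above
- The target parameter restricts which file types get reviewed
- The codeReview method is the publicly exposed entry point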
import camelCase from 'camelcase';
import createRequest from './request';
import { logger } from './utils';
import type { GitlabConfig, GitlabDiffRef, GitlabChange } from './types';
import type { AxiosInstance } from 'axios';
const formatByCamelCase = (obj: Record<string, any>) => {
  const target = Object.keys(obj).reduce((result, key) => {
    const newkey = camelCase(key);
    return { ...result, [newkey]: obj[key] };
  }, {});
  return target;
};
export default class Gitlab {
  private projectId: string | number;
  private mrIId: number | string;
  private request: AxiosInstance;
  private target: RegExp;
  constructor({ host, token, projectId, mrIId, target }: GitlabConfig) {
    this.request = createRequest(host, { params: { private_token: token } });
    this.mrIId = mrIId;
    this.projectId = projectId;
    this.target = target || /\.(j|t)sx?$/;
  }
  getChanges() {
    /** https://docs.gitlab.com/ee/api/merge_requests.html#get-single-merge-request-changes */
    return this.request
      .get(`/api/v4/projects/${this.projectId}/merge_requests/${this.mrIId}/changes`)
      .then((res) => {
        const { changes, diff_refs: diffRef, state } = res.data;
        const codeChanges: GitlabChange[] = changes
          .map((item: Record<string, any>) => formatByCamelCase(item))
          .filter((item: GitlabChange) => {
            const { newPath, renamedFile, deletedFile } = item;
            if (renamedFile || deletedFile) {
              return false;
            }
            if (!this.target.test(newPath)) {
              return false;
            }
            return true;
          })
          .map((item: GitlabChange) => {
            const { lastOldLine, lastNewLine } = parseLastDiff(item.diff);
            return { ...item, lastNewLine, lastOldLine };
          });
        return {
          state,
          changes: codeChanges,
          ref: formatByCamelCase(diffRef) as GitlabDiffRef,
        };
      })
      .catch((error) => {
        logger.error(error);
        return {
          state: '',
          changes: [],
          ref: {} as GitlabDiffRef,
        };
      });
  }
  postComment({
    newPath,
    newLine,
    oldPath,
    oldLine,
    body,
    ref,
  }: {
    newPath?: string;
    newLine?: number;
    oldPath?: string;
    oldLine?: number;
    body: string;
    ref: GitlabDiffRef;
  }) {
    /** https://docs.gitlab.com/ee/api/discussions.html#create-a-new-thread-in-the-merge-request-diff */
    return this.request
      .post(`/api/v4/projects/${this.projectId}/merge_requests/${this.mrIId}/discussions`, {
        body,
        position: {
          position_type: 'text',
          base_sha: ref?.baseSha,
          head_sha: ref?.headSha,
          start_sha: ref?.startSha,
          new_path: newPath,
          new_line: newLine,
          old_path: oldPath,
          old_line: oldLine,
        },
      })
      .catch((error) => {
        logger.error(error);
      });
  }
  async codeReview({
    change,
    message,
    ref,
  }: {
    change: GitlabChange;
    message: string;
    ref: GitlabDiffRef;
  }) {
    const { lastNewLine = -1, lastOldLine = -1, newPath, oldPath } = change;
    if (lastNewLine === -1 && lastOldLine === -1) {
      logger.error('Code line error');
      return;
    }
    const params: { oldLine?: number; oldPath?: string; newLine?: number; newPath?: string } = {};
    if (lastOldLine !== -1) {
      params.oldLine = lastOldLine;
      params.oldPath = oldPath;
    }
    if (lastNewLine !== -1) {
      params.newLine = lastNewLine;
      params.newPath = newPath;
    }
    return await this.postComment({
      ...params,
      body: message,
      ref,
    });
  }
}
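The rule for choosing position fields, described earlier and applied in `codeReview`, can be sketched as a pure function. `positionFor` and `LinePosition` are hypothetical names introduced for illustration; the field names `new_line`, `new_path`, `old_line`, and `old_path` are the ones passed to the discussions API above.

```typescript
// Line-related position fields sent to the discussions API in postComment.
interface LinePosition {
  new_path?: string;
  new_line?: number;
  old_path?: string;
  old_line?: number;
}

// Hypothetical helper: `marker` is the first character of the last diff line.
function positionFor(
  marker: '+' | '-' | ' ',
  paths: { oldPath: string; newPath: string },
  lineNumbers: { oldLine: number; newLine: number },
): LinePosition {
  const position: LinePosition = {};
  // Added lines exist only on the new side, so the old side is skipped for '+'.
  if (marker !== '+') {
    position.old_path = paths.oldPath;
    position.old_line = lineNumbers.oldLine;
  }
  // Removed lines exist only on the old side, so the new side is skipped for '-'.
  if (marker !== '-') {
    position.new_path = paths.newPath;
    position.new_line = lineNumbers.newLine;
  }
  return position;
}

const samplePaths = { oldPath: 'src/a.ts', newPath: 'src/a.ts' };
const sampleLines = { oldLine: 16, newLine: 17 };

const addedPosition = positionFor('+', samplePaths, sampleLines);
const removedPosition = positionFor('-', samplePaths, sampleLines);
const contextPosition = positionFor(' ', samplePaths, sampleLines);
console.log(addedPosition, removedPosition, contextPosition);
```

An unchanged line is the only case that carries both sides, which matches the three-way rule in the text.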
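Putting it together
- Use the Gitlab instance to fetch the Merge Request's code changes
- Use the ChatGPT instance to get the review result for the code
- Then write the review result back to the Merge Request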
async function run({
  gitlabConfig,
  chatgptConfig,
}: {
  gitlabConfig: GitlabConfig;
  chatgptConfig: ChatGPTConfig;
}) {
  const gitlab = new Gitlab(gitlabConfig);
  const chatgpt = new ChatGPT(chatgptConfig);
  const { state, changes, ref } = await gitlab.getChanges();
  // Only open MRs are reviewed; merged, closed, or failed lookups are skipped.
  if (state !== 'opened') {
    logger.log('MR is not open');
    return;
  }
  if (!chatgpt) {
    logger.log('Chat is null');
    return;
  }
  // Review the files one at a time and post each result as a comment.
  for (let i = 0; i < changes.length; i += 1) {
    const change = changes[i];
    const message = await chatgpt.codeReview(change.diff);
    const result = await gitlab.codeReview({ message, ref, change });
    logger.info(message, result?.data);
  }
}
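Review results
ChatGPT explains what the code does, and it can point out potential risks as well as code style issues. For example, when I first wrote the ChatGPT class name I accidentally typed ChatGTP, and it flagged the misspelling. It also comments on implementations it considers good. Overall, for a very large MR, letting ChatGPT review it first and then having a human step in saves a lot of effort. A picture is worth a thousand words; here is what the review looks like:
The repository for this project is hosted on the official free tier of GitLab, so it can reach ChatGPT directly; GitLab instances run by companies in mainland China will most likely still need a proxy. The code review can also be triggered from GitLab CI.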
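For the CI trigger, the job needs the GitLab host, the project ID, and the merge request IID. Here is a minimal sketch, assuming the job runs in a merge request pipeline, where GitLab's predefined variables `CI_SERVER_URL`, `CI_PROJECT_ID`, and `CI_MERGE_REQUEST_IID` are set. `GITLAB_TOKEN` and `OPENAI_API_KEY` are assumed to be CI/CD variables you define yourself, and `configFromEnv` is a hypothetical helper whose return shape follows the `gitlabConfig` and `chatgptConfig` fields used in this post.

```typescript
type Env = Record<string, string | undefined>;

interface RunConfig {
  gitlabConfig: { host: string; token: string; projectId: string; mrIId: string };
  chatgptConfig: { apiKey: string };
}

// Hypothetical helper: build the two config objects from environment variables.
function configFromEnv(env: Env): RunConfig {
  // Fail early with a clear message when a variable is missing or empty.
  const required = (key: string): string => {
    const value = env[key];
    if (!value) throw new Error(`Missing environment variable: ${key}`);
    return value;
  };
  return {
    gitlabConfig: {
      host: required('CI_SERVER_URL'),
      token: required('GITLAB_TOKEN'), // assumed user-defined CI/CD variable
      projectId: required('CI_PROJECT_ID'),
      mrIId: required('CI_MERGE_REQUEST_IID'),
    },
    chatgptConfig: {
      apiKey: required('OPENAI_API_KEY'), // assumed user-defined CI/CD variable
    },
  };
}

// Sample values for illustration; in a CI job you would pass process.env.
const sampleEnv: Env = {
  CI_SERVER_URL: 'https://gitlab.com',
  CI_PROJECT_ID: '123',
  CI_MERGE_REQUEST_IID: '7',
  GITLAB_TOKEN: 'example-token',
  OPENAI_API_KEY: 'example-key',
};

const sampleConfig = configFromEnv(sampleEnv);
console.log(sampleConfig.gitlabConfig.projectId, sampleConfig.gitlabConfig.mrIId); // 123 7
```

The result could then be passed to `run(...)` from the previous section.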
If you are interested, see the repository at github.com/ikoofe/chat… for the source code.