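NLP Models for Writing Code: Program Synthesis

In this article, we explore program synthesis through NLP models that write code, with a brief introduction to Codex, Copilot, and AlphaCode.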

By Kevin Vu

Jun. 05, 22 · AI Zone · Tutorial

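Copilot, Codex, and AlphaCode: How Good Are Computer Programs That Program Computers Now?

Driven by the rise of transformers in natural language processing (NLP), recent years have produced a string of striking deep learning models for writing code. Computer programs that write computer programs, a problem generally known as program synthesis, have attracted research interest since at least the late 1960s (pdf) and early 1970s.

In recent years and into the 2020s, program synthesis research was reinvigorated by the success of attention-based models in other sequence domains, namely the strategy of pre-training massive attention-based neural models (transformers), with millions or billions of parameters, on hundreds of gigabytes of text.

These pre-trained models, driven by their attention mechanisms, show impressive meta-learning abilities and appear to adapt to text tasks from only a few examples included in the prompt (known in the research literature as zero-shot to few-shot learning).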
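To make the idea of few-shot prompting concrete, here is a minimal sketch of how a handful of solved examples can be packed into a single prompt. The `# Task:` layout and the helper name `build_few_shot_prompt` are our own assumptions for illustration, not a format required by any particular model.

```python
def build_few_shot_prompt(examples, query):
    """Join (description, code) pairs and a final description into one prompt string."""
    parts = []
    for description, code in examples:
        parts.append(f"# Task: {description}\n{code}\n")
    # The final task is left open so the model continues with its own solution.
    parts.append(f"# Task: {query}\n")
    return "\n".join(parts)


# Two solved examples (few-shot); an empty list would make this a zero-shot prompt.
examples = [
    ("add two numbers", "def add(a, b):\n    return a + b"),
    ("square a number", "def square(x):\n    return x * x"),
]
prompt = build_few_shot_prompt(examples, "cube a number")
print(prompt)
```

No weights are updated here: the only "training signal" is the text of the examples that precedes the open task.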

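Modern Program Synthesis With Deep NLP Models

NLP models can be trained further on specialized datasets to fine-tune performance on specific tasks, and writing code is a particularly interesting use case.

GitHub Copilot, billed as "Your AI Pair Programmer," caused no small controversy when it launched in June 2021. This was largely due to the use of all public GitHub code in the training dataset, including projects under copyleft licenses, which by some interpretations may not permit their code to be used in a project like Copilot unless Copilot itself is open-sourced.

Copilot is the result of the relationship between OpenAI and Microsoft, and is based on a version of GPT-3 trained on code. The version demonstrated by OpenAI and available through its API is called Codex. For a formal description of experiments with Codex, see Chen et al., 2021.

In early 2022, not to be outdone, DeepMind upped the ante with its own deep NLP system for program synthesis: AlphaCode.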

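Here Comes a New Challenger: AlphaCode

Like Codex and Copilot before it, AlphaCode is a large NLP model designed and trained to write code. Rather than being positioned as a productivity tool for software engineers like Copilot, AlphaCode was developed to take on the challenge of human-level performance in competitive programming tasks.

The competitive coding challenges used to train and evaluate AlphaCode (which make up the new CodeContests dataset) fall somewhere between previous datasets and real-world software engineering in difficulty.

For those unfamiliar with competitive coding challenge websites, the task is a bit like a simplified form of test-driven development. Given a text description and a few examples, the challenge is to write a program that passes a set of tests, most of which are hidden from the programmer.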
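The example-tests-versus-hidden-tests setup can be sketched in a few lines. Everything below (the toy problem, the test values, and the `passes_all` helper) is hypothetical and only meant to show why passing the visible examples is not the same as solving the problem.

```python
def passes_all(solution, tests):
    """Return True if solution(input) equals the expected output for every test."""
    return all(solution(test_input) == expected for test_input, expected in tests)


# Hypothetical problem: return the sum of a list of integers.
example_tests = [([1, 2, 3], 6), ([10], 10)]                 # shown in the problem statement
hidden_tests = [([], 0), ([-5, 5], 0), ([7] * 1000, 7000)]   # not shown to the contestant


def candidate(numbers):
    return sum(numbers)


def overfit_candidate(numbers):
    # Hard-codes the visible examples, so it only looks correct.
    return {(1, 2, 3): 6, (10,): 10}.get(tuple(numbers), -1)


for name, fn in [("candidate", candidate), ("overfit_candidate", overfit_candidate)]:
    print(name, passes_all(fn, example_tests), passes_all(fn, example_tests + hidden_tests))
```

Both functions pass the examples, but only the first survives the hidden tests.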

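Ideally, the hidden tests would be comprehensive, so that passing all of them would mean the problem is solved, but covering every edge case with unit tests is a hard problem. An important contribution to the field of program synthesis is actually the CodeContests dataset itself, as the DeepMind team made a significant effort to reduce the rate of false positives (tests pass but the problem is not solved) and slow positives (tests pass but the solution is too slow) by generating additional tests through a mutation process.

AlphaCode was evaluated on solving coding challenges from the competition website CodeForces, and overall, AlphaCode performed in the "top 54.3%" of (presumably human) competitive programmers participating in contests, on average.

Note that this metric may be a bit misleading, as it actually equates to performance at the 45.7th percentile. While the 45th percentile sounds markedly less impressive, it is remarkable that the AlphaCode system was able to write any algorithm at all that passed every hidden test. AlphaCode, however, uses a very different strategy for solving programming problems than a human would.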
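The conversion between the two ways of quoting the ranking is simple arithmetic: a "top X%" ranking means the competitor outperformed the remaining (100 - X)%. A tiny sketch, with a helper name of our own choosing:

```python
def top_fraction_to_percentile(top_percent):
    """Convert a "top X%" ranking into the percentile of competitors outperformed."""
    return round(100.0 - top_percent, 1)


print(top_fraction_to_percentile(54.3))
```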

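While a human competitor might write an algorithm that solves most of the examples and iteratively improve it until it passes all the tests, incorporating insights from running earlier versions of the solution, AlphaCode takes a more broad-based approach, generating many samples for each problem and then choosing about 10 to submit.

A large contributor to AlphaCode's performance on the CodeContests dataset is post-generation filtering and clustering: after generating about 1,000,000 candidate solutions, AlphaCode filters the candidates to remove those that do not pass the example tests in the problem statement, eliminating about 99% of the candidate population.

The authors mention that for about 10% of problems, no candidate passes all of the example tests at this stage.

The remaining candidates are then winnowed down to 10 submissions or fewer by clustering. In short, they trained another model to generate additional test inputs based on the problem description (but note that they do not have valid outputs for these tests).

The remaining candidates, of which there may be thousands after filtering, are clustered according to their outputs on the generated test inputs. One candidate from each cluster is selected for submission, in order from the largest cluster to the smallest. If there are fewer than 10 clusters, the clusters are sampled multiple times.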
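The filter-then-cluster procedure described above can be sketched on a toy scale. This is our own simplified reading of the description, not DeepMind's implementation: the candidates are plain Python callables, the "generated inputs" are hand-picked, and the function names are hypothetical.

```python
from collections import defaultdict


def filter_candidates(candidates, example_tests):
    """Keep only candidates that pass every example test from the problem statement."""
    return [
        fn for fn in candidates
        if all(fn(x) == expected for x, expected in example_tests)
    ]


def cluster_by_behavior(candidates, generated_inputs):
    """Group candidates whose outputs agree on every generated input."""
    clusters = defaultdict(list)
    for fn in candidates:
        signature = tuple(fn(x) for x in generated_inputs)
        clusters[signature].append(fn)
    return sorted(clusters.values(), key=len, reverse=True)  # largest clusters first


def select_submissions(clusters, limit=10):
    """Take one candidate per cluster, largest first, cycling if clusters run out."""
    picks, round_index = [], 0
    while len(picks) < limit:
        added = False
        for cluster in clusters:
            if round_index < len(cluster) and len(picks) < limit:
                picks.append(cluster[round_index])
                added = True
        if not added:
            break
        round_index += 1
    return picks


# Toy candidates for "return the absolute value of x".
candidates = [abs, lambda x: x if x > 0 else -x, lambda x: x,
              lambda x: max(x, 0), lambda x: 3, lambda x: -x]
example_tests = [(3, 3)]        # the only example in the hypothetical problem statement
generated_inputs = [-2, 0, 5]   # no expected outputs are known for these

survivors = filter_candidates(candidates, example_tests)
clusters = cluster_by_behavior(survivors, generated_inputs)
submissions = select_submissions(clusters, limit=3)
print(len(survivors), [len(c) for c in clusters], len(submissions))
```

Note that clustering never needs the correct outputs: it only groups candidates that behave identically, on the assumption that the largest group of agreeing programs is the most likely to be right.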

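Although the filtering/clustering step is distinctive, and AlphaCode was fine-tuned on the new CodeContests dataset, it was initially trained in much the same way as Codex or Copilot. AlphaCode was first pre-trained on a large dataset of publicly available code from GitHub (retrieved on July 14, 2021). They trained five variants, ranging from 284 million to 41 billion parameters.

In the same spirit as the AlphaGo lineage or the StarCraft II-playing AlphaStar, AlphaCode is a research project aiming to develop a system approaching human-level capability on a specialized task, but the threshold for valuable utility in program synthesis is somewhat lower.

The GPT-3-based Codex and Copilot tools approach the problem from the side of practical utility. Codex is OpenAI's variant of GPT-3, trained on a corpus of publicly available code. On the HumanEval dataset, OpenAI reports that Codex solves just over 70% of problems when it generates 100 samples for a "docstring to code" task.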
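Results of the "solved with k samples" kind are usually reported as pass@k. The sketch below assumes the estimator described in Chen et al., 2021: generate n samples per problem, count the c samples that pass the unit tests, and estimate the chance that at least one of k randomly drawn samples is correct. The sample counts used in the example are made up.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k samples is correct,
    given that c of n generated samples passed the tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem: 200 samples generated, 3 of them pass the unit tests.
for k in (1, 10, 100):
    print(k, round(pass_at_k(200, 3, k), 4))
```

The score for a whole benchmark would then be the average of this quantity over all problems.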

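We'll explore this prompt-programming approach to generating code with Codex by programming John Conway's Game of Life in collaboration with the model below.

GitHub Copilot takes a code-completion approach and is currently packaged as an extension for Visual Studio, VSCode, Neovim, and JetBrains. According to the Copilot landing page, in a task similar to the HumanEval dataset, Copilot successfully rewrote a respectable 57% of a set of well-tested Python functions from their descriptions.

We'll look at some realistic use cases for Copilot, such as automatically writing tests, using the private beta Copilot extension for VSCode.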

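Prompt Programming: Writing Conway's Game of Life With Codex

In this section, we'll work through the task of programming a cellular automaton simulator based on John Conway's Game of Life. In a slight twist, the rules will not be hard-coded, and our program should be able to simulate any set of Life-like cellular automaton rules, if it works.

Rather than generating 100 examples and choosing the best one (either manually or by running tests), we'll take an interactive approach. If Codex gives a poor solution, we'll adjust the prompt to try to guide a better answer and, if absolutely necessary, we can go in and modify the code to get a working example where Codex fails completely.

The first step in programming a Life-like CA simulator is to come up with a function for computing neighborhoods. We wrote the following docstring prompt and gave it to code-davinci-001, the largest Codex model in the OpenAI API Playground: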

Python

# PROMPT

import numpy as np

def compute_neighborhood(grid: np.array) -> np.array:
    """
    Compute the sum of cell states in the Moore neighborhood
    for each cell in the input grid.
    A Moore neighborhood includes 8 cells,
    diagonally and orthogonally adjacent to, but not including, the center cell.

    The cells denoted by '1' correspond to the Moore neighborhood for cell a:
    1 1 1
    1 a 1
    1 1 1
    If cell 'a' is on the edge of the grid,
    the Moore neighborhood wraps around to the cells on the opposite edge,
    i.e. grid is on a toroid.

    args:
        grid, a numpy array with binary values 0 or 1. grid contains the current cell state values
        Returns:

        neighorhood_grid, a numpy array with integer values from 0 to 8.
        For each cell (element in grid) in grid, neighborhood_grid should
        contain the sum of states in that cell's Moore neighborhood
    """

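The lines "# PROMPT" and "# GENERATED" are included to make clear where the prompt ends.

Given the fairly comprehensive docstring prompt above, how did Codex do? Its first attempt was the following: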

Python

# GENERATED

    # TODO: implement this function
    pass

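This may not bode well for our little experiment. Even after adjusting the top-p hyperparameter to make nucleus sampling more permissive (and hopefully yield more diversity), Codex seemed stuck on the non-answer above.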
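For readers who have not met nucleus (top-p) sampling, here is a minimal sketch of the filtering step, assuming the usual definition: keep the smallest set of most-likely tokens whose cumulative probability reaches top_p, and renormalize. The probability vector and the helper name are hypothetical, and this is not OpenAI's implementation.

```python
import numpy as np


def nucleus_filter(probs, top_p):
    """Zero out tokens outside the smallest set whose cumulative probability
    reaches top_p, then renormalize the remaining probabilities."""
    order = np.argsort(probs)[::-1]               # token indices, most likely first
    cumulative = np.cumsum(probs[order])
    # Number of tokens needed for the cumulative mass to reach top_p.
    cutoff = min(int(np.searchsorted(cumulative, top_p)) + 1, len(probs))
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()


probs = np.array([0.5, 0.3, 0.15, 0.05])          # hypothetical next-token distribution
print(nucleus_filter(probs, 0.7))   # narrow nucleus: fewer tokens can be sampled
print(nucleus_filter(probs, 0.9))   # more permissive nucleus: more tokens survive
```

Raising top_p widens the pool of tokens the sampler can draw from, which is why it is a natural knob to turn when every completion looks the same.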

Fortunately, a small addition to the docstring prompt was all it took for Codex to produce a much more promising output.

Python

# PROMPT (appended to previous prompt)

    this function is completed and does not include any TODOs
    """

# GENERATED

    # get the dimensions of the grid
    n_rows, n_cols = grid.shape

    # initialize the output array
    neighborhood_grid = np.zeros((n_rows, n_cols), dtype=int)

    # loop over the grid
    for i in range(n_rows):
        for j in range(n_cols):

            # get the current cell value
            cell_val = grid[i, j]

            # get the Moore neighborhood of the current cell
            # using the get_neighborhood function
            neighborhood = get_neighborhood(grid, i, j)

            # sum the cell values in the neighborhood
            neighborhood_sum = np.sum(neighborhood)

            # store the sum in the neighborhood grid
            neighborhood_grid[i, j] = neighborhood_sum

    return neighborhood_grid

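That's much better than the first attempt, which fell somewhere between cheeky and useless.

It's not a great solution, as it introduces an unused variable, cell_val, and delegates most of its work to another function it just made up, get_neighborhood, but overall this looks like a workable start.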
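For comparison, a hand-written version of the same computation does not need any helper functions. The sketch below is our own alternative, not Codex output; it relies on np.roll wrapping values around the array edges, which matches the toroidal grid described in the prompt.

```python
import numpy as np


def compute_neighborhood_rolled(grid):
    """Sum the eight Moore neighbors of every cell on a toroidal grid."""
    total = np.zeros_like(grid, dtype=int)
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # the center cell is not part of its own neighborhood
            # np.roll wraps around the edges, which gives the toroidal behavior.
            total += np.roll(grid, shift=(d_row, d_col), axis=(0, 1))
    return total


grid = np.zeros((4, 4), dtype=int)
grid[0, 0] = 1
print(compute_neighborhood_rolled(grid))
```

A single live cell in the corner contributes to eight neighbors, three of which sit on the opposite edges of the grid.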

Next, we wrote a docstring for the get_neighborhood function mentioned above:

Python

# PROMPT

def get_neighborhood(grid: np.array, i: int, j: int) -> np.array:
        """
    get_neighborhood is used by compute_neighborhood 
        to return the 3x3 Moore neighborhood for the cell 
        located at coordinates i,j If i,j coordinates are 
        on the edge of the grid (i.e. i or j equal 0, or i == 
        grid.shape[0]-1 or j == grid.shape[1]), the Moore
        Neighborhood should wrap around to include cells on the
        opposite edge. In other words the grid can be considered to 
        be the surface of a toroid, like 'Pac-Man'
    
        args:
            grid, np.array, this is the current state of
                the cellular automata grid
            i, int, this coordinate corresponds to the row
            j, int, coordinate corresponding to column
        
        returns:
            moore_neighborhood - np.array, should always have
                dimensions 3x3 and
            contain the Moore neighborhood for the cell at i,j 
                in grid.Because the center cell is not part of the
                neighborhood, moore_neighborhood[1,1] should always 
                be 0
        """

# GENERATED

    # get the dimensions of the grid
    n_rows, n_cols = grid.shape
    
    # initialize the output array
    moore_neighborhood = np.zeros((3, 3), dtype=int)
    
    # get the coordinates of the Moore neighborhood
    # using the get_neighborhood_coordinates function
    neighborhood_coords = \
get_neighborhood_coordinates(i, j, n_rows, n_cols)
    
    # loop over the coordinates
    for coord in neighborhood_coords:
        # get the cell value at the coordinate
        cell_val = grid[coord[0], coord[1]]
        
        # store the cell value in the output array
        moore_neighborhood[coord[0], coord[1]] = cell_val
        
    return moore_neighborhood

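This output also looks workable, but on inspection it actually contains an important bug.

In the loop over the neighborhood coordinates, it uses the same coordinates to assign values to the Moore neighborhood as it does to retrieve them from the grid.

We didn't see a clear way to prompt Codex to avoid the mistake, so we modified the code manually: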

Python

    # FIXED loop over the coordinates                                                 
    for index, coord in enumerate(neighborhood_coords):                         
        # get the cell value at the coordinate                                  
        cell_val = grid[coord[0], coord[1]]                                     
                                                                                
        # FIXED store the cell value in the output array                              
        moore_neighborhood[index // 3, index % 3] = cell_val                    
                                                                                
    return moore_neighborhood

It also introduced another function, get_neighborhood_coordinates, to handle the "tricky" part.

We were starting to get the sense that Codex likes to delegate, so next we wrote a prompt for get_neighborhood_coordinates:

Python

# PROMPT 

def get_neighborhood_coordinates(\
            i: int, j: int, n_rows: int, n_cols: int) -> list:
    """
    get_neighborhood_coordinates is used by get_neighborhood 
    to return the coordinates of the 3x3 Moore neighborhood for
    the cell located at coordinates i,j If i,j coordinates are on
    the edge of the grid (i.e. i or j equal 0, or i ==    
    grid.shape[0]-1 or j == grid.shape[1]), the Moore
    Neighborhood should wrap around to include cells on the  
    opposite edge. In other words the grid can be considered to 
    be the surface of a toroid, like 'Pac-Man'
    
    args:
        i, int, this coordinate corresponds to the row
        j, int, coordinate corresponding to column
        n_rows, int, number of rows in the grid
        n_cols, int, number of columns in the grid
        
    returns:
        neighborhood_coords - list of tuples, 
            each tuple contains the coordinates
            of a cell in the Moore neighborhood
    """

# GENERATED

    # initialize the output list
    neighborhood_coords = []
    
    # get the coordinates of the Moore neighborhood
    # using the get_neighborhood_coordinates function
    for row in range(i-1, i+2):
        for col in range(j-1, j+2):
            # skip the center cell
            if row == i and col == j:
                continue
                
            # check if the cell is on the edge of the grid
            # if so, wrap around to the opposite edge
            if row < 0:
                row = n_rows - 1
            elif row == n_rows:
                row = 0
                
            if col < 0:
                col = n_cols - 1
            elif col == n_cols:
                col = 0
                
            # store the cell coordinates
            neighborhood_coords.append((row, col))
            
    return neighborhood_coords

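That's a bit deeper functional nesting than we expected (compute_neighborhood calls get_neighborhood, which in turn calls get_neighborhood_coordinates), but it looks like we finally have a set of functions that returns a grid of neighborhood sums.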
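The edge-wrapping logic in the generated code uses explicit if/elif checks. A more compact way to express the same idea, shown here as our own sketch rather than anything Codex produced, is the modulo operator, which in Python always returns a non-negative result for a positive modulus.

```python
def wrapped_moore_coordinates(i, j, n_rows, n_cols):
    """Return the 8 Moore-neighborhood coordinates of (i, j) on a toroidal grid."""
    return [
        ((i + d_row) % n_rows, (j + d_col) % n_cols)
        for d_row in (-1, 0, 1)
        for d_col in (-1, 0, 1)
        if (d_row, d_col) != (0, 0)
    ]


# Corner cell of a 4x5 grid: three neighbors wrap to the last row, three to the last column.
print(wrapped_moore_coordinates(0, 0, 4, 5))
```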

That just leaves the update function, with the docstring prompt shown below:

Python

# PROMPT

def update_grid(grid: np.array, rules: list) -> np.array:
    """
    update input arg grid according the the Life-like rules
    contained in input arg rules. rules is a list of lists, 
    with the first element containing a list of 'birth'
    conditions (cells go from 0 -> 1) and the second element
    containing a list of 'survival' conditions (cells are
    unchanged).
    Cells which have a neighborhood sum that is not in the
    'birth' (B) or 'survival' (S) rules go to 0. 
    
    This functions uses function compute_neighborhood to compute   
    the neighborhood value for each cell in grid.
    
    args:
        grid, np.array. grid contains the cell values (0 or 1)
           rules, list of lists. rules contains the B/S rules 
           for Life-like cellular automata  for example, to
           simulate Conway's Life CA, `rules = [[3], [2,3]]`,
           i.e. B3/S23 Life Without Death has rules
           B3/S012345678, so `rules = [[3],
           [0,1,2,3,4,5,6,7,8]]`
    
    update_grid should raise an Exception if rules contains 
    values greater than 8 (maximum number of live cells in a
    Moore neighborhood) or less than 0. 
    """

# GENERATED

    # get the dimensions of the grid
    n_rows, n_cols = grid.shape
    
    # get the neighborhood grid
    neighborhood_grid = compute_neighborhood(grid)
    
    # loop over the grid
    for i in range(n_rows):
        for j in range(n_cols):
            # get the current cell value
            cell_val = grid[i, j]
            
            # get the neighborhood value for the current cell
            neighborhood_val = neighborhood_grid[i, j]
            
            # check if the cell is alive
            if cell_val == 1:
                # check if the cell should die
                if neighborhood_val not in rules[1]:
                    grid[i, j] = 0
            else:
                # check if the cell should be born
                if neighborhood_val in rules[0]:
                    grid[i, j] = 1
                    
    return grid

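This looks like a pretty reasonable way to approach the problem, although we did explicitly ask for an exception to be raised if rules contains out-of-range values, and that check is nowhere to be found in the generated output.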
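The missing check is short to write by hand. The following is a minimal sketch of what the prompt asked for; the function name validate_rules and the choice of ValueError are our assumptions, since the prompt only says "raise an Exception."

```python
def validate_rules(rules):
    """Raise ValueError if any birth/survival value is outside 0..8."""
    if len(rules) != 2:
        raise ValueError("rules must contain a birth list and a survival list")
    for name, values in zip(("birth", "survival"), rules):
        for value in values:
            if not 0 <= value <= 8:
                raise ValueError(f"{name} rule {value} is outside the range 0 to 8")


validate_rules([[3], [2, 3]])          # Conway's Life, B3/S23: no error
try:
    validate_rules([[3], [2, 9]])      # 9 live neighbors is impossible in a Moore neighborhood
except ValueError as error:
    print(error)
```

Calling this at the top of update_grid would give the behavior described in the docstring prompt.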

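With only a few corrections, namely manual intervention in the get_neighborhood function and a few tries on some of the prompts, we managed to come up with a perfectly workable Life-like cellular automaton simulator.

It's not a particularly fast implementation, but it is roughly similar in quality to the sort of "Hello World" attempt a programmer might make when getting started with a new language, many examples of which were undoubtedly included in the training dataset.

We can visualize the success of the program in the progression of the minimal glider in Conway's Game of Life.

Figure: The minimal glider in Conway's Game of Life.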
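The glider also makes a convenient automated check, because it reappears one cell further along the diagonal every four generations. The sketch below uses a compact hand-written update step (our own stand-in, not the Codex-generated functions) with the same rules format as the prompt, `[[birth], [survival]]`.

```python
import numpy as np


def life_like_step(grid, rules):
    """Advance a toroidal binary grid one generation under B/S rules."""
    birth, survival = rules
    neighbors = sum(
        np.roll(grid, shift=(d_row, d_col), axis=(0, 1))
        for d_row in (-1, 0, 1)
        for d_col in (-1, 0, 1)
        if (d_row, d_col) != (0, 0)
    )
    born = (grid == 0) & np.isin(neighbors, birth)
    survives = (grid == 1) & np.isin(neighbors, survival)
    return (born | survives).astype(int)


glider = np.zeros((8, 8), dtype=int)
glider[1, 2] = glider[2, 3] = glider[3, 1] = glider[3, 2] = glider[3, 3] = 1

grid = glider.copy()
for _ in range(4):
    grid = life_like_step(grid, rules=[[3], [2, 3]])  # B3/S23, Conway's Life

# After four generations the glider reappears shifted by one row and one column.
print(np.array_equal(grid, np.roll(glider, shift=(1, 1), axis=(0, 1))))
```

The same check could be pointed at the generated update_grid function to confirm it behaves like the reference step.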

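While we did manage to write a CA simulator in a set of functions, this approach is not a very useful or realistic use case for everyday software engineering. That doesn't stop startups like SourceAI, essentially a wrapper for the OpenAI Codex API, from advertising their services as aiming to "...give everyone the opportunity to create valuable customized software. We have built a self-contained system that can create software at the level of the world's most skilled engineers." Interacting with Codex is, however, a potentially useful way to learn or practice programming, especially at the level of coding problems on websites like CodeSignal, CodeForces, or HackerRank.

We'll next try to assess Codex/Copilot on more realistic use cases: automatically writing tests and docstrings.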

Task 2: Writing Tests

For this example, we'll turn to using GitHub Copilot via the VSCode extension.

Python

# PROMPT (in VSCode)

import numpy as np

def pythagorean(a: np.float, b: np.float) -> np.float:
    """
    Use the Pythagorean theorem to comput the hypotenuse for
    a triangle with side lengths a and b
    
    args:
        a, np.float, side of the triangle
        b, np.float, the other side of the triangle
    
    returns:
        c, np.float, the length of the hypotenuse
    """

    c = np.sqrt(a**2 + b**2)

    return c


def test_pythagorean():

# GENERATED (Copilot autocomplete)
    """
    Test the pythagorean function
    """

    a = np.array([3, 4, 5])
    b = np.array([4, 5, 6])

    c = pythagorean(a, b)

    assert np.allclose(c, np.sqrt(a**2 + b**2))

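While the Pythagorean theorem function may be too simple, Copilot suggested a reasonable test, and if you run it, it passes. The autocomplete suggestion got both the structure and the numerical content of the test right.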
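One caveat is that the suggested test compares the function against the same expression the function itself uses, so it would still pass if both shared a mistake. A slightly stronger variant, sketched here by hand, checks against independently known values. The type annotations are omitted because the np.float alias has been removed from recent NumPy releases.

```python
import numpy as np


def pythagorean(a, b):
    """Return the hypotenuse of a right triangle with legs a and b."""
    return np.sqrt(a**2 + b**2)


def test_pythagorean_known_triples():
    # Expected values come from well-known Pythagorean triples,
    # not from re-evaluating the formula under test.
    legs_a = np.array([3.0, 5.0, 8.0])
    legs_b = np.array([4.0, 12.0, 15.0])
    expected = np.array([5.0, 13.0, 17.0])
    assert np.allclose(pythagorean(legs_a, legs_b), expected)


test_pythagorean_known_triples()
```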

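What if we want to write tests in a more systematic way, using a preferred framework? We write plenty of low-level learning models using numpy and automatic differentiation, so while the next example isn't 100% real-world, it's reasonably close.

In this example, we'll set up a simple multilayer perceptron forward pass, loss function, and gradient function using autograd and numpy, and test them with the TestCase class from unittest.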

Python

#PROMPT

import unittest

from autograd import numpy as np
from autograd import grad

def forward_mlp(input_x: np.array, \
            weights: list, biases: list) -> np.array:
    """
    compute the forward pass of a multilayer perceptron. 
    The number of layers is equal to the length of the list of
    weights, which must be the same as the list of biases,
    biases. 

    args:
        input_x, np.array, input data
        weights, list of np.arrays, a list of np.array matrices,
             representing the weights
        biases: list of np.arrays, a list of biases for each
             layer
    
    returns:  
        result, np.array, the output of the network
    """

    assert len(weights) == len(biases)

    for layer_index in range(len(weights) - 1):
        input_x = np.tanh(np.matmul(input_x,\
                weights[layer_index]) + biases[layer_index])

    output = np.matmul(input_x, weights[-1]) + biases[-1]

    return output

def get_loss(input_x: np.array, weights: list, \
                biases: list, target: np.array) -> np.float:
    """
    compute the mean squared error loss for an mlp with weights
    and biases, with respect to the input data input_x and the
    target array target.

    args:
        input_x, np.array, input data
        weights, list of np.arrays, a list of np.array matrices,
            representing the weights
        biases: list of np.arrays, a list of biases for each
            layer
        target: np.array, the target values
    
    returns:  
        loss, np.float, the loss
    """

    output = forward_mlp(input_x, weights, biases)

    return np.mean((output - target)**2)

get_loss_grad = grad(get_loss, argnum=(1,2))


class TestMLP(unittest.TestCase):
    """
    tests for MLP, get_loss, and get_grad_loss
    """
    # GENERATED (chosen from Copilot suggestions)
    def setUp(self):
        pass

    def test_forward_mlp(self):
        """
        test forward_mlp
        """

        input_x = np.array([[1, 2, 3], [4, 5, 6]])
        weights = [np.array([[1, 2], [3, 4]]), \
                np.array([[5, 6], [7, 8]]), \
                np.array([[9, 10], [11, 12]])]
        biases = [np.array([[1, 2]]), np.array([[3, 4]]),
                np.array([[5, 6]])]

        output = forward_mlp(input_x, weights, biases)

        self.assertTrue(np.allclose(output, \
                np.array([[-0.9, -0.9], [-0.9, -0.9]])))

    def test_get_loss(self):
        """
        test get_loss
        """

        input_x = np.array([[1, 2, 3], [4, 5, 6]])
        weights = [np.array([[1, 2], [3, 4]]), \
                np.array([[5, 6], [7, 8]]), \
                np.array([[9, 10], [11, 12]])]
        biases = [np.array([[1, 2]]), np.array([[3, 4]]),\
                np.array([[5, 6]])]
        target = np.array([[1, 1], [1, 1]])

        loss = get_loss(input_x, weights, biases, target)

        self.assertTrue(np.allclose(loss, 0.5))
    
    def test_get_loss_grad(self):
        """
        test get_loss_grad
        """

        input_x = np.array([[1, 2, 3], [4, 5, 6]])
        weights = [np.array([[1, 2], [3, 4]]), \
                np.array([[5, 6], [7, 8]]), \
                np.array([[9, 10], [11, 12]])]
        biases = [np.array([[1, 2]]), np.array([[3, 4]]), \
                np.array([[5, 6]])]
        target = np.array([[1, 1], [1, 1]])

        loss_grad = get_loss_grad(\
                input_x, weights, biases, target)

        self.assertTrue(\
                np.allclose(loss_grad[0], \
                np.array([[-0.9, -0.9], [-0.9, -0.9]])))
        self.assertTrue(\
                np.allclose(loss_grad[1], \
                np.array([[-0.9, -0.9], [-0.9, -0.9]])))
        self.assertTrue(\
                np.allclose(loss_grad[2],\
                np.array([[-0.9, -0.9], [-0.9, -0.9]])))

# END GENERATED (the final two lines are part of the prompt)
if __name__ == "__main__":
    unittest.main(verbosity=1)

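While not perfect, Copilot's suggestions do provide a reasonable outline for a test class. If you try to run the code as is, however, none of the tests will execute, let alone pass.

There is a dimension mismatch between the input data and the first weight matrix, and the data types are wrong (all the arrays have integer dtypes), so they won't work with Autograd's grad function.

These problems aren't too tricky to fix. If you replace the first entry in the list of weight matrices with a 3x2 matrix, the forward pass should be able to run. For the grad test to go through, you'll need to either add decimal points to the numbers in the np.array definitions or define the array data types explicitly.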
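A minimal sketch of those two fixes is shown below. It uses plain NumPy in place of autograd.numpy so that it runs without extra dependencies, and the third row of the first weight matrix, [5.0, 6.0], is a value we made up to reach the 3x2 shape.

```python
import numpy as np  # plain NumPy stands in for autograd.numpy in this sketch


def forward_mlp(input_x, weights, biases):
    """Forward pass matching the article's MLP: tanh hidden layers, linear output."""
    assert len(weights) == len(biases)
    for layer_index in range(len(weights) - 1):
        input_x = np.tanh(np.matmul(input_x, weights[layer_index]) + biases[layer_index])
    return np.matmul(input_x, weights[-1]) + biases[-1]


input_x = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])      # shape (2, 3), float dtype
weights = [
    np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]),         # 3x2 so it matches input_x
    np.array([[5.0, 6.0], [7.0, 8.0]]),
    np.array([[9.0, 10.0], [11.0, 12.0]]),
]
biases = [np.array([[1.0, 2.0]]), np.array([[3.0, 4.0]]), np.array([[5.0, 6.0]])]

output = forward_mlp(input_x, weights, biases)
print(output.shape, output.dtype)
print(output)
```

With weights this large the tanh layers saturate, so the output is nowhere near the -0.9 values in the suggested assertions.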

With those changes, the tests can execute and fail successfully, but the expected values are not numerically correct.
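One way to obtain trustworthy expected values for a gradient test is to compare against a finite-difference estimate instead of hard-coded numbers. The sketch below illustrates the idea on a single linear layer with a mean squared error loss; the closed-form analytic_grad is a stand-in for what autograd's grad would return, and all names and data here are hypothetical.

```python
import numpy as np


def mse_loss(weight, input_x, target):
    """Mean squared error of a single linear layer (a stand-in for get_loss)."""
    return np.mean((np.matmul(input_x, weight) - target) ** 2)


def analytic_grad(weight, input_x, target):
    """Closed-form gradient of mse_loss with respect to weight."""
    residual = np.matmul(input_x, weight) - target
    return 2.0 * np.matmul(input_x.T, residual) / residual.size


def finite_difference_grad(fn, weight, eps=1e-6):
    """Central-difference estimate of d fn / d weight, one element at a time."""
    grad = np.zeros_like(weight)
    for index in np.ndindex(weight.shape):
        step = np.zeros_like(weight)
        step[index] = eps
        grad[index] = (fn(weight + step) - fn(weight - step)) / (2.0 * eps)
    return grad


rng = np.random.default_rng(0)
input_x = rng.normal(size=(4, 3))
target = rng.normal(size=(4, 2))
weight = rng.normal(size=(3, 2))

numeric = finite_difference_grad(lambda w: mse_loss(w, input_x, target), weight)
print(np.allclose(numeric, analytic_grad(weight, input_x, target), atol=1e-6))
```

In the unittest class above, the same comparison could replace the placeholder arrays inside the assertTrue calls.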

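Task 3: Automatic Docstrings

One task where Copilot has a lot of potential is automatically writing documentation, specifically in the form of filling in docstrings for functions that have already been written. This almost works.

For the Pythagorean theorem example, it gets pretty close, but it describes the problem as finding the distance between two points a and b, rather than finding the hypotenuse c from side lengths a and b. Unsurprisingly, the example Copilot comes up with in the docstring also doesn't match the actual behavior of the function, returning a scalar instead of an array of values for c.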
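Mismatches between a docstring example and what the function really returns are exactly what Python's doctest module is designed to catch. Here is a minimal sketch, using a hand-written docstring and math.sqrt rather than the Copilot suggestion, that runs the docstring examples against the function.

```python
import doctest
import math


def pythagorean(a, b):
    """
    Return the hypotenuse of a right triangle with side lengths a and b.

    >>> pythagorean(3.0, 4.0)
    5.0
    >>> pythagorean(5.0, 12.0)
    13.0
    """
    return math.sqrt(a**2 + b**2)


# Parse the examples out of the docstring and execute them against the function.
test = doctest.DocTestParser().get_doctest(
    pythagorean.__doc__, {"pythagorean": pythagorean}, "pythagorean", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

Running suggested docstrings through a check like this is a cheap way to catch examples that merely look plausible.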

Copilot's suggestions for the docstring of the forward MLP function were also close, but not quite right.

Figure: Docstring suggested by Copilot.

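Can a Machine Take My Job Yet?

For software engineers, each new advance in program synthesis may elicit a mix of economic trepidation and cathartic relief.

After all, if computer programs can program computers almost as well as human computer programmers, doesn't that mean machines should "take our jobs" someday soon?

By all appearances, the answer seems to be "not yet," but that doesn't mean the nature of software engineering is likely to remain static as these tools mature. In the future, reasoning successfully with sophisticated autocomplete tools may be as important as using a linter.

Copilot is still in beta, with limited options for how to use it. Codex, likewise, is available in beta through the OpenAI API. The terms of use and privacy considerations of the pilot (heh) programs do limit the potential use cases of the technology.

Under current privacy policies, any code entered into these systems may be used for fine-tuning the models and can be reviewed by staff at GitHub/Microsoft or OpenAI. That precludes using Codex or Copilot for sensitive projects.

Copilot does add a lot of utility to the Codex model it's based on. You can write a framework or outline for the code you want (as in the test-writing example with the unittest framework) and move the cursor to the middle of the outline to get reasonably good autocomplete suggestions.

For anything more complicated than a simple coding practice problem, it's unlikely to suggest a correct completion, but it can often create a reasonable outline and save some typing.

It should also be noted that Copilot runs in the cloud, which means it won't work offline, and the autocomplete suggestions are somewhat slow. You can cycle through suggestions by pressing alt + ], but sometimes there are only a few suggestions, or even just one, to choose from.

When it works well, it's actually good enough to be a little dangerous. The tests suggested in the unittest example and the docstring suggested for the Pythagorean function look correct at first glance and might pass the scrutiny of a tired software engineer, but when they contain cryptic mistakes, this can only lead to pain later.

All things considered, while Copilot/Codex is more of a toy or a learning tool in its current state, it's incredible that it works at all. If you come across a waltzing bear, the impressive thing isn't that the bear dances well, and if you come across an intelligent code completion tool, the impressive thing isn't that it writes perfect code.

It's likely that with further development of the technology, and a fair amount of adaptation on the part of human developers using NLP autocomplete tools, there will be significant killer applications for program synthesis models in the near future.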

Published at DZone with permission of Kevin Vu.
