CHATGPT IS A BLURRY JPEG OF THE WEB

By Ted Chiang

February 9, 2023

OpenAI’s chatbot offers paraphrases, whereas Google offers quotes. Which do we prefer?

In 2013, workers at a German construction company noticed something odd about their Xerox photocopier: when they made a copy of the floor plan of a house, the copy differed from the original in a subtle but significant way. In the original floor plan, each of the house’s three rooms was accompanied by a rectangle specifying its area: the rooms were 14.13, 21.11, and 17.42 square metres, respectively. However, in the photocopy, all three rooms were labelled as being 14.13 square metres in size. The company contacted the computer scientist David Kriesel to investigate this seemingly inconceivable result. They needed a computer scientist because a modern Xerox photocopier doesn’t use the physical xerographic process popularized in the nineteen-sixties. Instead, it scans the document digitally, and then prints the resulting image file. Combine that with the fact that virtually every digital image file is compressed to save space, and a solution to the mystery begins to suggest itself.

Compressing a file requires two steps: first, the encoding, during which the file is converted into a more compact format, and then the decoding, whereby the process is reversed. If the restored file is identical to the original, then the compression process is described as lossless: no information has been discarded. By contrast, if the restored file is only an approximation of the original, the compression is described as lossy: some information has been discarded and is now unrecoverable. Lossless compression is what’s typically used for text files and computer programs, because those are domains in which even a single incorrect character has the potential to be disastrous. Lossy compression is often used for photos, audio, and video in situations in which absolute accuracy isn’t essential. Most of the time, we don’t notice if a picture, song, or movie isn’t perfectly reproduced. The loss in fidelity becomes more perceptible only as files are squeezed very tightly. In those cases, we notice what are known as compression artifacts: the fuzziness of the smallest jpeg and mpeg images, or the tinny sound of low-bit-rate MP3s.

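The two-step round trip is easy to sketch in Python. Here zlib stands in for a lossless codec, and the crude keep-one-sample-in-four scheme is only a toy stand-in for lossy compression, not how JPEG or MP3 actually work:

```python
import zlib

text = b"the quick brown fox jumps over the lazy dog " * 100

# Lossless: decoding recovers the original exactly.
compressed = zlib.compress(text)
assert zlib.decompress(compressed) == text
assert len(compressed) < len(text)      # and the file got smaller

# Lossy (a toy stand-in): the encoder keeps one sample in four,
# and the decoder approximates by repeating each kept sample.
signal = list(range(16))
kept = signal[::4]                      # three of every four samples discarded
restored = [v for v in kept for _ in range(4)]
assert restored != signal               # that information is gone for good
```

The gap between `restored` and `signal` is exactly the kind of error that surfaces as a compression artifact once a file is squeezed too tightly.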

Xerox photocopiers use a lossy compression format known as jbig2, designed for use with black-and-white images. To save space, the copier identifies similar-looking regions in the image and stores a single copy for all of them; when the file is decompressed, it uses that copy repeatedly to reconstruct the image. It turned out that the photocopier had judged the labels specifying the area of the rooms to be similar enough that it needed to store only one of them—14.13—and it reused that one for all three rooms when printing the floor plan.

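JBIG2's pattern matching is far more sophisticated than this, but the core move (store one template for every region judged "similar enough," then stamp that template back everywhere on decode) can be sketched in a few lines. The labels and the similarity threshold below are invented for illustration:

```python
# Toy JBIG2-style encoder: patches within a small Hamming distance
# share one stored template; the decoder stamps templates back in place.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def encode(patches, threshold=1):
    templates, refs = [], []
    for p in patches:
        for i, t in enumerate(templates):
            if hamming(p, t) <= threshold:
                refs.append(i)                  # "similar enough": reuse template i
                break
        else:
            templates.append(p)                 # genuinely new: store it
            refs.append(len(templates) - 1)
    return templates, refs

def decode(templates, refs):
    return [templates[i] for i in refs]

# Three room labels; the encoder sees the first two as near-identical.
labels = ["14.13", "14.18", "17.42"]
templates, refs = encode(labels)
print(decode(templates, refs))                  # → ['14.13', '14.13', '17.42']
```

The reconstructed labels are perfectly readable; they are just wrong, which is precisely what made the photocopier's output so insidious.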

The fact that Xerox photocopiers use a lossy compression format instead of a lossless one isn’t, in itself, a problem. The problem is that the photocopiers were degrading the image in a subtle way, in which the compression artifacts weren’t immediately recognizable. If the photocopier simply produced blurry printouts, everyone would know that they weren’t accurate reproductions of the originals. What led to problems was the fact that the photocopier was producing numbers that were readable but incorrect; it made the copies seem accurate when they weren’t. (In 2014, Xerox released a patch to correct this issue.)

I think that this incident with the Xerox photocopier is worth bearing in mind today, as we consider OpenAI’s ChatGPT and other similar programs, which A.I. researchers call large-language models. The resemblance between a photocopier and a large-language model might not be immediately apparent—but consider the following scenario. Imagine that you’re about to lose your access to the Internet forever. In preparation, you plan to create a compressed copy of all the text on the Web, so that you can store it on a private server. Unfortunately, your private server has only one per cent of the space needed; you can’t use a lossless compression algorithm if you want everything to fit. Instead, you write a lossy algorithm that identifies statistical regularities in the text and stores them in a specialized file format. Because you have virtually unlimited computational power to throw at this task, your algorithm can identify extraordinarily nuanced statistical regularities, and this allows you to achieve the desired compression ratio of a hundred to one.

Now, losing your Internet access isn’t quite so terrible; you’ve got all the information on the Web stored on your server. The only catch is that, because the text has been so highly compressed, you can’t look for information by searching for an exact quote; you’ll never get an exact match, because the words aren’t what’s being stored. To solve this problem, you create an interface that accepts queries in the form of questions and responds with answers that convey the gist of what you have on your server.

What I’ve described sounds a lot like ChatGPT, or most any other large-language model. Think of ChatGPT as a blurry jpeg of all the text on the Web. It retains much of the information on the Web, in the same way that a jpeg retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry jpeg, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.

This analogy to lossy compression is not just a way to understand ChatGPT’s facility at repackaging information found on the Web by using different words. It’s also a way to understand the “hallucinations,” or nonsensical answers to factual questions, to which large-language models such as ChatGPT are all too prone. These hallucinations are compression artifacts, but—like the incorrect labels generated by the Xerox photocopier—they are plausible enough that identifying them requires comparing them against the originals, which in this case means either the Web or our own knowledge of the world. When we think about them this way, such hallucinations are anything but surprising; if a compression algorithm is designed to reconstruct text after ninety-nine per cent of the original has been discarded, we should expect that significant portions of what it generates will be entirely fabricated.

This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation—that is, estimating what’s missing by looking at what’s on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it’s prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in “lexical space” and generating the text that would occupy the location between them. (“When in the Course of human events, it becomes necessary for one to separate his garments from their mates, in order to maintain the cleanliness and order thereof. . . .”) ChatGPT is so good at this form of interpolation that people find it entertaining: they’ve discovered a “blur” tool for paragraphs instead of photos, and are having a blast playing with it.

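The pixel-averaging step is one line of arithmetic. The sketch below is a deliberately minimal decoder: a lost sample, marked `None`, is filled in with the mean of its two surviving neighbours (the sample values are invented):

```python
# Reconstructing lost samples by averaging their surviving neighbours,
# the way a decoder for a lossy format might interpolate a missing pixel.
row = [100, 110, None, 130, 140]        # None marks a sample lost to compression

def interpolate(samples):
    out = list(samples)
    for i, v in enumerate(out):
        if v is None:
            out[i] = (out[i - 1] + out[i + 1]) // 2
    return out

print(interpolate(row))                 # → [100, 110, 120, 130, 140]
```

The interpolated 120 was never in the data; it is a plausible guess that happens to fit, which is what makes interpolation feel seamless.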

Given that large-language models like ChatGPT are often extolled as the cutting edge of artificial intelligence, it may sound dismissive—or at least deflating—to describe them as lossy text-compression algorithms. I do think that this perspective offers a useful corrective to the tendency to anthropomorphize large-language models, but there is another aspect to the compression analogy that is worth considering. Since 2006, an A.I. researcher named Marcus Hutter has offered a cash reward—known as the Prize for Compressing Human Knowledge, or the Hutter Prize—to anyone who can losslessly compress a specific one-gigabyte snapshot of Wikipedia smaller than the previous prize-winner did. You have probably encountered files compressed using the zip file format. The zip format reduces Hutter’s one-gigabyte file to about three hundred megabytes; the most recent prize-winner has managed to reduce it to a hundred and fifteen megabytes. This isn’t just an exercise in smooshing. Hutter believes that better text compression will be instrumental in the creation of human-level artificial intelligence, in part because the greatest degree of compression can be achieved by understanding the text.

To grasp the proposed relationship between compression and understanding, imagine that you have a text file containing a million examples of addition, subtraction, multiplication, and division. Although any compression algorithm could reduce the size of this file, the way to achieve the greatest compression ratio would probably be to derive the principles of arithmetic and then write the code for a calculator program. Using a calculator, you could perfectly reconstruct not just the million examples in the file but any other example of arithmetic that you might encounter in the future. The same logic applies to the problem of compressing a slice of Wikipedia. If a compression program knows that force equals mass times acceleration, it can discard a lot of words when compressing the pages about physics because it will be able to reconstruct them. Likewise, the more the program knows about supply and demand, the more words it can discard when compressing the pages about economics, and so forth.

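The thought experiment can be made concrete. Rather than storing a million lines like "245 + 821 = 1066", the best compressor stores only the operands plus a few lines of calculator code that regenerate every answer, and any future answer, on demand (a thousand examples stand in for the million here):

```python
import random

# The "file": worked examples of addition.
random.seed(0)
pairs = [(random.randint(0, 999), random.randint(0, 999)) for _ in range(1000)]
file_lines = [f"{a} + {b} = {a + b}" for a, b in pairs]

# The "compressed" form: the operands plus the rule itself.
def calculator(a, b):
    return a + b                        # the derived principle of addition

# Decoding reconstructs the file perfectly, i.e. losslessly...
reconstructed = [f"{a} + {b} = {calculator(a, b)}" for a, b in pairs]
assert reconstructed == file_lines

# ...and, unlike a stored table of answers, it generalizes to unseen examples.
assert calculator(245, 821) == 1066
```

Storing the rule instead of the outputs is, in miniature, what Hutter's argument means by achieving compression through understanding.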

Large-language models identify statistical regularities in text. Any analysis of the text of the Web will reveal that phrases like “supply is low” often appear in close proximity to phrases like “prices rise.” A chatbot that incorporates this correlation might, when asked a question about the effect of supply shortages, respond with an answer about prices increasing. If a large-language model has compiled a vast number of correlations between economic terms—so many that it can offer plausible responses to a wide variety of questions—should we say that it actually understands economic theory? Models like ChatGPT aren’t eligible for the Hutter Prize for a variety of reasons, one of which is that they don’t reconstruct the original text precisely—i.e., they don’t perform lossless compression. But is it possible that their lossy compression nonetheless indicates real understanding of the sort that A.I. researchers are interested in?

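A drastically simplified version of "incorporating this correlation" is just counting which word follows which. The three-sentence corpus below is invented, and real models learn vastly richer statistics than bigram counts, but the flavour is the same:

```python
from collections import Counter

corpus = [
    "when supply is low prices rise",
    "supply is low so prices rise sharply",
    "demand falls and prices drop",
]

# Count which word follows which (a bigram model: the simplest
# statistical regularity a language model can pick up).
bigrams = Counter()
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        bigrams[(a, b)] += 1

# "Answer" by following the most frequent continuation.
def continue_word(word):
    candidates = [(n, b) for (a, b), n in bigrams.items() if a == word]
    return max(candidates)[1] if candidates else None

print(continue_word("prices"))          # → 'rise'
```

The model "knows" that prices rise only in the sense that those tokens co-occurred in its corpus; whether that counts as understanding is exactly the question at issue here.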

Let’s go back to the example of arithmetic. If you ask GPT-3 (the large-language model that ChatGPT was built from) to add or subtract a pair of numbers, it almost always responds with the correct answer when the numbers have only two digits. But its accuracy worsens significantly with larger numbers, falling to ten per cent when the numbers have five digits. Most of the correct answers that GPT-3 gives are not found on the Web—there aren’t many Web pages that contain the text “245 + 821,” for example—so it’s not engaged in simple memorization. But, despite ingesting a vast amount of information, it hasn’t been able to derive the principles of arithmetic, either. A close examination of GPT-3’s incorrect answers suggests that it doesn’t carry the “1” when performing arithmetic. The Web certainly contains explanations of carrying the “1,” but GPT-3 isn’t able to incorporate those explanations. GPT-3’s statistical analysis of examples of arithmetic enables it to produce a superficial approximation of the real thing, but no more than that.

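That GPT-3's errors come down to a dropped carry is an inference from its wrong answers, but the mechanism itself is easy to state in code: column-by-column addition only works if each column's overflow is propagated to the next. Both functions below are illustrative sketches, not a claim about how the model computes:

```python
# Digit-wise addition with and without propagating the carry.
def add_with_carry(a, b):
    da, db = str(a)[::-1], str(b)[::-1]
    result, carry = [], 0
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        carry, digit = divmod(x + y + carry, 10)
        result.append(str(digit))
    if carry:
        result.append(str(carry))
    return int("".join(reversed(result)))

def add_dropping_carry(a, b):
    # The plausible-but-wrong version: each column computed independently.
    da, db = str(a)[::-1], str(b)[::-1]
    result = []
    for i in range(max(len(da), len(db))):
        x = int(da[i]) if i < len(da) else 0
        y = int(db[i]) if i < len(db) else 0
        result.append(str((x + y) % 10))    # overflow silently discarded
    return int("".join(reversed(result)))

print(add_with_carry(245, 821))         # → 1066
print(add_dropping_carry(245, 821))     # → 66
```

The no-carry answer is still made of readable digits, which makes it plausible at a glance; like the photocopier's 14.13, it is wrong in a way that doesn't look wrong.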

Given GPT-3’s failure at a subject taught in elementary school, how can we explain the fact that it sometimes appears to perform well at writing college-level essays? Even though large-language models often hallucinate, when they’re lucid they sound like they actually understand subjects like economic theory. Perhaps arithmetic is a special case, one for which large-language models are poorly suited. Is it possible that, in areas outside addition and subtraction, statistical regularities in text actually do correspond to genuine knowledge of the real world?

I think there’s a simpler explanation. Imagine what it would look like if ChatGPT were a lossless algorithm. If that were the case, it would always answer questions by providing a verbatim quote from a relevant Web page. We would probably regard the software as only a slight improvement over a conventional search engine, and be less impressed by it. The fact that ChatGPT rephrases material from the Web instead of quoting it word for word makes it seem like a student expressing ideas in her own words, rather than simply regurgitating what she’s read; it creates the illusion that ChatGPT understands the material. In human students, rote memorization isn’t an indicator of genuine learning, so ChatGPT’s inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we’re dealing with sequences of words, lossy compression looks smarter than lossless compression.

A lot of uses have been proposed for large-language models. Thinking about them as blurry jpegs offers a way to evaluate what they might or might not be well suited for. Let’s consider a few scenarios.

Can large-language models take the place of traditional search engines? For us to have confidence in them, we would need to know that they haven’t been fed propaganda and conspiracy theories—we’d need to know that the jpeg is capturing the right sections of the Web. But, even if a large-language model includes only the information we want, there’s still the matter of blurriness. There’s a type of blurriness that is acceptable, which is the re-stating of information in different words. Then there’s the blurriness of outright fabrication, which we consider unacceptable when we’re looking for facts. It’s not clear that it’s technically possible to retain the acceptable kind of blurriness while eliminating the unacceptable kind, but I expect that we’ll find out in the near future.

Even if it is possible to restrict large-language models from engaging in fabrication, should we use them to generate Web content? This would make sense only if our goal is to repackage information that’s already available on the Web. Some companies exist to do just that—we usually call them content mills. Perhaps the blurriness of large-language models will be useful to them, as a way of avoiding copyright infringement. Generally speaking, though, I’d say that anything that’s good for content mills is not good for people searching for information. The rise of this type of repackaging is what makes it harder for us to find what we’re looking for online right now; the more that text generated by large-language models gets published on the Web, the more the Web becomes a blurrier version of itself.

There is very little information available about OpenAI’s forthcoming successor to ChatGPT, GPT-4. But I’m going to make a prediction: when assembling the vast amount of text used to train GPT-4, the people at OpenAI will have made every effort to exclude material generated by ChatGPT or any other large-language model. If this turns out to be the case, it will serve as unintentional confirmation that the analogy between large-language models and lossy compression is useful. Repeatedly resaving a jpeg creates more compression artifacts, because more information is lost every time. It’s the digital equivalent of repeatedly making photocopies of photocopies in the old days. The image quality only gets worse.

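Generational loss can be demonstrated without a real JPEG codec. In the sketch below, coarse quantization stands in for a lossy encoder; the quantization step grows each generation to mimic accumulating error (a fixed-step quantizer, unlike a real JPEG round trip, would be idempotent):

```python
# Each "resave" quantizes the signal, discarding a little more detail.
def lossy_resave(samples, step):
    return [(s // step) * step for s in samples]

original = list(range(0, 100, 7))       # a smooth ramp of "image" data
copy = original
for generation in range(3):
    copy = lossy_resave(copy, step=10 * (generation + 1))

print(original[:5])                     # → [0, 7, 14, 21, 28]
print(copy[:5])                         # → [0, 0, 0, 0, 0]
```

Every generation maps distinct values onto the same rung, so detail that survived one resave is destroyed by the next. Copies of copies only get worse.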

Indeed, a useful criterion for gauging a large-language model’s quality might be the willingness of a company to use the text that it generates as training material for a new model. If the output of ChatGPT isn’t good enough for GPT-4, we might take that as an indicator that it’s not good enough for us, either. Conversely, if a model starts generating text so good that it can be used to train new models, then that should give us confidence in the quality of that text. (I suspect that such an outcome would require a major breakthrough in the techniques used to build these models.) If and when we start seeing models producing output that’s as good as their input, then the analogy of lossy compression will no longer be applicable.

Can large-language models help humans with the creation of original writing? To answer that, we need to be specific about what we mean by that question. There is a genre of art known as Xerox art, or photocopy art, in which artists use the distinctive properties of photocopiers as creative tools. Something along those lines is surely possible with the photocopier that is ChatGPT, so, in that sense, the answer is yes. But I don’t think that anyone would claim that photocopiers have become an essential tool in the creation of art; the vast majority of artists don’t use them in their creative process, and no one argues that they’re putting themselves at a disadvantage with that choice.

So let’s assume that we’re not talking about a new genre of writing that’s analogous to Xerox art. Given that stipulation, can the text generated by large-language models be a useful starting point for writers to build off when writing something original, whether it’s fiction or nonfiction? Will letting a large-language model handle the boilerplate allow writers to focus their attention on the really creative parts?

Obviously, no one can speak for all writers, but let me make the argument that starting with a blurry copy of unoriginal work isn’t a good way to create original work. If you’re a writer, you will write a lot of unoriginal work before you write something original. And the time and effort expended on that unoriginal work isn’t wasted; on the contrary, I would suggest that it is precisely what enables you to eventually create something original. The hours spent choosing the right word and rearranging sentences to better follow one another are what teach you how meaning is conveyed by prose. Having students write essays isn’t merely a way to test their grasp of the material; it gives them experience in articulating their thoughts. If students never have to write essays that we have all read before, they will never gain the skills needed to write something that we have never read.

And it’s not the case that, once you have ceased to be a student, you can safely use the template that a large-language model provides. The struggle to express your thoughts doesn’t disappear once you graduate—it can take place every time you start drafting a new piece. Sometimes it’s only in the process of writing that you discover your original ideas. Some might say that the output of large-language models doesn’t look all that different from a human writer’s first draft, but, again, I think this is a superficial resemblance. Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.

There’s nothing magical or mystical about writing, but it involves more than placing an existing document on an unreliable photocopier and pressing the Print button. It’s possible that, in the future, we will build an A.I. that is capable of writing good prose based on nothing but its own experience of the world. The day we achieve that will be momentous indeed—but that day lies far beyond our prediction horizon. In the meantime, it’s reasonable to ask, What use is there in having something that rephrases the Web? If we were losing our access to the Internet forever and had to store a copy on a private server with limited space, a large-language model like ChatGPT might be a good solution, assuming that it could be kept from fabricating. But we aren’t losing our access to the Internet. So just how much use is a blurry jpeg, when you still have the original?
