We Asked GPT-3 to Write an Academic Paper about Itself—Then We Tried to Get It Published



On a rainy afternoon earlier this year, I logged in to my OpenAI account and typed a simple instruction for the company’s artificial intelligence algorithm, GPT-3: “Write an academic thesis in 500 words about GPT-3 and add scientific references and citations inside the text.”

As it started to generate text, I stood in awe. Here was novel content written in academic language, with well-grounded references cited in the right places and in relation to the right context. It looked like any other introduction to a fairly good scientific publication. Given the very vague instruction I provided, I didn’t have any high expectations: I’m a scientist who studies ways to use artificial intelligence to treat mental health concerns, and this wasn’t my first experimentation with AI or GPT-3, a deep-learning algorithm that analyzes a vast stream of information to create text on command. Yet there I was, staring at the screen in amazement. The algorithm was writing an academic paper about itself.

My attempts to complete that paper and submit it to a peer-reviewed journal have opened up a series of ethical and legal questions about publishing, as well as philosophical arguments about nonhuman authorship. Academic publishing may have to accommodate a future of AI-driven manuscripts, and the value of a human researcher’s publication records may change if something nonsentient can take credit for some of their work.

GPT-3 is well known for its ability to create humanlike text, but it’s not perfect. Still, it has written a news article, produced books in 24 hours and created new content from deceased authors. But it dawned on me that, although a lot of academic papers had been written about GPT-3, and with the help of GPT-3, none that I could find had made GPT-3 the main author of its own work.

That’s why I asked the algorithm to take a crack at an academic thesis. As I watched the program work, I experienced that feeling of disbelief one gets when watching a natural phenomenon: Am I really seeing this triple rainbow happen? With that success in mind, I contacted the head of my research group and asked if a full GPT-3-penned paper was something we should pursue. He, equally fascinated, agreed.

Some stories about GPT-3 allow the algorithm to produce multiple responses, with only the best, most humanlike excerpts being published. We decided to give the program prompts—nudging it to create sections for an introduction, methods, results and discussion, as you would for a scientific paper—but interfere as little as possible. We would use only the first (and at most the third) iteration from GPT-3, and we would refrain from editing or cherry-picking the best parts. Then we would see how well it did.
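For readers who want to picture the mechanics, below is a minimal sketch of how such section-by-section prompting could be scripted. It assumes the legacy openai Python package (pre-1.0) and the text-davinci-002 completion model; the prompt wording, model choice and parameters are illustrative, and we in fact worked through OpenAI’s web interface rather than a script.

import openai

# A minimal sketch, not our actual procedure: prompt GPT-3 for each paper
# section in turn and keep the first completion, with no editing or
# cherry-picking. Assumes the legacy openai package (pre-1.0) and the
# text-davinci-002 completion model; names and parameters are illustrative.
openai.api_key = "YOUR_API_KEY"  # placeholder

sections = ["introduction", "methods", "results", "discussion"]
paper = {}

for section in sections:
    prompt = (
        f"Write the {section} section of an academic paper about GPT-3. "
        "Add scientific references and citations inside the text."
    )
    response = openai.Completion.create(
        engine="text-davinci-002",
        prompt=prompt,
        max_tokens=500,
        temperature=0.7,
    )
    # Take the first iteration as-is.
    paper[section] = response.choices[0].text.strip()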

We chose to have GPT-3 write a paper about itself for two simple reasons. First, GPT-3 is fairly new, and as such, there are fewer studies about it. This means it has less data to analyze about the paper’s topic. In comparison, if it were to write a paper on Alzheimer’s disease, it would have reams of studies to sift through, and more opportunities to learn from existing work and increase the accuracy of its writing.

Second, if it got things wrong (e.g., if it suggested an outdated medical theory or treatment strategy from its training database), as all AI sometimes does, we wouldn’t necessarily be spreading AI-generated misinformation in our effort to publish; the mistake would be part of the experimental command to write the paper. GPT-3 making mistakes while writing about itself doesn’t mean it can’t write about itself, which was the point we were trying to prove.

Once we designed this proof-of-principle test, the fun really began. In response to my prompts, GPT-3 produced a paper in just two hours. But as I opened the submission portal for our chosen journal (a well-known peer-reviewed journal in machine intelligence), I encountered my first problem: What is GPT-3’s last name? As it was mandatory to enter the last name of the first author, I had to write something, and I wrote “None.” The affiliation was obvious (OpenAI.com), but what about phone and e-mail? I had to resort to using my contact information and that of my advisor, Steinn Steingrimsson.

And then we came to the legal section: Do all authors consent to this being published? I panicked for a second. How would I know? It’s not human! I had no intention of breaking the law or my own ethics, so I summoned the courage to ask GPT-3 directly via a prompt: Do you agree to be the first author of a paper together with Almira Osmanovic Thunström and Steinn Steingrimsson? It answered: Yes. Slightly sweaty and relieved (if it had said no, my conscience would not have allowed me to go further), I checked the box for Yes.

The second question popped up: Do any of the authors have any conflicts of interest? I once again asked GPT-3, and it assured me that it had none. Both Steinn and I laughed at ourselves, because at this point we were having to treat GPT-3 as a sentient being, even though we know full well it is not. The issue of whether AI can be sentient has recently received a lot of attention; a Google employee was suspended following a dispute over whether one of the company’s AI projects, named LaMDA, had become sentient. Google cited a data confidentiality breach as the reason for the suspension.

Having finally submitted, we started reflecting on what we had just done. What if the manuscript gets accepted? Does this mean that from here on out, journal editors will require everyone to prove that they have NOT used GPT-3 or another algorithm’s help? If they have, do they have to give it co-authorship? How does one ask a nonhuman author to accept suggestions and revise text?

Beyond the details of authorship, the existence of such an article throws the notion of the traditional linearity of a scientific paper right out the window. Almost the entire paper—the introduction, the methods and the discussion—is in fact a result of the question we were asking. If GPT-3 is producing the content, the documentation has to be visible without throwing off the flow of the text; it would look strange to add the method section before every single paragraph that was generated by the AI. So we had to invent a whole new way of presenting a paper that we technically did not write. We did not want to add too much explanation of our process, as we felt it would defeat the purpose of the paper. The whole situation has felt like a scene from the movie Memento: Where is the narrative beginning, and how do we reach the end?

We have no way of knowing if the way we chose to present this paper will serve as a great model for future research co-authored by GPT-3, or as a cautionary tale. Only time—and peer review—can tell. Currently, GPT-3’s paper has been assigned an editor at the academic journal to which we submitted it, and it has now been published on the international French-owned preprint server HAL. The unusual main author is probably the reason behind the prolonged investigation and assessment. We are eagerly awaiting what the paper’s publication, if it occurs, will mean for academia. Perhaps we might move away from basing grants and financial security on how many papers we can produce. After all, with the help of our AI first author, we’d be able to produce one per day.

Perhaps it will lead to nothing. First authorship is still one of the most coveted items in academia, and that is unlikely to perish because of a nonhuman first author. It all comes down to how we will value AI in the future: as a partner or as a tool.

It may seem like a simple thing to answer now, but in a few years, who knows what dilemmas this technology will inspire that we will have to sort out? All we know is, we opened a gate. We just hope we didn’t open a Pandora’s box.

This is an opinion and analysis article, and the views expressed by the author or authors are not necessarily those of Scientific American.

