Your AI coding assistant is a hot mess

2023-06-23 16:06:00

Joe Reeve, an engineering manager at digital analytics platform Amplitude, used AI coding assistant GitHub Copilot to produce a segment of one of his recent coding projects. It was a fairly simple, if time-consuming, function – one that Reeve had written by himself plenty of times before. “It saved me 25 minutes of writing,” he recalls. “About two hours later, I hit a bug in the code. It took me another two to three hours to figure out what the issue was.”

The culprit? The AI tool had made a tiny but significant mistake in the code by switching the direction of a single ‘greater-than’ sign. “Since then, I’ve been much more sceptical of the code that it generates,” says Reeve – but he doesn’t want to quit using it just yet. “It’s a very, very powerful tool.” 
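
Reeve does not describe the function involved, so the following is only a hypothetical illustration of how a reversed comparison can slip through and how a few boundary checks would catch it. The function names, the limit and the test cases are all assumptions made for the sketch, not details from his project.

```python
def is_over_limit(value: float, limit: float) -> bool:
    """Intended behaviour (hypothetical): True only when value exceeds limit."""
    return value > limit


def is_over_limit_flipped(value: float, limit: float) -> bool:
    """The same function with the comparison reversed, as a suggestion might produce."""
    return value < limit


def check_boundaries(fn) -> list:
    """Run cases just above, at, and just below the limit; return failure messages."""
    cases = [
        (11, 10, True),   # just above the limit
        (10, 10, False),  # exactly at the limit
        (9, 10, False),   # just below the limit
    ]
    failures = []
    for value, limit, expected in cases:
        if fn(value, limit) != expected:
            failures.append(f"fn({value}, {limit}) should be {expected}")
    return failures


if __name__ == "__main__":
    print("correct version failures:", check_boundaries(is_over_limit))
    print("flipped version failures:", check_boundaries(is_over_limit_flipped))
```

Both versions look plausible on a quick read, which is why the check exercises values on either side of the limit rather than relying on review alone.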

GitHub Copilot, Reeve’s AI tool of choice, was launched for general use in June 2022 at a monthly cost of $10 per individual user. It quickly became one of the most widely used coding assistants, offering autocomplete-style suggestions to curious or overworked developers. Competitors soon emerged, including Baidu’s Comate and Amazon’s CodeWhisperer, which was made free to individual users in April – undercutting GitHub Copilot’s $10 price tag. And, of course, there’s the biggest name in the AI business: OpenAI’s free-to-use ChatGPT, which sparked immense interest when it was launched last November. Although ChatGPT was designed primarily for natural language processing tasks, it has proven pretty effective at writing and debugging code, even if it sometimes displays an alarming proclivity for hallucinations.

Some developers say these tools are so helpful they’ll soon become mandatory. Others aren’t so sure. Indeed, some of the world’s biggest companies remain so nervous about implementing largely untested AI that they currently prohibit access to the tools. Samsung, Amazon and Verizon have completely barred the use of ChatGPT, citing security concerns. Apple, too, has restricted the use of both ChatGPT and GitHub Copilot over fears of data leaks – especially as it works to develop its own rival coding assistants.

These fears, however, don’t seem to have stymied the tools’ rapid growth. In an earnings call in January, Satya Nadella, CEO of GitHub’s parent company Microsoft, said that GitHub Copilot had already surpassed one million users. In a recent study by consulting firm Bain & Company, 57% of surveyed software CTOs and engineering leaders said they were actively rolling out AI coding assistants. They cited increased speed, quality improvements, and lower costs as the tools’ primary benefits.

Thomas Dohmke, CEO of GitHub, the company behind Copilot, speaks at Web Summit Rio 2023 in Rio de Janeiro, Brazil. (Photo By Vaughn Ridley/Sportsfile/Getty Images)

Who’s using AI coding assistants?

Almost all of the software developers who spoke to Tech Monitor primarily use GitHub Copilot, above other AI assistants, to support their work – a fact that reflects the tool’s clear market dominance and branding advantage.

Most people at Amplitude have access to GitHub Copilot, explains Reeve. The company recently started providing paid subscriptions to the software, but a lot of its developers were already using their own accounts beforehand. “Engineers just started using [it] because it made their lives significantly easier,” even though they “have to treat it with a level of distrust,” says Reeve. His team have found AI assistants to be particularly useful for reviewing code, which can often be a frustrating and time-consuming task. “This is where tools like ChatGPT can solve existing challenges — by helping engineers quickly understand old systems and code,” says Reeve.

Mohanjith Sudirikku Hannadige, CTO at Finnish aqua-fitness startup Hydrohex, is another fan of Copilot. “It frees up developers from mundane tasks” and “makes work more enjoyable,” he says. Although human oversight remains essential for correcting the tool’s occasional mishaps, Hannadige estimates that Hydrohex’s engineers now complete their coding tasks twice as fast as they did before adopting the assistant in March. 

Christian Desrosiers, co-founder of AI start-up Visceral, says his team has also begun using tools like ChatGPT and GitHub Copilot, as well as building specialist in-house coding assistants. “We found the biggest immediate productivity gains when writing boilerplate code for stand-alone app components – for example, those that do things like interact with APIs,” he says.

Meanwhile, Perforce Software’s CTO Rod Cope says he always uses GitHub Copilot when producing his own code. “I’m starting to think of it like a remote pair-programmer,” he says. “They can kind of look over your shoulder and go: ‘Oh, what about that?’” The suggestions might not always be wholly accurate, he says, but they’re increasingly useful as jumping-off points – helping to eliminate the dread-inducing sight of a blank screen.

There are, however, some notable limitations. While there’s plenty of training data available for the most popular coding languages, like R and Python, AI tools might display stunted abilities in more niche languages. They also might not be particularly useful for more ambitious projects. “These models are trained on code that already exists,” says Reeve, “meaning the more novel or specific your use case, the less useful they’ll be.”

Risky business

As testified by Reeve’s wasted hours of bug-hunting, AI tools certainly aren’t foolproof. They’re often trained on open-source code, which frequently contains bugs – mistakes that the assistant is prone to replicating. They’re also notoriously prone to wild delusions, a fact, says Desrosiers, that cybercriminals can use to their advantage. AI coding assistants are liable to occasionally make up the existence of entire coding libraries. “Malicious actors can detect these hallucinations and launch malicious libraries with these names,” he says, “putting at risk people who let these hallucinated libraries execute in their production environment.” 

Careful oversight, says Desrosiers, is the only solution. That, too, can be facilitated by AI. “To de-risk this and other potential issues [at Visceral], we build single-purpose autonomous coding assistants to monitor for such threats,” says Desrosiers.
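
One simple form of that oversight can be sketched as a check of generated code’s imports against a list of packages a team has already vetted. This is a minimal illustration under stated assumptions, not a description of Visceral’s tooling: the allowlist, the function name and the made-up package `fastjsonx` are all hypothetical. It uses Python’s standard `ast` module, which parses the source without executing it.

```python
import ast

# Hypothetical allowlist of top-level packages the team has reviewed.
VETTED_PACKAGES = {"json", "math", "requests"}


def unvetted_imports(source: str, vetted=VETTED_PACKAGES) -> set:
    """Return top-level package names imported in `source` that are not vetted.

    The source is only parsed, never run, so nothing it imports is loaded.
    """
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> top-level package "a"
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as `from . import x`.
            if node.module and node.level == 0:
                found.add(node.module.split(".")[0])
    return found - set(vetted)


if __name__ == "__main__":
    # `fastjsonx` stands in for a package name an assistant might invent.
    generated = (
        "import json\n"
        "import fastjsonx\n"
        "from requests.adapters import HTTPAdapter\n"
    )
    print(sorted(unvetted_imports(generated)))
```

A check like this only flags unfamiliar names for a human to look at; it says nothing about whether a flagged package is actually malicious, or whether a vetted one is safe.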

David Mertz says it’s always important to not be too trusting. “From a security perspective, you just can’t trust code,” says the author and long-time Python programmer. But while Mertz agrees that constant oversight is essential when using AI, he argues that this, in practice, is little different from hiring a junior programmer. 

“There’s a […] difference in the kind of mistakes that inexperienced programmers make versus those that machines make, but they both make mistakes,” he says. Some organisations will always put themselves – and their clients – at risk by performing insufficient, or inadequate, safety checks, “but that’s not a new danger introduced by machines”.

Perhaps the biggest risk, then, is simply misplaced faith in AI. Indeed, a Stanford University study published in December 2022 found that AI tools can leave developers “deluded” about the quality of their work. Researchers found that participants with access to an AI coding assistant often produced more security vulnerabilities than those without access, yet were simultaneously more likely to believe that they’d written secure code.

James Hodson, CTO of TechAid, echoes this concern. The use of AI tools, he argues, “encourages less oversight of the engineering process, and a lower level of skilled human engagement, which ultimately leads to more security vulnerabilities, harder-to-maintain codebases, and a dilution of the human-capital skills base.” These flaws, he says, are inherent to the nature of LLMs like ChatGPT and GitHub Copilot. “Software engineering, to ensure high-quality, maintainability, and long-term fit for purpose, is an engineering process – not a linguistic generation process.”

AI tools such as ChatGPT and GitHub Copilot can save developers time, but only if they’re used carefully. Otherwise, they can create a host of challenges. (Photo by DC Studio/Shutterstock)

Is coding dead?

So, software developers probably aren’t out of a job — at least not yet. “It’s not a panacea and it’s not something that’s going to replace effective programmers,” says Mertz. “It just may be something that makes us more productive.”

Indeed, future developers will still need to have a firm grasp of coding in order to make the most of these tools — even if they improve dramatically. “If you don’t know how to code, the code that the AI assistants generate for you will always look right,” says Cope. This, he adds, means you probably won’t immediately notice nasty bugs that’ll be much tougher to tackle further down the line.  

Even so, tools such as Copilot and ChatGPT might, ultimately, make developers’ jobs more satisfying. “Some of them will be very resistant because it’ll feel like it’s taking away what’s special about what they’ve learned,” says Cope. “But I think, for the vast majority of developers, the minutiae is just tedious overhead.” 

Reeve is equally optimistic about the future of software engineering. “I think what’s considered coding is just going to change,” he says. “It used to be that coding was punching holes in bits of cardboard and feeding them through a machine […] Now, really, a lot of the software engineering that we do is thinking about names and structuring code and moving code around.” 

The rise of AI assistants, Reeve believes, could further elevate the craft. “Hopefully it means that, as humans, we’ll focus on more of the cutting-edge things,” he says, “because all the other things are going to become much easier.”

