Algorithm That Detects Sepsis Cut Deaths by Nearly 20 Percent

2022-08-02 23:32:26

Hospital patients are at risk of a number of life-threatening complications, especially sepsis—a condition that can kill within hours and contributes to one out of three in-hospital deaths in the U.S. Overworked doctors and nurses often have little time to spend with each patient, and this problem can go unnoticed until it is too late.

Academics and electronic-health-record companies have developed automated systems that send reminders to check patients for sepsis, but the sheer number of alerts can cause health care providers to ignore or turn off these notices. Researchers have been trying to use machine learning to fine-tune such programs and reduce the number of alerts they generate. Now one algorithm has proved its mettle in real hospitals, helping doctors and nurses treat sepsis cases nearly two hours earlier on average—and cutting the condition’s hospital mortality rate by 18 percent.

Sepsis, which happens when the body’s response to an infection spirals out of control, can lead to organ failure, limb loss and death. Roughly 1.7 million adults in the U.S. develop sepsis each year, and about 270,000 of them die, according to the Centers for Disease Control and Prevention. Although most cases originate outside the hospital, the condition is a major cause of patient mortality in this setting. Catching the problem as quickly as possible is crucial to preventing the worst outcomes. “Sepsis spirals extremely fast—like in a matter of hours if you don’t get timely treatment,” says Suchi Saria, CEO and founder of Bayesian Health, a company that develops machine-learning algorithms for medical use. “I lost my nephew to sepsis. And in his case, for instance, sepsis wasn’t suspected or detected until he was already in late stages of what’s called septic shock..., where it’s much harder to recover.”

But in a busy hospital, prompt sepsis diagnosis can be difficult. Under the current standard of care, Saria explains, a health care provider should take notice when a patient displays any two out of four sepsis warning signs, including fever and confusion. Some existing warning systems alert physicians when this happens—but many patients display at least two of the four criteria during a typical hospital stay, Saria says, adding that this can give warning programs a high false-positive rate. “A lot of these other programs have such a high false-alert rate that providers are turning off that alert without even acknowledging it,” says Karin Molander, who is an emergency medicine physician and chair of the nonprofit Sepsis Alliance and was not involved in the development of the new sepsis-detection algorithm. Because of how commonly the warning signs occur, physicians must also consider factors such as a person’s age, medical history and recent lab test results. Putting together all the relevant information takes time, however—time sepsis patients do not have.
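The two-of-four rule described above can be sketched in a few lines of code, which also makes its false-positive problem concrete. The thresholds and sign names below are illustrative placeholders, not the actual clinical criteria used by any real alert system.

```python
# A minimal, hypothetical sketch of a "two of four warning signs" rule.
# Thresholds are invented for illustration only.

def naive_sepsis_flag(vitals: dict) -> bool:
    """Flag a patient when at least two of four generic warning signs are present."""
    signs = [
        vitals.get("temp_c", 37.0) > 38.3,   # fever
        vitals.get("heart_rate", 70) > 90,   # fast heart rate
        vitals.get("resp_rate", 14) > 20,    # rapid breathing
        vitals.get("confused", False),       # altered mental state
    ]
    return sum(signs) >= 2

# A mild fever plus a slightly elevated heart rate is enough to fire
# the alert, which is why many ordinary hospital patients trip a rule
# this simple at some point during their stay.
print(naive_sepsis_flag({"temp_c": 38.6, "heart_rate": 95}))  # True
print(naive_sepsis_flag({"temp_c": 37.1, "heart_rate": 80}))  # False
```

Because so many combinations of routine observations satisfy the rule, any system built on it alone inherits the high false-alert rate the article describes.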

In a well-connected electronic-records system, known sepsis risk factors are available but may take time to find. That’s where machine-learning algorithms come in. Several academic and industry groups are teaching these programs to recognize the risk factors for sepsis and other complications and to warn health care providers about which patients are in particular danger. Saria and her colleagues at Johns Hopkins University, where she directs the Machine Learning and Healthcare Lab, began work on one such algorithm in 2015. The program scanned patients’ electronic health records for factors that increase sepsis risk and combined this information with current vital signs and lab tests to create a score indicating which patients were likely to develop septic shock. A few years later, Saria founded Bayesian Health, where her team used machine learning to increase the sensitivity, accuracy and speed of their program, dubbed Targeted Real-Time Early Warning System (TREWS).
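The general approach the paragraph describes, combining chart-derived risk factors with current vitals and labs into a single risk score, can be illustrated with a toy weighted model. The feature names, weights and logistic form below are invented for illustration and bear no relation to the actual TREWS model.

```python
import math

# Toy risk-scoring sketch (NOT the TREWS model): binary risk factors
# are combined through invented weights into a probability-like score.
FEATURE_WEIGHTS = {
    "age_over_65": 0.4,
    "recent_surgery": 0.6,
    "elevated_lactate": 1.2,
    "low_blood_pressure": 0.9,
}
BIAS = -3.0  # keeps the baseline score low for patients with no risk factors

def risk_score(features: dict) -> float:
    """Map present risk factors to a 0-1 score via a logistic function."""
    z = BIAS + sum(w for name, w in FEATURE_WEIGHTS.items() if features.get(name))
    return 1.0 / (1.0 + math.exp(-z))

baseline = risk_score({})
flagged = risk_score({"elevated_lactate": True, "low_blood_pressure": True})
# A patient with several risk factors scores well above the baseline,
# which is what lets the system rank who to check on first.
assert flagged > baseline
```

Ranking patients by a continuous score, rather than firing a binary alert whenever a fixed rule trips, is what allows a system like this to surface the highest-risk patients first.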

More recently, Saria and a team of researchers assessed TREWS’s performance in the real world. The program was incorporated over two years into the workflow of about 2,000 health care providers at five sites affiliated with the Johns Hopkins Medicine system, covering both well-resourced academic institutions and community hospitals. Doctors and nurses used the program in more than 760,000 encounters with patients—including more than 17,000 who developed sepsis. The results of this trial, which suggest TREWS led to earlier sepsis diagnosis and reduced mortality, are described in three papers published in npj Digital Medicine and Nature Medicine late last month.

“I think that this model for machine learning may prove as vital to sepsis care as the EKG [electrocardiogram] machine has proved in diagnosing a heart attack,” Molander says. “It is going to allow the clinician to go from the computer..., trying to analyze 15 years’ worth of information, to go back to the bedside and reassess the patient more rapidly—which is where we need to be.”

TREWS is not the first program to demonstrate its value in such trials. Mark Sendak, a physician and population health and data science lead at the Duke Institute for Health Innovation, works on a similar program developed by Duke researchers, called Sepsis Watch. He points out that other machine-learning systems focused on health care—not necessarily those created for sepsis detection in particular—have already undergone large-scale trials. One groundbreaking test of an artificial-intelligence-based system for diagnosing a complication of diabetes was designed with input from the U.S. Food and Drug Administration. Other programs have also been tested in multiple different hospital systems, he notes.

“These tools have a valuable role in improving the way that we care for patients,” Sendak says, adding that the new system “is another example of that.” He hopes to see even more studies, ideally standardized trials that involve research support and guidance from external partners, such as the FDA, who don’t have a stake in the results. That is a tall order, because health care trials of machine-learning systems, including the new TREWS studies, are extremely difficult to design. “Anything that takes an algorithm and puts it into practice and studies how it’s used and its impact is phenomenal,” he says. “And doing that in the peer-reviewed literature—massive kudos.”

As an emergency room physician, Molander was impressed by the fact that the AI does not make sepsis decisions on behalf of health care providers. Instead it flags a patient’s electronic health record so that when doctors or nurses check the record, they see a note that the patient is at risk of sepsis, as well as a list of reasons why. Unlike some programs, the alert system for TREWS does not prevent “the clinician from doing any other further work [on the computer] without acknowledging the alert,” Molander explains. “They have a little reminder there, off in the corner of the system, saying, ‘Look, this person is at higher risk of decompensation [organ failure] due to sepsis, and these are the reasons why we think you need to be concerned.’” This helps busy doctors and nurses prioritize which patients to check on first without removing their ability to make their own decisions. “They can choose to disagree because we don’t want to take autonomy away from the provider,” Saria says. “This is a tool to assist. This is not a tool to tell them what to do.”

The trial also gathered data on whether doctors and nurses were willing to use an alert system such as TREWS. For instance, 89 percent of its notifications were actually evaluated rather than dismissed automatically, as Molander described happening with some other systems. Health care providers’ willingness to check the program could be because TREWS cut the high rate of false sepsis-warning notifications by a factor of 10, according to a press release from Bayesian Health, reducing the barrage of alerts and making it easier to distinguish which patients were in real danger. “That’s mind-blowing,” Molander says. “That is really important because it allows providers to increase their trust in machine learning.”

Building trust is important, but so is collecting evidence. Health care institutions would not be likely to accept machine-learning systems without proof they work well. “In tech, people are much more willing to adopt new ideas if they believe in the thought process. But in medicine, you really need rigorous data and prospective studies to support the claim to get scalable adoption,” Saria says.

“In some sense, we are building the products while also building the evidence base and the standards for how the work needs to be conducted and how potential adopters need to be scrutinizing the tools that we’re building,” Sendak says. Achieving widespread adoption for any algorithmic alert system is challenging because different hospitals may use different electronic-records software or may already have a competing system in place. Many hospitals also have limited resources, which makes it difficult for them to assess the effectiveness of an algorithmic alert tool—or to access technical support when such systems inevitably require repairs, updates or troubleshooting.

Still, Saria hopes to use the new trial data to expand the use of TREWS. She says she is building partnerships with multiple electronic-records companies so she can incorporate the algorithm into more hospital systems. She also wants to explore whether machine-learning algorithms could warn about other complications people can experience in hospitals. For instance, some patients must be monitored for cardiac arrest, heavy bleeding and bedsores, which can impact health during hospital stays and recuperation afterward.

“We’ve had a lot of learning around what ‘AI done right’ looks like, and we’ve published significantly on it. But what this is now showing is AI done right actually gets provider adoption,” Saria says. By incorporating an AI program into existing records systems, where it can become part of a health care provider’s workflow, “you can suddenly start chopping your way through all these preventable harms in order to improve outcomes—which benefits the system, benefits the patient and benefits the clinicians.”
