ISPOR EU 2024 Poster
Authors: Petra Nass, Dominika Rekowska, Emanuele Arcà, Artur Nowak, Ewelina Sadowska, Ewa Borowiack, Nick Halfpenny
Introduction
Systematic literature reviews (SLRs) play an important role in health technology assessments (HTAs) but are often time-consuming and resource-intensive. The growing volume of published research and tight deadlines add to the complexity and workload of preparing high-quality SLRs. Artificial intelligence (AI) has emerged as a promising solution to accelerate the review process, particularly by taking on the role of a second reviewer during the title and abstract (TIAB) screening stage. Evidence Prime's Laser AI platform provides AI-supported SLR screening and data extraction while ensuring that human reviewers remain in control of key decisions.
Objectives
This study aimed to evaluate the accuracy of Evidence Prime's Laser AI platform, and the reduction in workload it achieved, in a real-world case study. The case involved a comprehensive SLR with eight updates, allowing AI performance to be assessed across multiple review iterations.
Methods
The analysis was based on a previously completed SLR investigating biologic treatments for Crohn's disease (CD), which had been updated eight times. Initially, TIAB screening for each update was conducted by two human reviewers, with a third reviewer resolving any conflicts. To test AI performance, Laser AI was trained using the original eligibility criteria and the inclusion/exclusion decisions from previous screening rounds. The AI then conducted the screening for each of the eight updates, and its decisions were compared against those made by the human reviewers.
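The comparison step described above can be illustrated with a short sketch. This is not the poster's actual analysis code. It assumes each record carries a boolean human consensus decision and a boolean AI decision (True meaning "include"), and the function name `tally_decisions` is hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def tally_decisions(human: Sequence[bool], ai: Sequence[bool]) -> Dict[str, int]:
    """Count agreement and disagreement between human and AI screening decisions.

    Human decisions are treated as the reference standard (an assumption
    that mirrors the comparison described in the text).
    """
    if len(human) != len(ai):
        raise ValueError("human and ai must contain one decision per record")

    counts: Counter = Counter()
    for human_include, ai_include in zip(human, ai):
        if human_include and ai_include:
            counts["tp"] += 1  # both include
        elif ai_include:
            counts["fp"] += 1  # AI includes, humans excluded
        elif human_include:
            counts["fn"] += 1  # AI excludes a study the humans included
        else:
            counts["tn"] += 1  # both exclude

    # Return every cell, including those with a zero count.
    return {key: counts[key] for key in ("tp", "fp", "tn", "fn")}


# Hypothetical toy data, not taken from the study.
example = tally_decisions(
    human=[True, False, False, True],
    ai=[True, True, False, False],
)
```

Running the tally once per update would give the per-update counts, and summing the cells across updates would give the pooled counts.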
Results
Using Laser AI resulted in a 45.4% reduction in human screening effort compared to dual human screening. Across all updates, the AI achieved a sensitivity of 95.8%, specificity of 88.5%, positive predictive value (PPV) of 8.5%, and negative predictive value (NPV) of 100%. In seven out of eight updates, the AI identified all studies included by human reviewers, achieving 100% sensitivity. In one update, the AI missed a single relevant study, resulting in 66.7% sensitivity for that round; however, the missed study was identified in the subsequent update.
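For readers less familiar with these four metrics, the sketch below shows how they follow from confusion-matrix counts under their standard definitions. The names `ConfusionCounts` and `screening_metrics` are hypothetical. Only the 2-found, 1-missed split is inferred from the text (a single missed study giving 66.7% sensitivity is consistent with three relevant studies in that update). The false-positive and true-negative counts are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # AI includes, humans included
    fp: int  # AI includes, humans excluded
    tn: int  # AI excludes, humans excluded
    fn: int  # AI excludes, humans included


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def screening_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard diagnostic-accuracy metrics, as fractions between 0 and 1."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


# tp and fn follow the inference above; fp and tn are made-up values.
one_update = screening_metrics(ConfusionCounts(tp=2, fp=20, tn=177, fn=1))
sensitivity_pct = round(100 * one_update["sensitivity"], 1)
```

Because relevant studies are rare among screened records, PPV can stay low even when specificity is high, which matches the pattern in the reported figures.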
Conclusions
The findings support the use of AI as a second reviewer during TIAB screening in SLR updates. Prioritizing high sensitivity comes with a low PPV (8.5%), which may increase the number of full-text articles to assess, but the overall reduction in manual screening effort remains substantial when AI replaces one human reviewer. Moreover, retraining the AI after each update could enhance accuracy and further reduce workload, making it a valuable tool for streamlining the SLR process.