Is ChatGPT going to replace Systematic Review tools?

Our last webinar on ChatGPT and Systematic Review Tools was enormously successful. Thank you for all your active participation!

The theme sparked a great deal of interest. During the event, we welcomed a very active audience of more than 400 participants from all over the world, representing various fields of expertise. More than a thousand people registered for the event, and they also received early access to the recording.

We received a lot of feedback during the webinar and plenty of questions. Here is a little summary of our Q&A session, where we answer all your inquiries.

GENERAL AI TOOLS QUESTIONS

Is literature search, in a systematic manner, possible by Artificial Intelligence (AI) tools?

Yes, in part. AI can be used to validate search strategies and to find similar documents.

Can we use AI tools for the screening of articles for meta-analysis?

AI models support screening in several ways, for example by automatically eliminating less relevant records or by prioritizing records according to their probability of inclusion.
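
To make the prioritization idea concrete, here is a minimal sketch in Python. It assumes each record already carries an inclusion probability produced by some screening model. The `Record` class, the `triage` function, the example scores, and the 0.05 cut-off are all hypothetical and chosen only for illustration. They do not describe how Laser AI or any other specific tool works.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    title: str
    inclusion_probability: float  # hypothetical score from a screening model


def prioritize(records):
    """Order records so the likeliest includes are screened first."""
    return sorted(records, key=lambda r: r.inclusion_probability, reverse=True)


def triage(records, exclude_below=0.05):
    """Split records into a prioritized human queue and an auto-excluded pile.

    The threshold is an illustrative assumption, not a recommended value.
    """
    queue = prioritize(
        [r for r in records if r.inclusion_probability >= exclude_below]
    )
    auto_excluded = [r for r in records if r.inclusion_probability < exclude_below]
    return queue, auto_excluded


# Made-up example records and scores
records = [
    Record("r1", "Randomized trial of drug A versus placebo", 0.91),
    Record("r2", "Editorial on the future of the field", 0.02),
    Record("r3", "Cohort study of adherence to drug A", 0.40),
]

queue, auto_excluded = triage(records)
print("Screen first:", [r.record_id for r in queue])
print("Auto-excluded:", [r.record_id for r in auto_excluded])
```

In practice, the cut-off for automatic exclusion would need to be validated against human decisions, and the auto-excluded records should remain available for audit.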

What is possible, or maybe possible, in the near future, in terms of creating search strategies using AI?

Existing AI tools can make the search process more efficient and partially automated by generating terms, visualizing and assessing search queries, tracing citation connections, limiting search results (i.e., study classifiers), or translating searches across databases. We anticipate that in the future, the need for a two-stage process with distinct search and screening components will become obsolete.

What are your thoughts on peer-reviewing when using AI for systematic reviews? Is "another" AI tool going to replace peer reviewing?

AI and Machine Learning face a massive problem with peer reviews. Peer review goes beyond just research papers, as it also occurs at conferences. Richard Zucker’s instrumental paper on Control Strings was initially rejected in peer review, just to give you one example. There are many stories like this in every field, but generally, people recognize that there is a peer review problem and they are trying to apply Machine Learning models to tackle this as well.

What about ethical or authenticity concerns while using AI solutions to speed up the steps within a research project? Is there a specific limit to how many stages of a research project we can use any AI solution to ensure that there is still substantial author contribution?

Systematic review guidelines have not yet fully addressed the ethical issues related to automation, but this should change in the future. To ensure transparency, the PRISMA 2020 guidelines ask authors to report details of the automation tools used in the process, e.g., how many records were excluded by a human and how many were excluded by automation tools.
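
As a small illustration of this kind of reporting, the sketch below tallies exclusions by who made the decision. The log format, the field values, and the `exclusion_counts` function are hypothetical. A real screening tool would export its own audit trail in its own format.

```python
from collections import Counter

# Hypothetical screening log: (record_id, decision, decided_by)
screening_log = [
    ("r1", "include", "human"),
    ("r2", "exclude", "automation"),
    ("r3", "exclude", "human"),
    ("r4", "exclude", "automation"),
    ("r5", "exclude", "human"),
]


def exclusion_counts(log):
    """Count excluded records, grouped by who made the decision."""
    return Counter(
        decided_by for _, decision, decided_by in log if decision == "exclude"
    )


counts = exclusion_counts(screening_log)
print(f"Records excluded by a human: {counts['human']}")
print(f"Records excluded by automation tools: {counts['automation']}")
```

Counts like these can then be reported alongside the name and version of each automation tool used.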

Who has responsibility for the errors or incorrect performance of AI systems?

Current evidence suggests that chatbots like GPT may have trouble differentiating between reliable and unreliable sources, which significantly limits their ability to gather all relevant data. Human experts should therefore oversee and validate the results produced by AI models. In 2019, the European Union published ethical guidelines for trustworthy AI, in which the authors emphasise the necessity of ensuring safety and privacy and providing accountability mechanisms. Organisations focusing on Evidence-Based Medicine will likely follow the same path and create similar guidelines focused on using AI in evidence synthesis.

How does Laser AI compare to other systematic review platforms that use AI?

The unique feature of Laser AI is that it was designed and developed as an “AI native” system: AI is central to how Laser AI functions. This is not the case with many tools on the market, especially the older ones.

ChatGPT QUESTIONS

What is the difference between GPT-4 and Laser AI?

GPT-4 is a very large language model developed by OpenAI. We do not really know how large it is, because its exact size has not been disclosed, but it is believed to have on the order of trillions of parameters. It is available either through ChatGPT Plus or through the API.

Laser AI is a tool for systematic reviews that we produce at Evidence Prime, and we make use of different available models, including GPT-4.

If ChatGPT-4, or future versions, can focus on specific known, peer-reviewed data sets, will they be reliable enough to be a base of output for a systematic review?

We do not know exactly what ChatGPT-4 was trained on and probably will not be able to find out. During information retrieval, we just ask the model to provide its best guess at references. We can try to validate its answers, but we are unsure what data was included in the training set, and so which studies the model has seen and which it has not. The massive amount of this data makes it difficult to control the results at this stage. We cannot easily verify what is there. So generally, it is still essential to perform the systematic review process thoroughly.

Can we use ChatGPT for systematic reviews of Outcome Measures using PRISMA guidelines and COSMIN methodology?

Currently, there are no recommendations for or against using ChatGPT in systematic reviews, including those related to outcome measures. It is certainly helpful for topic exploration and for finding interesting examples when we use it like a search engine. In the field of outcome measures, ChatGPT may be helpful for basic exploration of the available outcome measurements and measurement instruments, and for early exploration when developing a Core Outcome Set. However, it should be emphasized that chatbots are still not validated enough to rely on their answers. They should be used for topic exploration rather than further analysis. Moreover, results must be validated by human experts.

Can ChatGPT be used for something more straightforward, like data extraction from published studies?

Yes, based on our evaluations, it works well, especially if you use the GPT-4 model, since this is currently the only one that can use plugins simultaneously to validate across many fields and studies. This research is in progress, and we do not believe that anybody has concrete data yet on how well it works. For this reason, we think it is helpful to have a user interface where you can see the predictions next to the PDF, with a link between the two.

One of the challenges when doing a Systematic Review is including grey literature. Would ChatGPT be a good tool to use, besides the other databases, to identify additional studies that were not identified?

No, it is not a good use case, mostly because it is easy to trigger a so-called “hallucination”, a mode in which the model makes things up. The other reason is that requesting another set of results is difficult after presenting a complete knowledge base. There is not a lot of research on this topic, but we suggest tools that can search across a larger number of indexed databases, or Google Scholar, as a better way to find grey literature.

What are your thoughts on authors who use ChatGPT and other AI technologies as co-authors?

We think that ChatGPT is a tool and not a separate entity. Inventors cannot list AI tools as co-inventors on patents, as courts do not accept it. So the question here is: as these systems become more capable, will we consider them co-authors instead of just tools that draft something requiring human input and validation, or do these systems first need to gain some sort of agency?

TOTAL AUTOMATION IN THE FUTURE

Is total automation of systematic reviews possible in the future?

Perhaps it is possible, but we can think about total automation in two ways. It's a little bit like the so-called “Mechanical Turks”. For example, “automatic chess players” were created in the 19th century, but in reality, a real person was “sitting inside of the machine” playing chess and moving pieces. However, from the outside, it looked like a real machine. Using this thought experiment, let's assume that we have this technology today, but in reality, on the other side, we have a team of researchers. We would prefer this machine to give us detailed responses across all the stages of the systematic review instead of just giving us the final manuscript, regardless of the automation possibility and the timeline. We believe that we still need to follow the process, and we need the responses from the automation to be at the same level of detail as we currently have from human researchers.

I would suggest that systematic reviews will be conducted in the same way in the future, but the use case for systematic reviews will become increasingly obsolete. Imagine creating living guidelines with generative AI (e.g. during a pandemic). The evidence gathered might not be perfect, but the synthesis could be conducted within a day instead of within months. Practitioners might ask an LLM a particular review question instead of searching for systematic reviews that only cover 70% of their question. Do you think that there will still be a use case for systematic reviews in the future?

If we had a system that could conduct systematic reviews in a few minutes, we could possibly create all the living systematic reviews we need. For example, if we start with populations and then have living systematic reviews on all the interventions we can conduct in a given population and update them with living systematic reviews, then we could provide systematic reviews on demand. The bigger question is about research transparency - the assessment of the quality of these studies. We also need to consider that the production of studies and evidence will be optimized. Thus, we have robots that will do experiments and then create and publish papers.

HOW TO TRY OUT LASER AI?

Is the cost of Laser AI likely to be within the reach of academic researchers?

We are working on an academic pricing model. Stay tuned. In the meantime, contact us at info@evidenceprime.com so we can evaluate your specific need.

Is there a free open version of Laser AI, especially for non-funded research projects?

Unfortunately not, due to the cost of running all these models. However, we are open to academic collaborations on joint projects, and in those cases we are always happy to provide the tool free of charge. So please reach out to us so we can see what is possible.

Is it possible to try the platform for ongoing systematic reviews?

Reach out to us at info@evidenceprime.com so we can evaluate your specific need.

HOW TO JOIN OUR TEAM

Are open jobs at Laser AI posted on your website? Is there a way to join the talent pool?

Please check our website https://www.evidenceprime.com/jobs regularly and follow us on our social media for more updates:
LinkedIn - https://www.linkedin.com/company/evidence-prime/
Twitter - https://twitter.com/living_reviews
Facebook - https://www.facebook.com/evidenceprime
YouTube - https://www.youtube.com/@EvidencePrime


If you have missed this opportunity and would still like to watch the recording of the sessions, you can access it here.
If you would like to learn more about the Eagle of Innovation Competition, read our blog.
Read our ISPOR Attendance Summary here.
Join our AI in EBHC LinkedIn Group NOW.
If you would like to learn more about how Laser AI keeps your data safe, visit our security corner and check our Trust Center.