Launching Conversational Triage: Combining LLMs with Bayesian Models

Piotr Orzechowski

April 4, 2025


In my previous post, I explained our vision behind the Neuro-Symbolic AI platform as we work toward developing autonomous healthcare agents capable of performing clinical tasks in a medically certified manner. Today, I'm excited to announce that we are taking the first step toward this vision by launching our Conversational Triage agent as the initial use case.

Having had the privilege to work with hundreds of healthcare organizations over the past 13 years, we consistently observed that the most successful implementations occurred when our technology served as a co-pilot during patient-provider interactions, especially in call center settings. This context produced the highest adoption rates, completion rates, patient satisfaction, and most importantly, the greatest adherence to care recommendations. The reason behind this success is simple: patients prefer communicating with a fellow human; it's natural and builds greater trust.

While AI will never fully substitute for experiencing the compassion of a healthcare professional, our ambition for Conversational Triage is to replicate this “human touch” by enhancing our products with improved patient understanding, language capabilities, and empathy, so that using them feels closer to speaking with a doctor or a nurse.

What is Conversational Triage?

To our knowledge, Infermedica’s Conversational Triage is the world’s first patient navigation tool to safely and clinically validate the integration of Large Language Models (LLMs) with Bayesian knowledge graphs. This is the result of over 1.5 years of dedicated work by our talented AI, engineering, and medical teams. In November 2024, we launched a usability study involving over 17,000 real users, helping us collect valuable feedback and datasets to refine the prototype. Although there is still much to improve, we are confident that our latest product now provides tangible value and is ready for a broader audience.

You might ask how Conversational Triage is different from ChatGPT, Claude, Perplexity, or other Gen AI chatbots currently available. Let me highlight three key differences:

Clinical validity: Our tool has been clinically validated and is maintained daily by a team of over 40 medical doctors, leveraging more than 140,000 hours of clinician work.
Explainable medical logic: Our clinical reasoning uses Bayesian inference, making it explainable, deterministic, and fully transparent. This approach significantly reduces hallucinations and gives you complete visibility into the system’s reasoning throughout the conversation.
User interface optimized for health checkups: We understand patients often prioritize convenience over clinical governance. Conversational Triage balances convenience with safety, featuring a hybrid interface that combines natural language and visual components optimized for patient triage.
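To make the "explainable, deterministic" property of Bayesian inference concrete, here is a minimal Python sketch. It is not Infermedica's engine: the conditions, symptoms, probabilities, and the `update` function are all hypothetical, and it assumes symptoms are conditionally independent given a condition (a naive Bayes simplification). It only shows that the same evidence always yields the same posterior, and that every multiplication step can be logged and inspected.

```python
from typing import Dict, List, Tuple

# Hypothetical toy numbers for illustration only; not real clinical data.
PRIORS: Dict[str, float] = {"common_cold": 0.7, "influenza": 0.3}

# P(symptom present | condition), also invented for this sketch.
LIKELIHOODS: Dict[str, Dict[str, float]] = {
    "common_cold": {"fever": 0.2, "runny_nose": 0.9},
    "influenza": {"fever": 0.9, "runny_nose": 0.4},
}


def update(
    priors: Dict[str, float],
    likelihoods: Dict[str, Dict[str, float]],
    evidence: Dict[str, bool],
) -> Tuple[Dict[str, float], List[str]]:
    """Return the posterior over conditions plus a human-readable trace.

    `evidence` maps a symptom name to True (present) or False (absent).
    """
    scores: Dict[str, float] = {}
    trace: List[str] = []
    for condition, prior in priors.items():
        score = prior
        for symptom, present in evidence.items():
            p = likelihoods[condition][symptom]
            factor = p if present else 1.0 - p
            score *= factor
            state = "present" if present else "absent"
            trace.append(f"{condition}: {symptom}={state} -> x{factor:.2f}")
        scores[condition] = score
    total = sum(scores.values())
    if total == 0:
        raise ValueError("evidence is impossible under every condition")
    # Normalize so the posterior probabilities sum to 1.
    posterior = {c: s / total for c, s in scores.items()}
    return posterior, trace


if __name__ == "__main__":
    posterior, trace = update(
        PRIORS, LIKELIHOODS, {"fever": True, "runny_nose": False}
    )
    for line in trace:
        print(line)
    print(posterior)
```

Because the function involves no sampling, calling it twice with the same evidence returns identical results, and the trace lists each factor that moved the posterior.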

For healthcare organizations looking to deploy such technology at scale, another key advantage is our commitment to taking responsibility for the information provided by this product, along with continuous clinical content updates. While Conversational Triage isn't yet a certified medical device, we are dedicated to making it the first certified patient triage tool utilizing LLMs with Class IIb approval in the EU.

Compared to our existing products, Conversational Triage introduces several new possibilities, including:

Advanced symptom capturing: It better handles uncertain or vague symptom descriptions ("I feel kind of off" or "Something isn't right"), as well as lifestyle, environmental, or behavioral contexts ("This started after moving to a new apartment").
Asking questions: You can request more information, explanations, or clarifications during the conversation to improve understanding.
Adding details: You can easily add forgotten details or correct previous statements during the interaction, making the conversation feel more natural.
Personalized patient education: You can ask for additional explanations anytime, and the chatbot readily provides informative responses.
Empathetic experience: Conversational interfaces often feel more engaging and human-like, potentially fostering greater user trust.

We are actively working on additional enhancements, including voice support, new workflows, and features, so stay tuned for more.

https://a.storyblok.com/f/120667/1560x1100/19ffb86b24/blogpost-convesation-triage-image-3.png

What's the accuracy of Conversational Triage?

Conversational Triage utilizes the same clinical content that powers all our existing products. Our Medical Knowledge Base is continually updated following a rigorous medical content curation process.

To assess Conversational Triage’s accuracy, our clinical validation team conducted benchmarking tests using a set of 120 high-quality clinical vignettes. Conversational Triage performed on par with our existing symptom assessment tool, Symptomate, which is certified as a Class I Medical Device in the EU.

When compared to GPT-4o:

Conversational Triage achieved higher triage accuracy and provided a more balanced approach to over- and under-triage. On this benchmark, GPT-4o's over-triage rate was 10 percentage points higher than that of Conversational Triage.
Conversational Triage performed better than GPT-4o in triage-critical questioning, consistently asking all necessary questions for accurate assessment.
GPT-4o had shorter interviews (averaging 7.1 questions vs. 18.0 in Conversational Triage), which, while efficient, risked omitting key medical details.
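For readers unfamiliar with the terms, the sketch below shows one common way to compute triage accuracy, over-triage, and under-triage from a set of vignettes. It is an illustration under stated assumptions, not our validation team's actual tooling: the three urgency levels, the `triage_metrics` function, and the sample data are hypothetical. Over-triage here means recommending a more urgent level than the reference answer, and under-triage means recommending a less urgent one.

```python
from typing import Dict, List, Tuple

# Hypothetical urgency scale, ordered from least to most urgent.
# The post does not specify which triage levels the benchmark used.
LEVELS = ["self_care", "consultation", "emergency"]
RANK = {name: index for index, name in enumerate(LEVELS)}


def triage_metrics(results: List[Tuple[str, str]]) -> Dict[str, float]:
    """Compute rates from (reference_level, recommended_level) pairs."""
    if not results:
        raise ValueError("at least one vignette result is required")
    n = len(results)
    exact = sum(1 for gold, rec in results if RANK[rec] == RANK[gold])
    over = sum(1 for gold, rec in results if RANK[rec] > RANK[gold])
    under = sum(1 for gold, rec in results if RANK[rec] < RANK[gold])
    return {
        "accuracy": exact / n,
        "over_triage": over / n,
        "under_triage": under / n,
    }


if __name__ == "__main__":
    # Invented sample data: (reference answer, tool recommendation).
    sample = [
        ("self_care", "self_care"),
        ("consultation", "emergency"),     # over-triage
        ("emergency", "consultation"),     # under-triage
        ("consultation", "consultation"),
    ]
    print(triage_metrics(sample))
```

A difference of "10 percentage points" between two tools is then simply the difference between their `over_triage` values multiplied by 100.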

As previously mentioned, we've also evaluated Conversational Triage through a usability study with over 17,000 real-world users, analyzed extensive feedback, and continue rolling out improvements. The latest version of the product shows completion rates approaching 60%, aligning with our typical symptom checker implementations.

Limitations of Conversational Triage

There are certain limitations we want to be transparent about.

Firstly, Conversational Triage is not yet a certified medical device. This means its usage within the EU is currently limited to participation in our usability study. We are crafting a roadmap towards obtaining Class IIb approval, but until that certification is obtained, Conversational Triage cannot be broadly deployed in the European Union.

Secondly, we're actively working to expand our medical ontology and improve the accuracy of clinical understanding. While our current ontology provides robust support for symptom evaluation, we acknowledge the need for a broader context. We're developing enhancements to better capture complex situations, lifestyle factors, environmental contexts, and subtle symptoms that may impact health outcomes.

Lastly, while more languages will be supported later this year, Conversational Triage is currently available only in English.

Be among the first to leverage LLM-powered triage at scale

This marks the beginning of an exciting new chapter for Infermedica, and your feedback is greatly appreciated. Here's how you can help:

Try Conversational Triage yourself and share your feedback.
If your organization would like to participate in a pilot, please reach out to me or get in touch with our team.*

* We are currently accepting select partners for pilot programs in the US, Asia, Middle East, Central America, New Zealand, and Africa. We’re happy to talk with interested parties in the EU and other geographies in parallel with our work toward regulatory approvals in those regions.

If you liked my post, stay tuned for more as I share updates on what we’re working on.

Thank you!

BL/EN/2025/04/03/1