Friday, December 5, 2025
INBV News

How well can AI chatbots mimic doctors in a treatment setting?

INBV News by INBV News
July 18, 2024
in Health

Dr. Scott Gottlieb is a physician and served as the 23rd Commissioner of the U.S. Food and Drug Administration. He is a CNBC contributor and a member of the boards of Pfizer and several other startups in health and tech. He is also a partner at the venture capital firm New Enterprise Associates. Shani Benezra is a senior research associate at the American Enterprise Institute and a former associate producer at CBS News’ Face the Nation.

Many consumers and medical providers are turning to chatbots, powered by large language models, to answer medical questions and inform treatment choices. We decided to see whether there were major differences between the leading platforms when it came to their clinical aptitude.

To secure a medical license in the United States, aspiring doctors must successfully navigate three stages of the U.S. Medical Licensing Examination, with the third and final installment widely considered the most difficult. It requires candidates to answer about 60% of the questions correctly and, historically, the average passing score hovered around 75%.

When we subjected the major large language models to the same Step 3 examination, their performance was markedly superior, achieving scores that significantly outpaced many doctors.

But there were some clear differences between the models.

Typically taken after the first year of residency, the USMLE Step 3 gauges whether medical graduates can apply their understanding of clinical science to the unsupervised practice of medicine. It assesses a new doctor’s ability to manage patient care across a broad range of medical disciplines and includes both multiple-choice questions and computer-based case simulations.

We isolated 50 questions from the 2023 USMLE Step 3 sample test to evaluate the clinical proficiency of five different leading large language models, feeding the same set of questions to each of these platforms: ChatGPT, Claude, Google Gemini, Grok and Llama.

Other studies have gauged these models for their medical proficiency, but to our knowledge, this is the first time these five leading platforms have been compared in a head-to-head evaluation. These results could give consumers and providers some insight into where they should be turning.

Here’s how they scored:

  • ChatGPT-4o (OpenAI) — 49/50 questions correct (98%)
  • Claude 3.5 (Anthropic) — 45/50 (90%)
  • Gemini Advanced (Google) — 43/50 (86%)
  • Grok (xAI) — 42/50 (84%)
  • HuggingChat (Llama) — 33/50 (66%)
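
The percentages above follow directly from the raw counts out of 50 questions. As a minimal sketch, the arithmetic can be reproduced as shown here; the variable names and the helper function are illustrative assumptions, and the 60% figure is the approximate passing bar cited earlier in the article rather than an official cutoff for this 50-question sample.

```python
# Correct answers out of 50 questions, as reported in the article.
TOTAL_QUESTIONS = 50

# Assumption: the roughly 60% bar the article cites for Step 3 is used
# here only as a rough reference point, not as an official cutoff.
APPROX_PASS_THRESHOLD = 60.0  # percent

correct_answers = {
    "ChatGPT-4o": 49,
    "Claude 3.5": 45,
    "Gemini Advanced": 43,
    "Grok": 42,
    "HuggingChat (Llama)": 33,
}


def percent_correct(correct, total=TOTAL_QUESTIONS):
    """Convert a raw count of correct answers into a percentage."""
    return 100.0 * correct / total


scores = {model: percent_correct(n) for model, n in correct_answers.items()}

# Print the models from highest to lowest score.
for model, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    status = "above" if score >= APPROX_PASS_THRESHOLD else "below"
    print(f"{model}: {score:.0f}% ({status} the ~60% bar)")
```

Because the sample covers only 50 multiple-choice questions and none of the case simulations, comparing these figures with the passing bar gives a rough indication rather than a true exam result.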

In our experiment, OpenAI’s ChatGPT-4o emerged as the top performer, achieving a score of 98%. It provided detailed medical analyses, employing language reminiscent of a medical professional. It not only delivered answers with extensive reasoning, but also contextualized its decision-making process, explaining why alternative answers were less suitable.

Claude, from Anthropic, came in second with a score of 90%. It provided more human-like responses with simpler language and a bullet-point structure that might be more approachable to patients. Gemini, which scored 86%, gave answers that weren’t as thorough as those of ChatGPT or Claude, making its reasoning harder to decipher, but its answers were succinct and straightforward.

Grok, the chatbot from Elon Musk’s xAI, scored a respectable 84% but didn’t provide descriptive reasoning during our evaluation, making it hard to understand how it arrived at its answers. While HuggingChat, an open-source website built from Meta’s Llama, scored the lowest at 66%, it nonetheless showed good reasoning for the questions it answered correctly, providing concise responses and links to sources.

One question that most of the models got wrong related to a 75-year-old woman with a hypothetical heart condition. The question asked the physicians which was the most appropriate next step as part of her evaluation. Claude was the only model that generated the correct answer.

Another notable question focused on a 20-year-old male patient presenting with symptoms of a sexually transmitted infection. It asked physicians which of five choices was the appropriate next step as part of his workup. ChatGPT correctly determined that the patient should be scheduled for HIV serology testing in three months, but the model went further, recommending a follow-up examination in one week to ensure that the patient’s symptoms had resolved and that the antibiotics covered his strain of infection. To us, the response highlighted the model’s capability for broader reasoning, expanding beyond the binary choices presented by the exam.

These models weren’t designed for medical reasoning; they’re products of the consumer technology sector, crafted to perform tasks like language translation and content generation. Despite their non-medical origins, they’ve shown a surprising aptitude for clinical reasoning.

Newer platforms are being purpose-built to solve medical problems. Google recently introduced Med-Gemini, a refined version of its previous Gemini models that is fine-tuned for medical applications and equipped with web-based searching capabilities to enhance clinical reasoning.

As these models evolve, their skill in analyzing complex medical data, diagnosing conditions and recommending treatments will sharpen. They may offer a level of precision and consistency that human providers, constrained by fatigue and error, might sometimes struggle to match, and open the way to a future where treatment portals can be powered by machines, rather than doctors.

© 2022. All Right Reserved By Inbvnews.com
