Chatbots Make Terrible Doctors, New Study Finds

XLE@piefed.social · 3 months ago

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

realitista@lemmus.org · 3 months ago

Interesting. What technology are you using for this pipeline?

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

irate944@piefed.social · 3 months ago

I could’ve told you that for free, no need for a study

rudyharrelson@lemmy.radio · 3 months ago

People always say this on stories about “obvious” findings, but it’s important to have verifiable studies to cite in arguments for policy, law, etc. It’s kinda sad that it’s needed, but formal investigations are a big step up from just saying, “I’m pretty sure this technology is bullshit.”

I don’t need a formal study to tell me that drinking 12 cans of soda a day is bad for my health. But a study that’s been replicated by multiple independent groups makes it way easier to argue to a committee.

irate944@piefed.social · 3 months ago

Yeah you’re right, I was just making a joke.

But it does create some silly situations like you said

rudyharrelson@lemmy.radio · 3 months ago

I figured you were just being funny, but I’m feeling talkative today, lol

BillyClark@piefed.social · 3 months ago

it’s important to have verifiable studies to cite in arguments for policy, law, etc.

It’s also important to have for its own merit. Sometimes, people have strong intuitions about “obvious” things, and they’re completely wrong. Without science studying things, it’s “obvious” that the sun goes around the Earth, for example.

I don’t need a formal study to tell me that drinking 12 cans of soda a day is bad for my health.

Without those studies, you cannot know whether it’s bad for your health. You can assume it’s bad for your health. You can believe it’s bad for your health. But you cannot know. These aren’t bad assumptions or harmful beliefs, by the way. But the thing is, you simply cannot know without testing.

Slashme@lemmy.world · 3 months ago

Or how bad something is. “I don’t need a scientific study to tell me that looking at my phone before bed will make me sleep badly”, but the studies actually show that the effect is statistically robust but small.

In the same way, studies like this can make the distinction between different levels of advice and warning.

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

theunknownmuncher@lemmy.world · 3 months ago

A statistical model of language isn’t the same as medical training???

scarabic@lemmy.world · 3 months ago

It’s actually interesting. They found the LLMs gave the correct diagnosis high-90-something percent of the time if they had access to the notes doctors wrote about their symptoms. But when thrust into the room, cold, with patients, the LLMs couldn’t gather that symptom info themselves.

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

Hacksaw@lemmy.ca · 3 months ago

LLM gives correct answer when doctor writes it down first… Wowoweewow very nice!

tyler@programming.dev · 3 months ago

You have misunderstood what they said.

Hacksaw@lemmy.ca · 3 months ago

If you seriously think the doctor’s notes about the patient’s symptoms don’t include the doctor’s diagnostic instincts then I can’t help you.

The symptom questions ARE the diagnostic work. Your doctor doesn’t ask you every possible question. You show up and you say “my stomach hurts”. The doctor asks questions to rule things out until there is only one likely diagnosis, then they stop and prescribe you a solution if available. They don’t just ask a random set of questions. If you give the AI the notes JUST BEFORE the diagnosis and treatment, it’s completely trivial to diagnose because the diagnostic work is already complete.

God you AI people literally don’t even understand what skill, craft, trade, and art are and you think you can emulate them with a text predictor.

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

softwarist@programming.dev · 3 months ago

As neither a chatbot nor a doctor, I have to assume that subarachnoid hemorrhage has something to do with bleeding a lot of spiders.

dandelion@lemmy.blahaj.zone · 3 months ago

https://en.wikipedia.org/wiki/Subarachnoid_hemorrhage

https://en.wikipedia.org/wiki/Arachnoid_mater

it is one of the protective membranes around the brain and spinal cord, and it is named after its resemblance to spider webs, so - close enough

End-Stage-Ligma@lemmy.world · 3 months ago

can confirm, this is where spiders live inside your body

also pee is stored in the balls

cub Gucci@lemmy.today · 3 months ago

I’m going to open it wide open to kill every spider in my body

BeigeAgenda@lemmy.ca · 3 months ago

Anyone who has knowledge about a specific subject says the same: LLMs are constantly incorrect and hallucinate.

Everyone else thinks it looks right.

tyler@programming.dev · 3 months ago

That’s not what the study showed though. The LLMs were right over 98% of the time… when given the full situation by a “doctor”. The problem was normal people trying to self-diagnose who didn’t know what was important.

Hence why studies are incredibly important. Even with the text of the study right in front of you, you assumed a conclusion that the study did not reach.

Elting@piefed.social · 3 months ago

So in order to get decent medical advice from an LLM you just need to be a doctor and tell it what’s wrong with you.

tyler@programming.dev · 3 months ago

Yes, that was the conclusion.

zewm@lemmy.world · 3 months ago

It is insane to me how anyone can trust LLMs when their information is incorrect 90% of the time.

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

rumba@lemmy.zip · 3 months ago

Chatbots make terrible everything.

But an LLM properly trained on sufficient patient data metrics and outcomes in the hands of a decent doctor can cut through bias, catch things that might fall through the cracks and pack thousands of doctors’ worth of updated CME into a thing that can look at a case and go, you know, you might want to check for X. The right model can be fucking clutch at pointing out nearly invisible abnormalities on an X-ray.

You can’t ask an LLM trained on general bullshit to help you diagnose anything. You’ll end up with 32,000 Reddit posts’ worth of incompetence.

SuspciousCarrot78@lemmy.world · edit-2 15 days ago

deleted by creator

FelixCress@lemmy.world · 3 months ago

… You don’t say.

cub Gucci@lemmy.today · 3 months ago

“but have they tried Opus 4.6/ChatGPT 5.3? No? Then disregard the research, we’re on the exponential curve, nothing is relevant”

Sorry, I’ve opened Reddit this week