I’m sure it’s going to get some hate, but this actually makes some weird amount of sense. Modern LLMs are basically a glorified search engine, so as long as all the relevant factors were included in its corpus, I could see it doing very well with more information in its “memory” than a human MD can hold.
It certainly makes more sense than “AIs can do math better than a grad student now!*” *Disclaimer, they actually cannot
The actual article here is worth a read, because when newspapers write about these studies, they tend to miss the point hard.
This actually has nothing to do with “memory”, but with reading text. They studied 52 doctors’ responses to standardized (read: publicly available online) cases presented to them in writing. Half got access to an LLM. The two groups did not differ significantly.
Then they ran 3 trials with the LLM alone and found that these were significantly better.
My thoughts:
1: Terribly small sample size overall, but I’d like to see more LLM numbers in particular.
2: The primary purpose of this study was to explore whether doctors are better with LLMs helping them. We’re not, and the authors make a very good point that “prompts matter” (a toy sketch of what that can look like follows after this list).
3: As is always my gripe with these kinds of things, written text translates to real patients extremely poorly. A computerized text interface is better at handling and responding to text-based patients. Human doctors are still better at treating human patients.
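On the “prompts matter” point in 2: here’s a minimal sketch of what prompt sensitivity looks like in practice, assuming the OpenAI Python client. The model name, both prompt framings, and the vignette placeholder are all illustrative; this is not what the study actually ran.

```python
# Toy illustration of prompt sensitivity, not the study's setup.
# Assumes the OpenAI Python client with an API key in the environment;
# the model name and both prompt framings are made up for illustration.
from openai import OpenAI

client = OpenAI()

case_text = "..."  # a written clinical vignette, as in the study

prompts = {
    # Bare framing: just hand over the case.
    "bare": f"What is the diagnosis?\n\n{case_text}",
    # Structured framing: ask for a ranked differential with reasoning.
    "structured": (
        "You are assisting with a diagnostic reasoning exercise. "
        "Read the case, list a ranked differential diagnosis, and "
        "justify each candidate from findings in the case.\n\n"
        f"{case_text}"
    ),
}

for name, prompt in prompts.items():
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {name} ---")
    print(response.choices[0].message.content)
```

The same case can score very differently depending on which framing the person typing happens to use, which is exactly why “doctors plus LLM” and “LLM alone” are hard to compare.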
They studied 52 doctors’ responses to standardized (read: publicly available online) cases presented to them in writing.
…
Then they ran 3 trials with the LLM alone and found that these were significantly better.
How do they know that the answers to the “standardized” publicly available case studies were not in the training data of the LLM? Isn’t it extremely likely that they were?
It’s very likely that they were in the training data; I forgot to include that as a point. Unfortunately, that’s a very difficult variable to control in LLM research.
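For what it’s worth, one rough heuristic for probing contamination is a verbatim-completion test: feed the model the first half of a published case and check whether it reproduces the actual continuation. A minimal sketch below, assuming the same OpenAI client as above; the model name is illustrative, and a high score is suggestive of memorization, not proof.

```python
# Rough contamination probe: if the model reproduces the published
# continuation of a case near-verbatim, the case was plausibly in its
# training data. Suggestive only; a low score proves nothing either.
# Assumes the OpenAI Python client; the model name is illustrative.
from difflib import SequenceMatcher

from openai import OpenAI

client = OpenAI()

case_text = "..."  # full text of a published standardized case
half = len(case_text) // 2
prefix, true_continuation = case_text[:half], case_text[half:]

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{
        "role": "user",
        "content": "Continue this text exactly as published:\n\n" + prefix,
    }],
)
guess = response.choices[0].message.content

# Similarity near 1.0 suggests the model memorized this exact text.
score = SequenceMatcher(None, guess, true_continuation).ratio()
print(f"overlap with true continuation: {score:.2f}")
```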
Also, current LLMs are great at modelling best practices.
Most disease diagnosis, even for rare diseases, follows predictable paths. Human doctors would need superhuman memories to do as well.
What’s more exciting to me is that this knowledge is now free. Free as in beer.
People talk of UBI, but what about universal services that cost nothing?
UBI and UBS aren’t opposites. UBI is the last level of UBS.