Discussion about this post

SkinShallow:

I'm assuming that you DID check whether the studies the LLM found/"found" and used for its answers actually exist, and that the numbers in its model are really in them?

My biggest practical problem with them (amazing as they are for *manipulation of symbols/words within their universe*) is their disconnect from reality; i.e., they not only present data or sources that don't exist but have no idea that they are doing it. For me it happens ALL THE TIME: hallucinated quotes, papers, and whole books, especially when there's nothing in the world that would satisfy my query; but oddly, also even when there's no need.

So a tool that's uncannily, amazingly good at, for example, unpacking my stream of consciousness in a pseudo-therapeutic way (though it tends to fall into an infuriatingly "validating" mode unless it's corrected frequently) or drafting a rough outline of a piece of writing based on ideas I "dictate", supplementing it with some extra content, REPEATEDLY produces fake quotes from a text that exists in various versions in the public domain, or invents one out of five sources. And when I present it with a thesis it will almost never argue against/challenge it, often hallucinating information in support.

I'm not even saying that they are not edging towards AGI. I'm saying that they have a massive "knowledge" problem. Developing an ability to reason (or its functional simulacrum) is impressive; but useful reasoning is truth-conditional. Valid arguments are pretty, but we need sound ones. And models that confidently produce valid arguments based on false premises are a big problem. It's possible that the problem can be easily rectified by directly feeding the model true data, of course.

A Pox on Both Your Houses.:

The Village was stunning for me to see. I need something like this to do my own version of forecasting research. I know you have written about aggregation of "expert" opinions/forecasts. I have been following the research comparing human experts with actuarial/algorithmic approaches for decades, and have read all the Good Judgment/Tetlock clan stuff. It's interesting how some LLMs use MoE (Mixture of Experts). I am trying to do something in this area that nobody has done, and agents like this will help. I plan to clone manifold.markets as part of my project. Part of the project will test whether people's predictions can be improved in a real-world setting. That's where the altruism comes in: saving people money by teaching them how bad the "experts" are and showing them how to do better. I'm a retired psychologist and want to publish something about this. If you were interested, I'd appreciate your collaboration; let me know how to contact you.
