Bill Gates feels Generative AI has plateaued, says GPT-5 will not be any better::The billionaire philanthropist, in an interview with German newspaper Handelsblatt, shared his thoughts on artificial general intelligence, climate change, and the scope of AI in the future.
You got it the wrong way around. We already have a ton of compute and what this kind of AI can do is pretty cool.
But adding more compute power and parameters won’t solve the inherent problems.
No matter what you do, it’s still just a text generator guessing the next best word. It doesn’t do real math or logic, it gets basic things wrong and hallucinates new fake facts.
Sure, it will get slightly better still, but not much. You can throw a million times the power at it and it will still fuck up in just the same ways.
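For anyone curious what "guessing the next best word" means mechanically, here is a toy sketch. It is a simple bigram counter, not how GPT actually works (GPT uses a large neural network over tokens), and the function names and the tiny corpus are made up for illustration. It does show the basic idea that the output is whatever most often followed the previous word in the training text, with no arithmetic involved.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count which word follows which in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=5):
    """Greedily append the most likely next word, one word at a time."""
    out = [start]
    for _ in range(max_words):
        nxt = predict_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Hypothetical toy corpus: the "model" only knows what it has seen here.
corpus = "two plus two is four . two plus three is five . two plus two is four"
model = train_bigram(corpus)

print(predict_next(model, "is"))   # most common word seen after "is"
print(generate(model, "two", 2))   # greedy continuation from "two"
```

The predictor says "four" after "is" only because that pairing was most frequent in the corpus, which is the behavior being criticized here, at a vastly smaller scale.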
If humans are any kind of yardstick here, I’d say all this is true of us too on many levels. The brain is a shortcut engine, not a brute force computer. It’s not solving equations to help you predict where that tennis ball will bounce next. It’s making guesses based on its corpus of past experience. Good enough guesses are frankly our brains’ bread and butter and most of us get through most days on little more than this.
It’s true that we can do more. Some of us, anyway. How many people actually exercise math and logic though? Sometimes it seems like… not a lot. And how many people hallucinate fake facts? A lot.
It’s much like evaluating self-driving cars. We may be tempted to say they’re just bloody awful, but so are human drivers.
I’d say the majority of humans know what 2 + 2 is. Chat GPT doesn’t. As it found the answer in some texts it will tell you 4, but all it takes is you telling it that’s wrong and suddenly it’s 5. So even for the most simple math problem it’s extremely easy to throw the whole thing off. Which also means for any prompt you put in it can go in wildly wrong directions at times.
And this is all with good input data, there’s plenty of trolls online and the data will only get worse (it already did, the original data up to 2021 was okayish, in the last year tons of crap was put out on top, some of it by Chat GPT itself. So the new model might input the crap it produced before, getting worse over time). The problem on top of that is that you don’t know the sources it used. If you ask about a recent event you might receive an insane answer it picked up from a right wing conspiracy site, you simply don’t know. There is no fact checking in place.
It’s a stunningly good text generator, but that’s all it is and all it ever will be, at least until they do much more than just add more compute power to it.
Hehe. I’m imagining sitting 100 human test subjects down in a lab setting and asking them what 2+2 is, and then telling them they’re wrong when they answer 4. I don’t know how many of them would guess again but I know it’s not zero. Meanwhile, GPT can probably give a better answer to any advanced math or science query than the majority of humans.
I’m a writer and a language nerd and I watch people all the time use words incorrectly because they think they know what they mean, but they really don’t. They’re just regurgitating them in what they think is the same situation they heard them. They don’t “understand” the word and are just guessing and churning out crap.
I don’t have a dog in this race but I think it’s interesting how people judge artificial intelligence with too much credit given to what goes on with human intelligence. Most people who say it’s “just a next word predictor” read that phrase somewhere and are regurgitating it, not at all dissimilarly to what LLMs do. They use phrases like “it doesn’t actually understand” without being able to define, with any clarity or precision, and without resorting to examples, what would actually impress them as real intelligence.
Only if that answer is already out there and in the model. So pretty much a Google search away.
GPT isn’t coming up with new math or science facts (at least not real ones).
It literally is a word predictor, an insanely complex one, it’s the best way to describe it. If you start with layers, parameters and so on most people lose interest. But there’s some really good explanations around.
Generic AI (real AI) has internal logic, can learn and improve itself and can do self motivated actions. Chat GPT can tell you exactly how to create an account and order something from Amazon, but despite being able to put that text out it will never be able to actually follow those steps itself.
That’s not true. I wanted a vba script for Excel. I don’t know vba or excel so I spent hours searching Google for help. There were explanations of functions but no working code. I tried GPT for the fun of it and it spit back working code. Code that was nowhere on the Internet.
It was able to put together functions into working code based on the definition of functions, not simply cutting and pasting what somebody else had already written.
Again, pretty similar to the vast majority of humans. How many times in your science education did you learn an equation that you’d already figured out on your own previously?
And to be fair, GPT doesn’t have hands and the ability to conduct experiments. So we have to, in a sense, judge its success on an apples to apples basis of what it, and we, do with the corpus of written knowledge.
In contrast to humans, GPT has at least read it all ;) (I say this in jest - I know it doesn’t have access to everything, but humans are too lazy to read, for the most part, even things they have access to).
This is a really good ELI5 explanation of its limit.
How is it a definitional limit on its intelligence that it can’t use interfaces designed for people with hands? You also cannot send an http request with your lips no matter how you try - that’s just not an interface made for you.
Bots can 100% operate websites and take online actions, conduct quality tests, write fake reviews. That doesn’t mean they are intelligent. I just can’t see how it has any bearing either way whether ChatGPT can place an Amazon order.
That would still give it too much credit in that case. It’s purely an input output system. You put text in (the prompt), you get text out (the result). If there is no input from you, there is no output. It doesn’t have any intrinsic functionality that runs on its own.
Maybe a bit too much for an ELI5.
This is short-sighted.
The jump to GPT 3.5 was preceded by the same general misunderstanding (we’ve reached the limit of what generative pre-trained transformers can do, we’ve reached diminishing returns, etc.). Then came a relatively small change (AFAIK it was a couple additional layers of transforms and a refinement of the training protocol) and suddenly it was displaying behaviors none of the experts expected.
Small changes will compound when factored over billions of nodes, that’s just how it goes. It’s just that nobody knows which changes will have that scale of impact, and what emergent qualities happen as a result.
It’s ok to say “we don’t know why this works” and also “there’s no reason to expect anything more from this methodology”. But I wouldn’t dismiss the possibility of further improvements.
Another way to think of this is feedback from humans will refine results. If enough people tell it that Toronto is not the capital of Canada it will start biasing toward Ottawa, for example. I have a feeling this is behind the search engine roll out.
ChatGPT doesn’t learn like that though, does it? I thought it was “static” with its training data.
I was speculating about how you can overcome hallucinations, etc., by supplying additional training data. Not specific to ChatGPT or even LLMs…
You can finetune LLMs using smaller datasets, or with RLHF (reinforcement learning from human feedback) wherein people can give ratings to responses and the model can be either “rewarded” or “penalized” based off of the ratings for a given output. This retrains the LLM to produce outputs that people prefer.
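A toy sketch of the "rewarded or penalized based on ratings" idea, for illustration only. Real RLHF typically trains a separate reward model on human ratings and then updates the LLM with a policy optimization method; this sketch collapses all of that into a two-answer softmax with a simple policy-gradient-style update. The function names, the starting scores, and the learning rate are assumptions, not anything from an actual training pipeline.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


def apply_feedback(scores, answer, rating, lr=0.5):
    """Nudge scores so a positively rated answer becomes more likely
    and a negatively rated one less likely (rating is +1 or -1)."""
    probs = softmax(scores)
    for k in scores:
        # Gradient of log p(answer) with respect to each score.
        grad = (1.0 if k == answer else 0.0) - probs[k]
        scores[k] += lr * rating * grad
    return scores


# Hypothetical starting point: the model wrongly prefers "Toronto".
scores = {"Toronto": 1.0, "Ottawa": 0.0}

# Simulated human ratings: thumbs down for Toronto, thumbs up for Ottawa.
for _ in range(20):
    apply_feedback(scores, "Toronto", rating=-1)
    apply_feedback(scores, "Ottawa", rating=+1)

probs = softmax(scores)
print(probs)  # "Ottawa" should now carry most of the probability
```

Note that the ratings only shift preferences among outputs; nothing in this loop checks whether Ottawa is actually the capital, so the result is only as good as the people doing the rating.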
Active Learning Models. Though public exposure can easily fuck it up, without adult supervision. With proper supervision though, there’s promise.
So it will always have the biases of the supervisors
Bias is inevitable. Whether it is AI or any other knowledge based system. We just have to be cognizant of it and try to remedy it.
Toronto is Canadian New York. It wants to be the capital and probably should be but it doesn’t speak enough French.
This is exactly it. And it’s funny you’re getting downvoted.
We don’t truly know the depth of ML yet and how these general models could potentially change when a few vectors in the equation change, and that’s the big unknown with it. I agree with you here that Gates’ opinion is just that and isn’t particularly well informed. Especially in comparison to what some of the industry and ML experts are saying about how far we can go with the models, how they will evolve as we change parameters/vectors/dependencies and the impact of that evolution on potential applications. It’s just too early.
I kinda get why I’m getting downvoted, honestly. The ChatGPT fanboys definitely give off an “NFT-grindset” kind of vibe, and they can be loud and overzealous with their prognosticating. It feels cathartic to make fun of the thing they’ve adopted as a centerpiece of their personality
None of that changes what is objectively the very real and very unexpected improvement these models are displaying, and we’re still not sure what it is they’re doing behind the curtain. “Predicting the next most likely word” is simply not a sufficient explanation for how these models seem to correctly interpret intent and apply factual knowledge stored in their dataset in abstract ways.
People want to squabble over anthropomorphic word choices and debate ‘consciousness’, and fair enough, it’s an interesting question. But that doesn’t really come close to what’s really interesting about the models gaining functionality when by all accounts they should only be ‘guessing the next most likely word’.
I’m not really interested in debating people who are performatively unimpressed by these products, but it bothers me that those people continue rolling their eyes when significant advancements are made. Like sure, it’s not new that ML algorithms can decode keystrokes from an audio recording, but it’s a big deal when those models can be run on consumer grade hardware and not just a super computer run by a three letter agency.
I mean, that’s more-or-less what I said. We don’t know the theoretical limits of how good that text generation is when throwing more compute at it and adding parameters for the context window. Can it generate a whole book that is fairly convincing, write legal briefs off of the sum of human legal knowledge, etc.? Ultimately, the algorithm is the same, so like you said, the same problems persist, and the definition of “better” is wishy-washy.
It will obviously get even better, but you’ll never be able to rely on it. Sure, 99.9% of that generated legal document will look perfect, till you overlook one sentence where the AI hallucinated. There is no fact checking in there, that’s the issue.