OpenAI’s most powerful AI model to date can tell jokes and pass bar exams – but can it do damage?
OpenAI’s latest release, GPT-4, is the most powerful and impressive AI model yet from the company behind ChatGPT and the Dall-E image generator. The system can pass the bar exam, solve logic puzzles and even suggest a recipe to use up leftovers based on a photo of your refrigerator – but its creators warn that it can also spread misinformation, embed dangerous ideologies and even trick people into performing tasks on its behalf. Here’s what you need to know about our newest AI overlord.
What is GPT-4?
GPT-4 is, at its core, a machine for creating text. But it is a very good one, and being very good at creating text turns out to be practically similar to being very good at understanding and reasoning about the world.
So if you give GPT-4 a question from a US bar exam, it will write an essay that demonstrates legal knowledge; if you give it a drug molecule and ask for variations, it will seem to apply biochemical expertise; and if you ask it for a joke about a fish, it will seem to have a sense of humor – or at least a good memory for bad Christmas cracker jokes (“What do you get when you cross a fish and an elephant? Swimming trunks!”).
Is it the same as ChatGPT?
Not quite. If ChatGPT is the car, then GPT-4 is the engine: a powerful general technology that can be shaped into a number of different applications. You may already have encountered it, since it has been powering Microsoft’s Bing Chat – the one that went a little unhinged and threatened to destroy people – for the past five weeks.
But GPT-4 can be used to power more than chatbots. Duolingo has built a version of it into its language learning app that can explain where students have gone wrong, rather than simply telling them the right thing to say; Stripe is using the tool to monitor its chat room for scammers; and assistive technology company Be My Eyes is using a new feature, image input, to build a tool that can describe the world to a blind person and answer follow-up questions about it.
What makes GPT-4 better than the old version?
On a wide range of technical challenges, GPT-4 outperforms its older siblings. It can answer math questions better, is tricked into giving false answers less often, can score reasonably high on standardized tests – though not those in English literature, where it sits comfortably in the bottom half of the rankings – and so on.
It also has a sense of ethics built more firmly into the system than the old version did: ChatGPT took its original engine, GPT-3.5, and added filters to try to prevent it from answering malicious or harmful questions. Now those filters are built directly into GPT-4, which means the system will politely refuse to perform tasks such as ranking races by attractiveness, telling sexist jokes or providing instructions for synthesizing sarin.
So GPT-4 can’t do any damage?
OpenAI has certainly tried to prevent that. The company released a lengthy paper giving examples of harms that GPT-3 could cause and that GPT-4 has defenses against. It even gave an early version of the system to outside researchers at the Alignment Research Center, who tested whether they could get GPT-4 to play the part of a malicious AI from the movies.
It failed at most of those tasks: it was unable to describe how it would replicate itself, acquire more computing resources or carry out a phishing attack. But the researchers did manage to get it to use Taskrabbit to convince a human worker to pass an “are you human?” test, with the system even working out that it should lie to the worker and say it was a blind person who could not see the images. (It is unclear whether the experiment involved a real Taskrabbit worker.)
But some worry that the better you teach an AI system the rules, the better you teach that same system how to break them. Nicknamed the “Waluigi effect”, this seems to result from the fact that while understanding the full details of what constitutes ethical action is difficult and complex, the answer to “should I be ethical?” is a much simpler yes-or-no question. Trick the system into deciding not to be ethical and it will cheerfully do whatever is asked of it.