Here is the full text of AI researcher Janelle Shane’s talk titled “The Danger of AI is Weirder Than You Think” at the TED conference.
Janelle Shane – TED Talk TRANSCRIPT
So, artificial intelligence is known for disrupting all kinds of industries.
What about ice cream? What kind of mind-blowing new flavors could we generate with the power of an advanced artificial intelligence?
So I teamed up with a group of coders from Kealing Middle School to find out the answer to this question. They collected over 1,600 existing ice cream flavors, and together, we fed them to an algorithm to see what it would generate. And here are some of the flavors that the AI came up with.
[Pumpkin Trash Break]
[Peanut Butter Slime]
[Strawberry Cream Disease]
These flavors are not delicious, as we might have hoped they would be.
So the question is: What happened? What went wrong? Is the AI trying to kill us? Or is it trying to do what we asked, and there was a problem?
In movies, when something goes wrong with AI, it’s usually because the AI has decided that it doesn’t want to obey the humans anymore, and it’s got its own goals, thank you very much.
In real life, though, the AI that we actually have is not nearly smart enough for that. It has the approximate computing power of an earthworm, or maybe at most a single honeybee, and actually, probably maybe less. Like, we’re constantly learning new things about brains that make it clear how much our AIs don’t measure up to real brains.
So today’s AI can do a task like identify a pedestrian in a picture, but it doesn’t have a concept of what the pedestrian is beyond that it’s a collection of lines and textures and things. It doesn’t know what a human actually is.
So will today’s AI do what we ask it to do? It will if it can, but it might not do what we actually want.
So let’s say that you were trying to get an AI to take this collection of robot parts and assemble them into some kind of robot to get from Point A to Point B.
Now, if you were going to try and solve this problem by writing a traditional-style computer program, you would give the program step-by-step instructions on how to take these parts, how to assemble them into a robot with legs and then how to use those legs to walk to Point B.
But when you’re using AI to solve the problem, it goes differently. You don’t tell it how to solve the problem, you just give it the goal, and it has to figure out for itself via trial and error how to reach that goal. And it turns out that the way AI tends to solve this particular problem is by doing this: it assembles itself into a tower and then falls over and lands at Point B.
And technically, this solves the problem. Technically, it got to Point B. The danger of AI is not that it’s going to rebel against us, it’s that it’s going to do exactly what we ask it to do.
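As a rough illustration of the point above, here is a minimal Python sketch of goal-only trial and error. It is not the experiment shown in the talk: the toy "physics", the `distance_covered` objective, the `search` function and every number in it are assumptions made up for this example. The objective only measures how far toward Point B the robot ends up, so nothing stops the search from preferring a tall body that tips over.

```python
import random

def distance_covered(design):
    """The goal exactly as stated: how far toward Point B does the robot end up?"""
    walked = design["leg_length"] * design["steps"]  # distance gained by walking
    fallen = design["height"]  # toy assumption: a tower that tips over lands its full height away
    return walked + fallen

def search(max_height, trials=2000, seed=0):
    """Trial and error: sample random designs and keep the best-scoring one."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        design = {
            "leg_length": rng.uniform(0.0, 1.0),
            "steps": rng.randint(0, 5),
            "height": rng.uniform(0.0, max_height),
        }
        if best is None or distance_covered(design) > distance_covered(best):
            best = design
    return best

# With no real limit on height, most of the winning score comes from falling over.
unconstrained = search(max_height=20.0)
# With a (hypothetical) strict height cap, walking has to do most of the work.
constrained = search(max_height=1.0)

print("unconstrained:", unconstrained)
print("constrained:  ", constrained)
```

In this sketch the only thing that changes between the two runs is the allowed height, which is one way of seeing that the behavior comes from how the problem was set up rather than from any intent on the part of the search.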
So then the trick of working with AI becomes: How do we set up the problem so that it actually does what we want?
So this little robot here is being controlled by an AI. The AI came up with a design for the robot legs and then figured out how to use them to get past all these obstacles. But when David Ha set up this experiment, he had to set it up with very, very strict limits on how big the AI was allowed to make the legs, because otherwise …
And technically, it got to the end of that obstacle course. So you see how hard it is to get AI to do something as simple as just walk.
So seeing the AI do this, you may say, OK, no fair, you can’t just be a tall tower and fall over, you have to actually, like, use legs to walk. And it turns out, that doesn’t always work, either. This AI’s job was to move fast. They didn’t tell it that it had to run facing forward or that it couldn’t use its arms. So this is what you get when you train AI to move fast, you get things like somersaulting and silly walks. It’s really common. So is twitching along the floor in a heap.
So in my opinion, you know what should have been a whole lot weirder? The “Terminator” robots. Hacking “The Matrix” is another thing that AI will do if you give it a chance.
So if you train an AI in a simulation, it will learn how to do things like hack into the simulation’s math errors and harvest them for energy. Or it will figure out how to move faster by glitching repeatedly into the floor.
When you’re working with AI, it’s less like working with another human and a lot more like working with some kind of weird force of nature. And it’s really easy to accidentally give AI the wrong problem to solve, and often we don’t realize that until something has actually gone wrong.
So here’s an experiment I did, where I wanted the AI to copy paint colors, to invent new paint colors, given the list like the ones here on the left. And here’s what the AI actually came up with.
[Sindis Poop, Turdly, Suffer, Gray Pubic]
So technically, it did what I asked it to. I thought I was asking it for, like, nice paint color names, but what I was actually asking it to do was just imitate the kinds of letter combinations that it had seen in the original. And I didn’t tell it anything about what words mean, or that there are maybe some words that it should avoid using in these paint colors. So its entire world is the data that I gave it. Like with the ice cream flavors, it doesn’t know about anything else.
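To make "imitate the kinds of letter combinations" concrete, here is a small Python sketch using a character-level Markov chain. This is a deliberately simpler stand-in and not necessarily the method used in the experiment described above. The training names in `TRAINING_NAMES` are invented for this example, and `train` and `generate` are hypothetical helper names. The model only records which character tends to follow which, so it has no idea what any word means.

```python
import random
from collections import defaultdict

# Made-up example names, not the data from the talk.
TRAINING_NAMES = ["Dusty Rose", "Sea Foam", "Slate Gray", "Misty Blue", "Sandy Beige"]

def train(names, order=2):
    """Record which character follows each `order`-character context."""
    model = defaultdict(list)
    for name in names:
        padded = "^" * order + name + "$"  # ^ marks the start, $ marks the end
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=2, rng=None, max_len=30):
    """Build a new name one character at a time from the recorded statistics."""
    rng = rng or random.Random()
    context, out = "^" * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":  # the model chose to end the name here
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out)

model = train(TRAINING_NAMES)
rng = random.Random(0)
for _ in range(5):
    print(generate(model, rng=rng))
```

Every character the sketch can produce comes from the training list, and nothing in it can distinguish a pleasant word from an unfortunate one, which is the situation the paragraph above describes.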
So it is through the data that we often accidentally tell AI to do the wrong thing. This is a fish called a tench. And there was a group of researchers who trained an AI to identify this tench in pictures.
But then when they asked it what part of the picture it was actually using to identify the fish, here’s what it highlighted. Yes, those are human fingers.
Why would it be looking for human fingers if it’s trying to identify a fish? Well, it turns out that the tench is a trophy fish, and so in a lot of pictures that the AI had seen of this fish during training, the fish looked like this.
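The same effect can be sketched in a few lines of Python. This is a toy, not the researchers' image model: the `TRAIN` list, the two yes/no features and the helper names are all assumptions made up for illustration. The "learner" simply picks whichever single feature best matches the label in the training data, and because every tench in this invented data set is being held, that feature is the fingers.

```python
# Hypothetical, hand-made training set: each "photo" is reduced to two yes/no
# features, paired with a label saying whether it shows a tench.
TRAIN = [
    ({"fingers": True,  "fins": True},  True),   # tench held up by an angler
    ({"fingers": True,  "fins": True},  True),
    ({"fingers": True,  "fins": True},  True),
    ({"fingers": False, "fins": True},  False),  # some other fish, underwater
    ({"fingers": False, "fins": True},  False),
    ({"fingers": False, "fins": False}, False),  # no fish at all
]

def accuracy(feature, examples):
    """Fraction of examples where 'feature is present' agrees with the label."""
    return sum(feats[feature] == label for feats, label in examples) / len(examples)

def best_feature(examples):
    """Pick the single feature that best predicts the label on the training data."""
    features = examples[0][0].keys()
    return max(features, key=lambda f: accuracy(f, examples))

chosen = best_feature(TRAIN)
print("feature the learner relies on:", chosen)

# A tench swimming freely, with no hands in the picture.
wild_tench = {"fingers": False, "fins": True}
prediction = wild_tench[chosen]
print("classified as tench?", prediction)
```

On this invented data the fingers feature matches the label perfectly, so the learner has no reason to look at the fish at all, and it then misses a tench that nobody is holding.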