Fei-Fei Li: Spatial Intelligence is the Next Frontier in AI (Transcript)

Read the full transcript of godmother of AI Dr. Fei-Fei Li in conversation with Diana Hu of Y Combinator on June 16, 2025 at AI Startup School in San Francisco.

Opening Remarks

DR. FEI-FEI LI: My entire career is going after problems that are just so hard, bordering on delusional. To me AGI will not be complete without spatial intelligence, and I want to solve that problem. I just love being an entrepreneur. Forget about what you have done in the past, forget about what others think of you. Just hunker down and build. That is my comfort zone.

DIANA HU: So I’m super excited here to have Dr. Fei-Fei Li. She has such a long career in AI. I’m sure a lot of you know her, right? Raise your hand.

DR. FEI-FEI LI: I know you too.

DIANA HU: She’s been named the godmother of AI. One of the first projects that you created was ImageNet in 2009, 16 years ago.

DR. FEI-FEI LI: Oh my God, don’t remind me of that now.

DIANA HU: It has over 80,000 citations and it really kicked off one of the legs of the stool for AI, which is the data problem. Tell us about how that project came about. It was pretty pioneering work back then.

The Birth of ImageNet

DR. FEI-FEI LI: Yeah. Well, first of all, Diana and Garry and everybody, thanks for inviting me here. I’m so excited to be here because I feel like I’m just one of you. I’m also an entrepreneur right now. Just started a small company. So very excited to be here.

ImageNet was… Yeah, you’re right. We actually conceived that almost 18 years ago. Time really flies. I was a first year assistant professor at Princeton. Oh, wow. Hi. Hi. Tigers. Yeah.

And the world of AI and machine learning was so different at that time. There was very little data. Algorithms, at least in computer vision, did not work. There was no industry. As far as the public was concerned, the word AI didn’t exist.

But there was still a group of us, starting from the founding fathers of AI, John McCarthy, and then going through people like Geoff Hinton. I think we just had an AI dream. We really, really wanted to make machines think and work.

With that dream, my own personal dream was to make machines see, because seeing is such a cornerstone of intelligence. Visual intelligence is not just perceiving, it’s really understanding the world and doing things in the world. So I was obsessed with the problem of making machines see.

And as I was obsessively developing machine learning algorithms at that time, we did try neural networks, but they didn’t work. We pivoted to Bayesian networks, to support vector machines, whatever it was. But one problem always haunted me, and it was the problem of generalization.

If you’re working in machine learning, you have to respect that generalization is the core mathematical foundation or goal of machine learning. In order to generalize, these algorithms need data. Yet no one had data at that time in computer vision.

And I was in the first generation of grad students who were starting to dabble in data, because I was in the first generation of graduate students who saw the Internet, the big Internet of things.

So fast forward to around 2007ish, my student and I decided that we had to take a bold bet. We had to bet that there needs to be a paradigm shift in machine learning. And that paradigm shift has to be led by data-driven methods. And there was no data.

So we’re like, okay, let’s go to the Internet, download a billion images, that’s the highest number we could get on the Internet, and then just create the world’s, the entire world’s visual taxonomy. And we use that to train and benchmark machine learning algorithms. And that was why ImageNet was conceived and came to life.

The AlexNet Breakthrough

DIANA HU: And it took a while until there were algorithms that were promising. It wasn’t until 2012 that AlexNet came out. And that was the second part of the equation for getting to AI: getting the compute, throwing enough of it at the problem, and the algorithms. Tell us about that moment where you started to see, oh, you seeded it with data, and now the community started to figure more things out for AI, right?

DR. FEI-FEI LI: So between 2009, when we published this tiny little CVPR poster, and 2012, AlexNet, there were three years when we really believed that data would drive AI, but we had very little signal in terms of whether that was working.

So we did a couple of things. One is we open-sourced it. We believed from the get-go that we had to open-source this to the entire research community for everybody to work on this. The other thing we did is we created a challenge, because we wanted the whole world’s smartest students and researchers to work on this problem. So that was what we call the ImageNet Challenge.

So every year we release a testing data set. Well, the whole ImageNet is there for training, but we release testing and then we invite everybody openly to participate. And the first couple of years were really about setting the baseline. You know, the performance was in the 30% error rate range. It wasn’t zero, or, I mean, it wasn’t completely random, but it wasn’t that great.

But the third year, 2012, and I wrote about this in a book that I published, I still remember it was around the end of summer that we were taking all the results of the ImageNet Challenge and running them on our servers. And I remember it was late one night, I was home, and I got a ping from my graduate student that said, we got a result that really, really stands out and you should take a look.

And we looked into it. It was a convolutional neural network. It wasn’t called AlexNet at that time. That team, that Geoff Hinton team, was called SuperVision.