Read the full transcript of AI and semiconductor expert Dylan Patel’s interview on Invest Like The Best podcast with host Patrick O’Shaughnessy, “Inside the Trillion-Dollar AI Buildout”, Sep 30, 2025.
The OpenAI-Nvidia-Oracle Triangle
PATRICK O’SHAUGHNESSY: I was going to lay out this idea of going through the past, present and future of compute as the big idea for our conversation, but since it just happened, I don’t think I’ve heard you talk about it anywhere. I’d love to start by asking about this whole OpenAI Nvidia thing, which sounds exciting, seems vague, not really sure what’s going on. Maybe you could explain it to us as you see it and what the strategic implications are of the big announcement.
DYLAN PATEL: All right, so I think it’s very, very simple, right? You’ve got OpenAI paying Oracle lots of money. You’ve got Oracle paying Nvidia lots of money. You’ve got Nvidia paying OpenAI lots of money.
PATRICK O’SHAUGHNESSY: Spider-Man meme.
DYLAN PATEL: We’ve got the infinite money glitch here. No, no, no, that’s not actually what’s happening. Right. What’s really happening is OpenAI has an insatiable demand for compute. The compute precedes the buildup of the business. You have to have the cluster before you can rent it out for inference, or rather run models on it for inference. You have to have the cluster to train the model that’s good enough that it unlocks new use cases which then can be adopted. And there’s an adoption curve there for any new use case.
So you have to have all these things sequenced. And this is a game of the richest people in the world, or rather the biggest tech giants in the world, right? It’s Zuck, it’s Google, it’s Larry and Sergey, and Sergey is back working in the business constantly now, right? It’s all the biggest people in the world.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: If they don’t move fast enough and if they don’t have the most compute, or among the most compute, they will get beaten. The magic of OpenAI was that they just spent way more compute on a single model run with GPT-3 and GPT-4. And they had the foresight and the vision and the execution.
PATRICK O’SHAUGHNESSY: Yeah.
The Capital Arms Race
DYLAN PATEL: But they made that bet and they were able to secure it. And at the time it was a few hundred million dollars, whatever. Right. You know, that’s a ton of money. But now it’s sort of like, well, Mark Zuckerberg sees how much compute he’s going to have to get, even though he has this insane cash flow, and he’s like, “Oh, wait, I need to go sign a deal with Apollo for $30 billion on this mega data center I’m going to build in Louisiana.” It’s like, “Wait, why didn’t you just fund this with cash flows? You have so much cash flow.” It’s like, because of my plans. That’s just the physical data center. Now what am I going to put in it? It is so much money.
The amount of capital that people have and are dumping into this is insane. Right. Google was slow to wake up, and then they were slow to pivot their data center operations, slow to do everything. And so while they could have way more compute than anyone by a humongous degree, they haven’t been able to deploy it as fast. So OpenAI is still on the curve. And then Google has to decide how much they allocate to search. And, you know, generative search is not really necessarily competing with OpenAI. Right. It’s the mega models.
So if you have this tremendous vision of what’s going to happen with AI, you know that it takes a ton of compute to build them. You know, pretty much the amount of compute you could dedicate to these models is limitless. And they will get better. Now it’s a log-log scale, right. That is, you need 10x more compute to get to the next tier of performance.
You might think of it as diminishing returns, but what if the next tier of performance is a 6-year-old versus a 16-year-old? Child labor is quite effective with a 16-year-old; a 6-year-old you can’t get to do much. And this is not exactly the way to think of AI, but this is the conundrum that OpenAI is in. Right. They have to get more compute than anyone, or at least be among the top. They have to race with the giants. These giants are trillion-dollar businesses.
OpenAI’s Strategic Partnerships
So how does OpenAI get there? Well, it’s partnering with Microsoft. Well, that soured some, right? It’s partnering with Oracle. Well, Oracle can do a lot, but Oracle doesn’t even have a balance sheet like Google and Microsoft and Amazon and Meta, Elon, et cetera. Right.
PATRICK O’SHAUGHNESSY: Court of kings.
DYLAN PATEL: Yeah, this is very much the Pascalian wager nature of all of this with the tech giants. Oracle can be part of it, but OpenAI needs allies, right? They need people to effectively spend the capex ahead of the curve and trust that they’ll be able to pay the rental income because that’s what it is.
At the end of the day, OpenAI is committing to 5-year deals. These 5-year deals cost X amount of money. It’s $10 to $15 billion per gigawatt of data center capacity that you pay a year. And then that $10 to $15 billion for a gigawatt of data center capacity, you’re paying that for five years. Okay, that’s $50 to $75 billion of cash that goes out the door to OpenAI for 1 gigawatt of capacity.
And you talk about what Sam’s saying is, “Hey, I need 10 gigawatts, I need more than 10 gigawatts,” right? Then you end up with this really challenging aspect of how do you pay for that? And hey, that’s only the rental price. If I were to actually do the capex, or if I were to, because it’s front-loaded, right? It becomes who is the balance sheet for this?
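The rental math Dylan lays out is simple enough to write down. A back-of-the-envelope sketch in Python, using only his round numbers ($10 to $15 billion per gigawatt per year, five-year terms); nothing here comes from the actual contracts:

```python
# Back-of-the-envelope deal math, in $B, using the round numbers from the
# conversation: $10-15B per gigawatt per year, paid over a 5-year term.

def deal_cost_billions(gigawatts, per_gw_year_low=10, per_gw_year_high=15, years=5):
    """Total rental cash out the door over the deal term, as a (low, high) range."""
    return (gigawatts * per_gw_year_low * years,
            gigawatts * per_gw_year_high * years)

# One gigawatt: the $50-75B figure cited above.
print(deal_cost_billions(1))    # (50, 75)

# The 10+ gigawatts Sam is talking about:
print(deal_cost_billions(10))   # (500, 750)
```

The point of the exercise is that these are rental payments only; the up-front capex question of who builds the thing sits on top of this.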
Yeah, that’s the reason these deals are coming about. And so Oracle is making a massive bet, right? Larry, you know, hey, he’s getting good margin off of it, but he’s making a massive bet that this capex that he’s going to pay for OpenAI will actually be paid because, you know, he signed a $300 billion deal with OpenAI.
PATRICK O’SHAUGHNESSY: It’s like, where’s that going to come from?
DYLAN PATEL: Yeah, it’s like, your revenue is maybe $15 billion ARR this month, right? On a run rate basis it’ll get to $20 billion by the end of the year; pretty clearly it’s $16 billion now. But how do you pay for a $300 billion deal? Now if the bet works out, they’ve just made $100 billion of profit, right? Just pure cash profit. It’s crazy. But if it doesn’t work out, they’ve got this huge liability, and they’re starting to raise debt. Right. There was a small deal they signed recently, but they’re going to start raising more and more debt.
Nvidia’s Strategic Position
Yeah. So this game, and now Nvidia’s kind of got the same conundrum, right? It’s like, well, Google and Amazon are doing these deals on their own chips, TPUs and Trainium, whether it’s with Anthropic or others. They’re trying to court OpenAI, they’re trying to court other companies. How does Nvidia get into this game? Right, okay, fine. I can rely on Microsoft somewhat, I can rely on Oracle somewhat.
But at the end of the day, GPUs, if I want GPUs to be king, part of it is just my chip is the best, but part of it is also who’s going to pay the capex up front. Google and Amazon will pay the capex up front if it’s for TPUs or Trainiums. They won’t pay the capex up front necessarily for that same capacity of GPU. So you’ve got this challenging aspect. And so that’s where this Nvidia and OpenAI deal comes from.
PATRICK O’SHAUGHNESSY: I want to dig into the underlying assumptions driving this on the training and inference side because obviously there’s the willingness. Zuckerberg just needs to go down the hall to his CFO to get access to all this capital. He doesn’t even need to go down the hall.
DYLAN PATEL: He can just make it so. He’s got the voting shares.
PATRICK O’SHAUGHNESSY: Sam’s got to fly to Norway and, you know, Saudi and other places and we’re at that tier of capital.
DYLAN PATEL: You’re making it sound way easier than it is.
PATRICK O’SHAUGHNESSY: I don’t mean to. I’m just saying, you know, Zuckerberg is…
DYLAN PATEL: Hold on. If it’s this easy, let’s raise $100 billion, dude. We should do it. We can compete.
Understanding Scaling Laws and Returns
PATRICK O’SHAUGHNESSY: But I want to make sure I understand your thinking on the two underlying sides of this. One is your view on the diminishing-return curve, the return on all this spend.
DYLAN PATEL: I don’t think it’s a diminishing return. Right. I think that’s important to recognize. Right.
PATRICK O’SHAUGHNESSY: Start there. I’m going to ask about inference too, and the growth in inference token demand.
DYLAN PATEL: But it’s a log-log chart, right? That’s what scaling laws are: given no model architecture improvements, you just throw more compute, data, and model size at it, and it gets better at this pace.
PATRICK O’SHAUGHNESSY: But you’re confident that that will continue?
DYLAN PATEL: I think everything has shown that it will continue and it’s continued over more than 10 orders of magnitude.
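The scaling-law claim can be sketched as a power law, which is exactly what a straight line on a log-log chart means. The coefficient and exponent below are invented for illustration, not fitted to any real model:

```python
import math

# Toy scaling law: loss falls as a power of compute, so it plots as a straight
# line on log-log axes. The constants a and alpha are invented, not real fits.
def toy_loss(compute, a=10.0, alpha=0.05):
    return a * compute ** (-alpha)

# Each "tier" of capability costs 10x the compute of the last one.
for tier, compute in enumerate([1e21, 1e22, 1e23, 1e24]):
    print(f"tier {tier}: compute={compute:.0e}, loss={toy_loss(compute):.3f}")

# On log axes the relationship is linear:
# log(loss) = log(a) - alpha * log(compute)
assert math.isclose(
    math.log(toy_loss(1e22)) - math.log(toy_loss(1e23)),
    0.05 * math.log(10),
)
```

The "10 orders of magnitude" remark is the claim that this straight line has held as compute grew by a factor of ten billion.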
PATRICK O’SHAUGHNESSY: Wasn’t some…
DYLAN PATEL: Well, GPT-5 is not necessarily that much bigger than 4o. Right. And 4o is smaller than 4. What’s changing is the paradigm of how you spend the compute. And also, if they made a bigger model, could they even serve it now? Right? They did 4.5 and it was terrible. No one could serve it. It was actually quite a bit smarter, but they couldn’t actually serve it at any reasonable cost and speed.
This is why Anthropic has the same issue, right? Or I wouldn’t even call it an issue. But all of their revenue comes from Sonnet, doesn’t come from Opus. Right. Which is the better model? It’s bigger, but it’s slow because the hardware’s not caught up in terms of inference speed for that. And so no one wants to use a slow model. Right. The user experience sucks.
But as far as whether the model gets better at each scale of hardware spend: all the tech giants believe it. I believe it. I think a lot of people in the financial community are like, this is freaking scary. Because the moment it stops, wherever you were on the rung matters. If we went from $50 billion of spend to $500 billion of spend, well, that $500 billion is never going to have ROI. It was one thing if $50 billion didn’t have ROI, but now this $500 billion doesn’t have ROI. It’s a big problem.
The Value of Incremental Improvements
So anyways, one could think of it as diminishing returns, because when you go from $50 billion of spend to $500 billion of spend, you only move up, let’s call it, one tier of model capabilities in the absence of major algorithmic improvements. Right? And so I’m holding those off to the side for now.
But that iterative performance improvement in the model is, as I mentioned earlier, right? It’s like a 6-year-old versus a 13-year-old maybe. Right? The amount of work you can get a 13-year-old to do is, I mean, if you do it right, we frown upon that now in this civilization. But the amount of work you can get a 13-year-old to do is actually quite valuable relative to a 6-year-old.
And the same applies to a college intern versus someone who graduated and has even one year of work experience, because there’s a learning curve for kids coming out of college all the time. So there’s that learning curve, and I think, while it may be incrementally an order of magnitude more compute, the amount of value…
If you made a company full of high schoolers, and you had to refresh them every six months so they never learned too much and became really good, it would be really hard to create a valuable company. The most you could do is dig trenches and do yard work. And a lot of the time these kids wouldn’t even show up. Right.
Think about how valuable a business you could build with unlimited high schoolers, refreshed so they never build knowledge, versus college students, versus, you know, 25-to-30-year-olds. Right. The value of the business you can build changes drastically, even though incrementally it’s just five years between each of them.
Current State of AI Capabilities
PATRICK O’SHAUGHNESSY: Where do you think we are today? Which level are we at, do you think?
DYLAN PATEL: Depends on the domain, right? For software developers, I think we’re really pretty good. Right? And that’s where we’re seeing the most value creation happen, right? You see Anthropic having gone from a billion or less of revenue to $7 to $8 billion already. It’s the fastest revenue ramp we’ve ever seen for anything of this kind.
PATRICK O’SHAUGHNESSY: It’s basically all code related, right?
DYLAN PATEL: I mean, some of it’s their own Claude Code product, some of it’s Cursor, some of it’s GitHub Copilot, which also offers Anthropic models and has since the beginning of the year. It’s Windsurf, it’s all these different avenues to access the same thing. And these companies aren’t all doing the same thing; there’s tweaks and nuances to how they’re doing things differently. But it’s all code.
And in that sense it’s like: if I had a 30-year-old senior engineer at Google, and if I had infinite of those, and all it cost was capex for chips (the operational cost is actually quite low), then you could build businesses worth insane amounts. You could have a replacement for the $2 trillion of wages that go to all the software developers in the world today. Or rather, you could augment them and build twice as much or 5 times as much or 10 times as much. Because these things don’t just run on their own. Right. It’s more like a force multiplier on the existing person.
So the value creation potential is there. It’s obvious if you’ve coded at all in your life. I mean, it even works for VBA. It’s not that great for VBA, and I know a lot of people in this audience are probably VBA users.
PATRICK O’SHAUGHNESSY: Yes.
DYLAN PATEL: But, like, it’s not even terribly bad at making macros. But anyways, the value creation potential there is incredibly high. So let’s capture it. How do you capture it?
The OpenAI-Nvidia Deal Mechanics
And so this draws back to the OpenAI-Nvidia deal, because I think most people in the market don’t quite get it right. They’re like, oh, this is just round-tripping. It is to some extent. Right. As OpenAI builds out the 10 gigawatts of capacity they agreed to, Nvidia will do $100 billion of equity investment into OpenAI in the form of cash, released gigawatt by gigawatt. The first chunk of the deal in the press release is 1 gigawatt, $10 billion. Right. So pretty straight-line. But 1 gigawatt, as we established earlier, is like $50 billion to build. So Nvidia is paying $10 billion, and OpenAI has to come up with the other $40 billion somehow.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: Right. Now what they can do is go to the markets and get a loan, or get someone else to put up the money. There are these infrastructure funds trying to get into this, all these commercial real estate people trying to get into this. There’s some way they’ll be able to find other people to front the capital. Right. And then come up with a deal much like the Oracle one, except OpenAI has to do more of the work in terms of setting up the cluster, the software, the networking, et cetera.
The nice thing for Nvidia is that, of that $50 billion, maybe $35 billion is capex that goes directly to Nvidia. So in year zero, OpenAI, or its partner, spends $50 billion on the data center (the timing is not exactly that they spend it all at once), and $35 billion of that goes to Nvidia. Nvidia’s gross margin is 75%. I’m going to use simple numbers and call it $40 billion: $40 billion of revenue, $10 billion of COGS, $30 billion of gross profit.
So with those numbers, it’s effectively like half their gross profit from that deal is going directly to OpenAI in the form of an equity investment. The 25% that’s COGS, Nvidia is paying for, and then they keep the other half of the gross profit on their balance sheet, or do buybacks, whatever they want to do. So Nvidia is round-tripping some of this with OpenAI, but what’s effectively happening is OpenAI gets the opportunity to pay for a big chunk of it in equity.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: And Nvidia’s effectively lowering their prices without lowering their prices. And they’re getting ownership of a company that could very well be enormously valuable, and Nvidia comes out great either way because they’re getting the capex dollars up front. So all they’re really saying is: half of my money in this, sure, it does make its way back to me somehow. But in reality I still made half of that gross profit, and the other half is equity in a company that may or may not be worth something. A company that may or may not be able to pay the hundreds of billions of dollars of compute deals they’ve signed. Right. In which case they’d be bankrupt. Right. So those are the mechanics of the deal.
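The per-gigawatt flow of funds Dylan describes, with his deliberately simplified numbers, can be sketched like this (all figures illustrative, in billions; the split is his rounding, not disclosed deal terms):

```python
# Per-gigawatt flow of funds, using the deliberately simplified numbers from
# the conversation. Illustrative, in $B: roughly $40B of the ~$50B build goes
# to Nvidia as revenue, at 75% gross margin, with a $10B equity check back to
# OpenAI per gigawatt.

def nvidia_deal_per_gw(revenue=40.0, gross_margin=0.75, equity_to_openai=10.0):
    cogs = revenue * (1 - gross_margin)         # cost of the chips themselves
    gross_profit = revenue - cogs
    retained = gross_profit - equity_to_openai  # cash left after the OpenAI check
    return {"cogs": cogs,
            "gross_profit": gross_profit,
            "equity_to_openai": equity_to_openai,
            "retained_gross_profit": retained}

print(nvidia_deal_per_gw())
# On these numbers: $10B COGS, $30B gross profit, $10B back to OpenAI as
# equity, and $20B of gross profit retained in cash per gigawatt.
```

Scaled across the full 10 gigawatts, the same arithmetic produces the $100 billion equity figure in the press release.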
Tokenomics and the Growth of Inference Demand
PATRICK O’SHAUGHNESSY: It’s about the highest-stakes capitalism game of all time. And it’s so interesting to think about when it might run out. You mentioned if we hit that final point and we don’t see the return, we’re kind of toast, in a big hole. But I’m also curious about the other side: the ability to serve, and demand for today’s models via inference. The stat I last saw is token demand doubling every two months or something crazy. Obviously there’s all these reasoning tokens that are really exciting for some of the longer-thinking models. How do you think about the growth of the pool of demand for inference tokens, even in just today’s models? Even if we stopped things and fixed them there, and left that other side of the equation alone for a second. What’s your model for thinking about that today? What most interests you in the…
DYLAN PATEL: The growth of just… so the thing I like to call it is tokenomics. And I stumbled upon the word, actually. It’s a crypto…
PATRICK O’SHAUGHNESSY: Kill crypto, finally, once and for all.
DYLAN PATEL: I’m trying to make “tokenomics” SEO-redirect to us talking about tokenomics, and hopefully you talking about tokenomics. Hopefully everyone uses tokenomics 20 more times. It’s the economics of the tokens, right? How much compute is being spent, how much is the gross profit, what’s the value being created by these tokens? At the end of the day, that’s what’s relevant here, right?
Nvidia keeps saying “AI factory which produces intelligence,” and that intelligence has value. Let’s say you have a gigawatt of capacity. What can I serve? Well, I could serve 1,000X the tokens of a model that’s really shitty. I could serve 1X of a model that’s good. And I could serve 0.1X of a model that’s amazing. Now multiply that by whatever factor: how many users, what’s the number of tokens outputted. But the number of tokens I can serve swings by factors of 100 or 1,000 depending on the model quality.
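That capacity trade-off, with tokens served scaling inversely with the per-token cost of the model on a fixed fleet, can be sketched with the rough 1,000x / 1x / 0.1x framing above (the budget number and cost ratios are invented for illustration):

```python
# Fixed fleet, variable model size: the tokens you can serve scale inversely
# with the per-token compute cost of the model. The 1x/1,000x/10,000x cost
# ratios follow the rough framing in the conversation; the budget is invented.

FLEET_BUDGET = 10_000_000  # abstract compute units per second for a gigawatt

COST_PER_TOKEN = {          # relative compute cost per token
    "really shitty": 1,
    "good": 1_000,
    "amazing": 10_000,
}

for name, cost in COST_PER_TOKEN.items():
    print(f"{name:>13}: {FLEET_BUDGET // cost:,} tokens/sec")
```

The shitty model serves 1,000 times the tokens of the good one, and the amazing one serves a tenth; whether the extra quality is worth the thousandfold capacity hit is the whole tokenomics question.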
And so this is sort of where the whole GPT-5 thing comes around, right? OpenAI had a challenging thing. They’re like, hey, we have roughly a couple gigawatts of capacity effectively by the end of this year, more or less, a little bit less. So how do they maximize their serving capacity with that?
One avenue is: we continue to serve big models, we make bigger models, and the tokens are more expensive. But this log scale is really challenging, because yes, the value is way more, an order of magnitude more, but the cost is way more. And then the real whammy is the user experience is way worse, right? If I serve a massive, massive model, it’s slow, and users are fickle; they need the response way faster than a model like that can deliver.
PATRICK O’SHAUGHNESSY: Hard to calibrate.
DYLAN PATEL: Yeah, yeah. So there’s this user experience challenge. But really, in the end, for a given model level, I think there is a saturation point of how much demand for that intelligence there is, right? You can only have so large a child army, right, of people digging trenches, or Kony 2012, whatever it is. This is very cancelable. But you could have a much larger army, or business, at the larger level of intelligence. Right.
And so when you think about, hey, what could I have done with GPT-3? Even if we’d paused there, paused the model capabilities, the cost to serve a model of GPT-3’s quality has since tanked: it’s 2,000 times cheaper now. And then GPT-4, same thing, right? People were freaking out about DeepSeek because it was 500, 600 times cheaper. GPT-OSS came out and that’s even cheaper than that. Right. And it’s roughly the same quality again. Actually, I would argue the GPT open-source model is a little bit better than the OG GPT-4, because it can do tool calling.
And anyways, the cost of these things tanks rapidly with algorithmic improvement, right? Not necessarily the model getting bigger. But at X level of intelligence, you can only serve so much demand. And the flip side is that demand takes time: it takes time for people to realize how to use it.
Model Evolution and Adoption Curves
So when GPT-3 launched, no one cared. When GPT-3.5 launched, most people still didn’t care. ChatGPT launched with GPT-3.5; people cared a little bit. Then GPT-4 launched on ChatGPT, and people cared a lot. But a model at the tier of GPT-3.5 or 3 can still be very useful for a lot of work in a lot of the world. Now, it’s not useful for a lot of use cases, right? For coding, it was terrible. For copywriting, it’s okay. There’s some level of use case, and the same happens with 4, but it takes time for that adoption to happen.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: And so you’ve got this challenge: if I pause on a model capability, then adoption takes way too long, and also, how can I get people to adopt it if I don’t let people use it? And so OpenAI had this tremendous problem with GPT-4o. Right. There was 4, and then 4 Turbo was smaller than 4, and 4o was smaller than 4 Turbo. What OpenAI basically did was make the model as much smaller as possible while keeping roughly the same quality, or slightly better. Right.
So from 4 to 4 Turbo, the model was less than half the size, and from 4 Turbo to 4o, 4o’s cost is way lower than 4’s. They just kept shrinking the cost. Now with 5, what could they have done? They could have said, oh, we’ll take the big step. They actually tried that with 4.5. They screwed up some things, because it was really hard to get 100,000 GPUs to work properly; there are challenges there. Also, they hadn’t figured out the whole reinforcement-learning paradigm at that time.
So they ran out of data. The scaling laws are a chart of quality versus compute, but that compute breaks down into: how much bigger do I make the model, and how much more data do I put into the model? And the Internet only has so many tokens, so you’re kind of screwed, right? So there was potentially a cliff, until reinforcement learning happened, where you can generate data and train the model to be better without the Internet having that data. But anyway, they had this problem of: you have X amount of compute, and you can’t service your users. Today, if people want to use the API, they get rate-limited, because OpenAI can’t actually serve them all.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: They have to rate limit the people on ChatGPT Free, Plus, and Pro, whatever the $200 tier is; there are different rate limits. You can only do so much deep research. I have multiple ChatGPT accounts because I use deep research so much. You kick off a bunch, you read it and you’re like, wow, I learned a ton, move on. Right?
So you have this challenge: you can’t actually serve your user base enough, so how are they ever going to move up this adoption curve? So then, as OpenAI, what’s your choice going from 4o to 5? Do you make the model way bigger and not be able to serve anyone? And because you can’t serve anyone and it’s slow to serve, the adoption curve doesn’t really get going. Or do you make the model the same size, which is what they did for GPT-5? It’s basically the same size as 4o and roughly the same cost; actually, it’s potentially a little bit cheaper.
And then you just serve way more users and get everyone further up the adoption curve. And instead of putting them on a bigger model, you put them on models that do thinking. If you’ve used GPT-5 Thinking or GPT-5 Pro, there’s more intelligence there. And so this is the whole conundrum they have, and this is where the whole tokenomics thing comes into play.
To your question, I wanted to level-set it first: how do you serve these users when demand is growing so much? I’m not doubling my hardware every two months. Right? Yes, this capex is crazy, but I’m not doubling my hardware every two months, and I am doubling my tokens every two months. So there has to be enough of a cost decrease. And there is, right? At a given level of…
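The arithmetic behind that point is worth writing down: if token demand doubles every two months on a roughly fixed fleet, cost per token has to fall at the same exponential rate just to keep serving economics flat. A toy sketch (the doubling period is the stat cited in the conversation; nothing else is assumed):

```python
# If token demand doubles every two months but the hardware fleet is roughly
# fixed, cost per token must fall by the same factor just to keep serving
# economics flat. Pure arithmetic on the doubling stat cited above.

def required_cost_decline(months, doubling_period_months=2):
    """Factor by which cost per token must drop after `months` months."""
    return 2 ** (months / doubling_period_months)

for months in (2, 6, 12, 24):
    factor = required_cost_decline(months)
    print(f"after {months:>2} months: demand x{factor:,.0f}, "
          f"so cost/token must fall ~{factor:,.0f}x")
```

A year of doubling every two months is a 64x swing, which is why the algorithmic cost declines (the 2,000x GPT-3 figure, DeepSeek, GPT-OSS) matter as much as new hardware.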
The Latency vs. Capacity Trade-off
PATRICK O’SHAUGHNESSY: …intelligence. If you could snap your fingers and change a dial somehow, what would most unlock and unleash more development? Is it just inference latency? Because then we could do bigger models and serve them much faster, in a way that consumers would enjoy. Is that the main bottleneck to be attacked?
DYLAN PATEL: Inference is always a curve, right. Like all of these things are curves and it’s a trade off. Everything in engineering is a trade off. So you have inference latency versus cost on any given hardware. GPUs can do lower latency to a certain extent, but then the cost is way higher. Or you can do really high throughput and the cost is way lower.
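That latency-versus-cost curve can be modeled crudely as a batching effect: bigger batches amortize the fixed cost of each decode step, so tokens get cheaper per unit of hardware, but every user in the batch waits longer between tokens. The constants below are invented; only the shape of the curve is the point:

```python
# Crude batching model of the inference latency/throughput trade-off. Each
# decode step has a fixed cost that a bigger batch amortizes (cheaper tokens),
# but every user in the batch waits longer per token. Constants are invented.

def decode_step(batch, fixed_ms=10.0, per_seq_ms=0.5):
    step_ms = fixed_ms + per_seq_ms * batch   # time to emit one token per sequence
    throughput = batch / step_ms * 1000       # tokens/sec across the whole batch
    latency = step_ms                         # ms between tokens for one user
    return throughput, latency

for batch in (1, 8, 64, 256):
    tput, lat = decode_step(batch)
    print(f"batch {batch:>3}: {tput:7.0f} tok/s total, {lat:6.1f} ms/token per user")
```

Sliding the batch size along this curve is the dial he describes operators setting: low batch for snappy responses at high cost per token, high batch for cheap tokens at worse latency.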
And the companies just kind of YOLO it; they set the dial where they think it makes the most sense. And there are other types of hardware which aim for their curve to be at a different spot. Maybe the GPU curve is here, but at low latency you’re deep into diminishing returns, and so someone builds hardware whose curve sits over here. It’s like, okay, maybe that’s a useful point, but actually the market cares about this point. So anyways, there’s a curve, and a question of who cares about latency. I think if I could just press a magic button…
PATRICK O’SHAUGHNESSY: Yeah, that’s what I mean.
DYLAN PATEL: Is it capacity? Is it latency?
PATRICK O’SHAUGHNESSY: What is it?
DYLAN PATEL: I think that’s a tremendous question. I’d probably still say capacity cost is more important than latency.
PATRICK O’SHAUGHNESSY: Really.
DYLAN PATEL: I think existing levels of latency are fast enough for a lot of things. Now, if the latency were 10x lower for GPT-5, then they could have made a model that was 10x bigger and served it at the same speed.
PATRICK O’SHAUGHNESSY: That’s what I’m wondering about.
DYLAN PATEL: But then you would have the same capacity issue, right? So I guess if you could have your cake and eat it too, which is all the capacity in the world and the lowest latency in the world, well, then you would just make the models way better.
I think it’s the physical reality of: if I’m at OpenAI, what do I choose to do? Do I invest more in the model that people can use, or do I invest more in the model that’s fast? Do I invest a lot in the model that most people won’t use, because it’s expensive, first of all, and even those who can afford it will often go back to the regular one?
I have access to Claude 4.1 Opus. I still use Sonnet more. Way more.
PATRICK O’SHAUGHNESSY: Just because it’s a better experience.
DYLAN PATEL: Right. Sonnet is objectively dumber, but Opus is slow.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: And my time’s worth something. I think OpenAI would have been afraid to make a model way bigger with a terrible user experience.
PATRICK O’SHAUGHNESSY: Yeah. And as a result, we’re probably just going to have to wait a little bit longer to see what the bigger models can do in practice, in a way that consumers actually engage with. It’s just going to be too hard otherwise.
Over-Parameterization and the Concept of Grokking
DYLAN PATEL: It’s not necessarily even about bigger. There’s this whole concept of over-parameterization, which is what happens if you just throw more parameters into a neural network. I’ll equate it to humans: when you had a vocab test, or some test, you memorized before you understood, and it wasn’t until you did multiple repetitions, in different forms, that you actually understood the content rather than just memorized it. It takes cycles.
And with an LLM, it’s the same thing. If you throw some data at it, it will memorize it before it generalizes. There’s this concept called grokking: you grok the subject.
PATRICK O’SHAUGHNESSY: That is, it’s the aha moment of understanding.
DYLAN PATEL: Yeah. And the models do the same thing. They memorize it up until, at some point, they understand it. And if you make the model bigger and bigger without the data changing, it just memorizes everything. And actually it starts to get worse again, because it never had the opportunity to generalize: the model was so big, there are so many weights, there’s so much capacity for information.
The challenge today is not necessarily to make the model bigger. The challenge is: how do I generate and create data in useful domains so that the model gets better at them? There’s nowhere on the Internet that shows you how to fly through a spreadsheet, not using your mouse, using only your keyboard, all these functions and all these shortcuts, right? That’s repetition, that’s reps. But there’s no data on the Internet about this.
So how do you teach a model that? It’s not going to learn it from reading the Internet over and over and over again, which you and I could never do. And so it has a level of intelligence we can’t have; we can’t read the whole Internet. But it can’t do basic stuff, like play with a spreadsheet. So how do you get it to learn these things? That’s where this whole reinforcement-learning paradigm came from.
PATRICK O’SHAUGHNESSY: Giving it environments, specific environments to learn in, and then folding it back in.
DYLAN PATEL: Right, exactly. And that’s where there’s a challenge in terms of building those environments. There are 40 startups in the Bay doing these environments now, and it’s questionable whether any of them will make it, or what will happen. But there are 40. And then these companies are also making their own environments. And these environments can be anything and everything.
Building Training Environments
PATRICK O’SHAUGHNESSY: Give me an example. Just one of the startups or something, just to get a picture.
DYLAN PATEL: These startups are just making environments for OpenAI, Anthropic, and others, right? So it’s as simple as: here is a fake Amazon, because the real Amazon’s Terms of Service ban chat models and all these things. But here’s a fake Amazon full of items. Figure out how to click around and purchase items. Figure out how to compare two items and pick one. I’ve generated a list of deodorants; three of them are fake, one of them is real, one of them’s not the one I want, and here’s the prompt for which one to buy.
And it tries many things, and you vary the prompt and all these things, but eventually it’s bought the right deodorant, you’ve succeeded, and you fold that back in. That’s a simple one.
Or it could be: hey, clean this data. Here’s a table with a ton of dirty data in it. Oh, there are colons and stuff; there’s a full address in one column. How do I separate it out into columns, so the address becomes street address, city, zip code? It’ll try a bunch of stuff, but maybe it can’t do that yet. So you drop it down a level: you give it addresses in different formats, and you slowly, iteratively teach it. That’s another example.
Another example is you're in a game, whether it's Tic-Tac-Toe or Call of Duty or a math puzzle. Whatever the game is, that's what a lot of these environments initially have been: math puzzles. It's, do this math puzzle. Oh, well, I can't do this one because it's too hard. Here's an easier one. Okay, I can work on this one. Okay, I'm good enough. Okay, now I can learn this one. And they iteratively stepped through those to where, basically from Q4 of last year to Q2 of this year, these things hill-climbed up math puzzles like crazy.
And a lot of that was not, hey, I just know the math. A lot of that was, here’s how I use Python to write something that does the math for me. And now these things are actually quite good at math.
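The "here's an easier one, okay, now I can learn this one" loop above is a curriculum. A toy sketch, where `solve()` is a placeholder for an actual model rollout and every failure stands in for a training update:

```python
# Curriculum sketch: attempt a puzzle; on failure, train (skill += 1)
# and retry; only step the difficulty up after a success.
def solve(difficulty, skill):
    return skill >= difficulty      # toy success condition

def curriculum(max_difficulty, skill=0):
    level = 1
    history = []
    while level <= max_difficulty:
        if solve(level, skill):
            history.append((level, "pass"))
            level += 1              # "okay, now I can learn this one"
        else:
            history.append((level, "fail"))
            skill += 1              # training on the failure improves us
    return history

print(curriculum(3))
```

Real curricula are far messier (graded problem banks, success-rate thresholds rather than single passes), but the shape is the same: difficulty only ratchets up as the policy improves.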
So the environments can be super varied, and it doesn't need to be something that's cut and dried. It can be: here's a medical case, what's wrong with it? And then you have another model say, well, here are instructions on how you would grade the result of a case. Does it look like they didn't even try this, or didn't even look this up? Okay, you did that wrong. And you feed that back into these models. So these environments can be very complicated.
So building those out is a challenge. It was one thing to say, I'm taking all the Internet data, I'm going to filter it some, throw it at the model. There are tons of engineering challenges there, for sure. But this is a different set of engineering challenges that takes time to build out.
The State of Pre-Training and Post-Training
PATRICK O’SHAUGHNESSY: In those two, in pure raw Internet pre-training world and in this new environments world, what inning are we in in each of those, would you say? How far into the potential benefits have we eaten?
DYLAN PATEL: This is where the whole, oh, well, then Dylan, what you're saying is you never need to make models bigger again, right? Because you've already run out of data, and until you figure out how to generate tons and tons of data, scaling is done. But actually we haven't run out.
We've seen another angle where it's mostly just been pre-training scaling: Veo 3 and Nano Banana, right? These Google image and video models, and Genie, and all these Google image and video models. And that's purely scaling on multimodality. The models still aren't that great at video and audio and images. They're fine. They could be a lot better. So there are angles of scaling there. Because when I said we've run out of the Internet, we've run out of the text. There are tons of video and image and audio left.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: It's just so expensive, so we didn't get to that.
PATRICK O’SHAUGHNESSY: So maybe late innings on text, mid innings on pre-training.
DYLAN PATEL: I think we're early on non-text. Yeah, we're quite early. And then the other angle is, just because you've used all the text doesn't mean you can't learn faster from it. You take a class, you give everyone a book, you tell them to read it once and you test them all. Well, one kid's going to get a 100, one kid's going to get a 40. It's just the reality of life.
And maybe if you read the book out loud to them, the kid who got 100 might get a 30 and the kid who got a 40 might have got a 60. So there’s these different parameters and when we talk about model architecture, the same thing happens there.
So it’s not like you stop training new models. It’s not like you don’t have algorithmic improvements or smarter kids. It’s not like pre-training is done. In fact, it’s the base of everything. So you want to keep having gains because any gains on pre-training—the model learns a little faster or the model’s a little bit smaller for the same quality—feeds into the next stage, which is this whole post-training side which will subsume the majority of the compute at some point.
PATRICK O’SHAUGHNESSY: And inning wise, are we in the second inning of that?
DYLAN PATEL: I think we’ve thrown the first ball. Wow. Because think about how we learn.
PATRICK O’SHAUGHNESSY: There’s so many environments.
Reinforcement Learning and Human Development
DYLAN PATEL: I think my favorite thing: my brother just had a baby. This baby will literally stick his hand in his mouth. I thought about it and it's like, wait, he's calibrating the senses on his fingers by sticking his hand in his mouth, because his tongue is the most sensitive thing. He doesn't know he's doing it, but that's how he's calibrating. He's like, oh, that's me. Oh, I can touch and feel. Right. It's like, how does the model learn these sorts of things? You just have to try stuff and fail.
And we're so, so early. Think about how much we see throughout our life and how much of that information we throw away. We throw all of this information away. Do I remember what I had for lunch yesterday? No. But if it was amazing or bad, I would have remembered that: oh, I don't like this, or I like this. There's all this information we throw away. And these models, these environments: yeah, we're generating tons of data, throwing most of it away, and training the model. But it's infinitesimal compared to what humans have done.
And so I think there are so many environments you can put the model in. There are people who even think you don't get to the magical AGI until you embody it. That is, you put the model in something that can interact in the real world, in a robot. I think Elon and xAI are a bit more along that angle: they think embodiment is required to get to artificial general intelligence. Because you need the model to be able to, say, pick this up, or go, oh wow, this is a rotating thingy, which you could never get from just watching a video about it. You wouldn't even get the concepts of it. And so I think we're so early in reinforcement learning, because that's what humans are. We're reinforcement learners.
PATRICK O'SHAUGHNESSY: And the so-what of it: let's say we fast forward and we're in the seventh inning of that or something like this. What do you think is the way the average person will most feel that difference, in terms of the utility of the model?
DYLAN PATEL: It'll be very different modes of using it. It's one thing to ask for information, or ask it to organize information, versus it just doing things. Those 12-year-olds, you need to really direct them how to dig a hole, because a lot of them haven't dug a hole.
PATRICK O’SHAUGHNESSY: But you’re talking about order me this vitamin and just like it’s just done.
AI-Powered Shopping and Purchasing Decisions
DYLAN PATEL: Right. And we're actually not too far away from that. Think about researching electric toothbrushes. This is something, because my electric toothbrush, I lose it, I leave it at a hotel all the time. And I've been obsessive about this. In 2021, I made a spreadsheet of all the electric toothbrushes based on how many ICs were in each one of them. This one has a Bluetooth IC. Why? This one has a display IC; it has a color display. What's going on, right? I made a spreadsheet of all this. I don't know, it's this weird little thing that I do.
But that's how I research which toothbrush I want to buy now. I bought an Oral-B iO Series 9 or whatever, right? It's comparing them. These models now can actually figure out exactly what you want. And more than 10% of Etsy's traffic comes straight from GPT. Wow. Amazon blocks GPT, but otherwise it would be really high. People make purchasing decisions through GPT. They just don't make the purchase there.
OpenAI's head of applications, the CEO of applications, was at Shopify and created the shopping agent, right? This is very clear. This is how they monetize. The models are going to purchase for you, right? They're going to do actions for you, and the company whose model does those actions for you will be able to take some sort of take rate, right? Even if it's 0.1%, even if it's 1% or 2%.
PATRICK O’SHAUGHNESSY: It’ll be like a credit card transaction.
DYLAN PATEL: Visa is the most amazing business in the world because of this, right? And ChatGPT could be that too. If I'm making my decisions on purchasing all sorts of things through it. I mean, I already almost outsource what I'm going to eat to the front-page recommendation of Uber Eats sometimes. I already outsource a lot of decisions. It's not too much further until I've completely outsourced a decision and the purchasing intent.
That's what's made Meta and Google such amazing companies: they figured out how to get the thing you want to purchase in front of you as best as possible, right? And all their work on recommendation systems is figuring out what you like, how to keep you on the platform longer, whether it's YouTube or Instagram or ByteDance with TikTok. Or it's, hey, here's the ad of the thing you'll probably click on and buy, because that's how I get paid. And everyone likes to claim they don't pay attention to ads, but you do, right?
Reasoning and Scaling Laws
PATRICK O'SHAUGHNESSY: Before asking even more holistically for your view on where we're going: there's a third category, which is the reasoning part of the equation. So we've got pre-training, we've got RL and environments, post-training. What about just raw time spent reasoning, and where that's going as its own independent part of the overall scaling law?
DYLAN PATEL: The scaling laws, again, if you zoom out, it's not exactly what the original paper says, but in spirit, sure: scaling laws are more compute, better intelligence. And that could be a bigger and bigger model, where each iterative token is better. Whatever word garbage I spew out: if I went back and wrote about everything I talked about here, I could make it way more condensed, and it could potentially be way more clear. Now, the benefit of a podcast is…
PATRICK O’SHAUGHNESSY: A lot of times people are more fun this way.
DYLAN PATEL: Driving. It's fun. Yeah, exactly, right? They're walking their dog and they're listening, whatever it is. But the interesting and important thing here is that by putting it in these environments, you're teaching it like humans, right? If I asked you to go figure something out, you might not necessarily know the answer right away, but I know you could probably figure it out in a given amount of time. That's reasoning. You're spending more brain cycles.
The magic, again, of intelligence, of humans, of people, is not that they are the best information retrieval, right? GPTs are amazing at information retrieval. What we're really good at, because we've been trained in these environments, which is our world, is figuring out how to do things iteratively. And so reasoning and these RL environments are linked together. Right.
If I'm telling a model, hey, do this math puzzle, it's not just spewing out, oh, the answer is one, oh, the answer's two, oh, the answer's three. Okay, the answer was actually seven, and once I got there, I trained it again, so now it knows for next time. Oh, the answer is six, seven, or eight. No, it's seven. Okay, great. It's not that it now instantly knows the answer. It's actually, oh, here's this puzzle. Oh, these numbers. Oh, this line. It's Sudoku. These numbers add up to this. Oh, it has one through nine, but it's missing eight. Okay, it's eight, right? It's thinking through it, like you and I would solve a Sudoku.
Now, eventually, when you get good enough at Sudoku, you could probably just spit out an answer; you could do it in your sleep. But for a long time you can't. So this reasoning time is a way of spending more compute, more brain cycles, on the task without actually scaling the model. And then the model becomes more versatile, right?
Because humans have a rate. If I just held a match against you and you didn't notice it, you'd immediately jerk, right? Because the rate at which you operate is hundreds of hertz; your body can actually take hundreds of actions per second. If you look at a fighter pilot's reaction time, the peak of human reaction time, the reaction they can take is completely primal, instinctual, right? Very little thought is put into it.
If you think about this alien intelligence that we're trying to make: is it immediately going to one-shot the answer always? No. But at times it needs to. At times it needs to be able to tell me exactly the answer in two seconds or half a second, or take whatever action it needs to take immediately. But a lot of times it also needs to think through the problem, go and do stuff.
That's why you hire students, that's why you hire interns: because you're like, yeah, I know this data exists, here's the format I kind of want it in, go figure it out. And then they spend a whole summer doing something you could have done in three days, but great, they learned a ton, right? These models need to go through that progression.
And so when I think about reasoning RL, it's a lot about how the human psyche and intelligence work. Now, there's a caution against trying to make it too much like humans, because it's not. The fundamental substrate is not like humans. The processing is not like humans. Our brain is very different from how these ALUs on a chip work. The scaling of these things is very different. The raw speed, the amount of words, everything is so different. But at the same time, it's important to hearken back to what actually makes people smart.
Memory and Attention in AI Models
PATRICK O’SHAUGHNESSY: On the topic of embodiment. And continuing with the human analogy, how do you think about things like short and long term memory in a human versus just like raw model capacity or something? Like, what role does that analogy of memory, I don’t mean literally like semiconductor memory, but memory in a model. How do you think about the importance that that will play and where are we in that?
DYLAN PATEL: The magic of Transformers was attention, right? That is, for everything in my context length, I calculate the attention of each token to each other. Basically, words live in a vector space, like king and queen. There are these vectors, with dozens of numbers in each vector. And king and queen are actually exactly the same on a ton of dimensions, but opposite on one number, because one's male, one's female. And then that will have a lot of other ramifications throughout other literary stuff.
Like, what adjectives do you put with the male version of this vector? So it's regal and powerful and could be ruthless, whereas a queen could be dignified or whatever. Stupid analogy. But when you think about how that applies to humans: what we're terrible at is exact recall. I could tell you a sentence and tell you to repeat it.
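The king/queen point above is the classic word-embedding analogy. A toy illustration with invented three-dimensional vectors (real embeddings have hundreds of dimensions, and the values here are made up):

```python
# Each word is a vector of numbers: identical on most dimensions
# ("royalty", "power"), opposite on one ("gender"). Integer values
# are used so the arithmetic is exact.
king  = [9, 8,  1]   # [royalty, power, gender]
queen = [9, 8, -1]   # same royalty/power, opposite gender
man   = [1, 2,  1]
woman = [1, 2, -1]

# The classic analogy: king - man + woman lands on queen.
result = [k - m + w for k, m, w in zip(king, man, woman)]
print(result)        # [9, 8, -1], i.e. queen's vector
```

The same structure is why related adjectives cluster: words that co-occur with "king" sit near it along the shared dimensions while differing along the gendered one.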
PATRICK O’SHAUGHNESSY: Yeah, it’s like six numbers the average person can remember or something like that, right?
The Challenge of Context and Memory in AI Models
DYLAN PATEL: But you get the gist of the sentence. If I told you a whole paragraph, you'd get the gist of it. And you could repeat the meaning of it to someone. You could translate that meaning. So models are very different, right? Fundamentally, Transformer attention is calculating the attention of everything to each other, and getting the models to actually be able to recall, that's been a training data problem.
But you can get the model to repeat exactly anything you want within its context length. That's needle-in-a-haystack, a benchmark people ran for a while because models had to get good at that. But now models are just amazing at it, right? Tell me blah, blah, blah from a random part of your context.
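The needle-in-a-haystack benchmark mentioned above is simple to sketch: bury one distinctive sentence in long filler, then check recall. Here `retrieve()` is just exact string search standing in for prompting a real model; all names are invented for this sketch.

```python
import random

# Build a long context with one "needle" hidden at a random position.
def build_context(needle, filler_sentences=1000, seed=0):
    rng = random.Random(seed)       # seeded so the run is repeatable
    lines = [f"Filler sentence number {i}." for i in range(filler_sentences)]
    lines.insert(rng.randrange(len(lines)), needle)
    return " ".join(lines)

def retrieve(context, query):
    # A real eval would ask the model to quote the needle back;
    # here we just verify the needle is recoverable from the context.
    return query in context

ctx = build_context("The magic number is 7481.")
print(retrieve(ctx, "The magic number is 7481."))  # True
```

Real versions sweep the needle's depth and the context length, and score whether the model quotes the needle verbatim.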
But what they really suck at is infinite context, because the real word for it is sparse, right? You've taken this entire world and encoded it in such a small amount of data that lives in your brain. It's so sparse, but you understood how to grab the fundamental reason and put it down there. Whereas models haven't been able to create something sparse yet, right? How do you reason over a context of infinity?
And humans, maybe we have a short term memory and a long term memory. I think it’s a lot more blurry than that. There’s no clear line, oh, this is in my short term memory. Oh, this is in my long term memory. It’s much more blurry, but as we go back and back and back, it’s more and more sparse, right?
If we think about, hey, what do you remember as a kid? The craziest thing in psychology, I remember when I learned it, I was like, wait: my memory of what I did as a kid with my dad at this thing is fake. It's me remembering it and inventing the picture, and then me remembering that picture successively. The actual memory of what happened has morphed a little bit over time, because it's so sparse. The way humans collapse information is super, super dense, but we are able to extract all the relevant information out.
Now models, there’s a ton of research going on in this domain of long context, right? How do I get longer and longer context without blowing up my model cost? This is a big challenge with reasoning. This is why we had this HBM bullish pitch for a while, right? It’s like, you need a lot of memory when you extend the context, right? Simple thesis, right?
But the fundamental algorithm needs to change and improve iteratively over time to get to something like this short- and long-term memory. That doesn't necessarily mean the model has to work like we do, right? Why can't the model just reason and have a database that it writes stuff in? Or a Word document that it writes stuff in, and then it takes it out of its context, works some more, and then calls it back: oh yeah, right?
I mean, we do that, right? You and I refer to our notes, we refer to our calendar, we refer to our texts, we refer to the shopping list. Great, I know I need food for dinner. I go to the store, and I need a shopping list, right? Because otherwise I'm going to buy stupid shit, right?
So the model doesn't necessarily have to fundamentally work the same way as humans. But there is that challenge of: how do I train the model to operate over the context length of a human? How do I train it to interact with these databases and these documents that it writes to? Because it's never going to learn that from pre-training; it has to learn that from an environment. And those environments have to be architected in a way where the model knows it can write stuff down and refer back to it.
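The "write stuff down and refer back to it" idea above is what agent frameworks expose as a scratchpad or memory tool. A minimal sketch, assuming a key-value store the model can call into (the class and method names are invented for illustration):

```python
# Sketch of external memory for an agent: notes leave the model's
# context when written, and are recalled on demand instead of being
# held in attention the whole time.
class Scratchpad:
    def __init__(self):
        self.notes = {}

    def write(self, key, text):
        self.notes[key] = text          # drop it out of "context"

    def read(self, key):
        return self.notes.get(key, "")  # recall it when needed

pad = Scratchpad()
pad.write("shopping", "eggs, flour, butter")
# ... the agent works on other things; its context moves on ...
print(pad.read("shopping"))             # "eggs, flour, butter"
```

The hard part, as the transcript notes, isn't the store itself; it's training the model, via environments, to decide what to write down and when to read it back.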
OpenAI’s Deep Research Breakthrough
And so one of the first things OpenAI did was Deep Research. Everything is not in Deep Research's context, right? Deep Research is working for 45 minutes. It's outputting millions and millions of tokens, and it's creating this amazing thing that it wrote. And it's pretty good research. I would say a lot of memos that you read from people are on par with Deep Research, at least from a junior.
How did they do that? They enabled it to write something down elsewhere and have this recall: effectively using language to compress information that it looked at, putting that off to the side, using language to compress other information off to the side, again and again, and then looking at all this compressed information and writing something. That's sort of what Deep Research is.
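The compress-then-compose pattern described above can be sketched in a few lines. Here a crude first-sentence "summary" stands in for a model's summarization call; the function names and sample texts are invented for illustration.

```python
# Deep-Research-style pattern: compress each source with language,
# set the summaries aside, then write only from the compressed notes
# rather than from the full raw texts.
def summarize(text):
    # placeholder for an LLM summarization call
    return text.split(". ")[0] + "."

def deep_research(sources):
    compressed = [summarize(s) for s in sources]   # put each aside
    # final pass reads only the compressed notes, not the raw sources
    return " ".join(compressed)

sources = [
    "GPU demand is rising. Vendors report long lead times this year.",
    "Power is the bottleneck. Several sites are waiting on substations.",
]
print(deep_research(sources))
# GPU demand is rising. Power is the bottleneck.
```

The point of the structure is that the final writing step never needs the full sources in context; language itself is the compression format.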
So how do models get there? I’m not sure. Right. I think it’s a fundamental research challenge. It’s why these companies need millions of GPUs to train on. Not for, oh, I’m going to make a million GPU model, but because I need to try a bajillion different things because I don’t know what will work. And what’s going to work for humans is so different from what works with models. There’s any number of parameters or things you could tweak that could end up changing how it develops. Right. And how good is it at if I do it this way versus that way? Right. That’s the whole point of ML research is you’re constantly trying stuff out and trying to get better and better.
Calibrating Bullishness on AI’s Future
PATRICK O’SHAUGHNESSY: If I add all of this up and hold the mirror up, it seems like I would put you in the category of unbelievably bullish on what these things are going to be able to do in 10 years time or something. Pick your time frame.
DYLAN PATEL: Yeah.
PATRICK O’SHAUGHNESSY: Am I calibrated the right way? Amongst everyone you talk to who you respect and think is—
DYLAN PATEL: I'm much more bearish than a lot of people, actually, which is the crazy thing.
PATRICK O’SHAUGHNESSY: Help me understand that distinction. If you’re—where are you 1 through 10 amongst the people that you respect? 10 being the most bullish. And then, what is the difference between—if you’re not a 10, what’s the difference between you and the person who’s a 10?
DYLAN PATEL: I respect you, but I know I’m way more bullish than you. And I respect Mark Zuckerberg, but I know he might be—he’s probably—maybe. I don’t know if he’s more bullish than me, but I know Sam Altman is definitely way more bullish than me, right? He says we have artificial general intelligence in less than a thousand days. Right?
Or Dario, I respect him immensely, but he's way more bullish than me. Right? My roommates: one of them is an Anthropic ML researcher, and the other is another podcaster, Dwarkesh. They're both way more bullish than I am.
PATRICK O’SHAUGHNESSY: Really?
DYLAN PATEL: Yeah. But even they are not as bullish as some researchers in this field. So it’s like—but then if I go talk to someone I respect, I don’t know, a famous investor, right? You know any of these famous investors? I don’t want to name one because I’m scared, but there’s all these famous investors, right? It’s like, well, no, they’re not—they’re not more bullish than me. And the stuff I’m saying sounds like crazy shit.
PATRICK O'SHAUGHNESSY: Some of it, though, is timeline. I'm actually even more curious about the upper limit, to the extent there is an upper limit.
DYLAN PATEL: I think I’m among the most bullish.
PATRICK O’SHAUGHNESSY: You can get, because that’s what I mean.
The Upper Limit: Digital Gods and Economic Value
DYLAN PATEL: The upper limit of this is that this will just be smarter than humans. I don't think that will happen anytime soon. But even if that doesn't happen anytime soon, there's so much valuable stuff that can be done with these models that economically we will skyrocket. There's so much value that can be created in the world just by, hey, if the models know how to do COBOL-to-C-and-Python migration for mainframes, migrate everything. Migrate everything from mainframes to cloud: think how much more efficient the world becomes.
Or making all these random applications and automated reports, and stopping using Excel as a database: instead you make a real database and manipulate the stuff from Excel. There are all sorts of humongous business efficiencies or automation that could happen without the model ever getting better. We could literally pause it at the six-months-from-now level of how good it is at software development, and it would be a godsend in terms of how much efficiency and value can be created for the economy. And it doesn't ever have to get to digital-God level.
Now, I do believe we're going to get the digital God eventually. Is that 10 years? Is that 5 years? Is that 100 years? Is that a thousand years? I don't know, because there are so many unknown unknowns, like I mentioned, right? These babies are putting their freaking hands in their mouths to calibrate, and then later they put their foot in their mouth. They're like, oh, that's my foot. Oh, here are the senses on it. And then they can pick up stuff in their hands, and they no longer have to put it on the most sensitive part of their body, because they know what it is. Or they're like, oh, this is a speck on the ground. What is it? It's not food. But now I know what it feels like in my hands, and I've calibrated, right?
The models have not gotten there yet, right? They have no idea how to do this. And digital God: well, one, I kind of believe in embodiment. You need a non-digital God. You need a physical one, with the capability of touch and feel and all that, to truly have an experience like humans and be smarter than us in every way. But that's so far away.
The Robotics Challenge: From Wine Glasses to Warehouse Floors
PATRICK O’SHAUGHNESSY: What do you think about what Physical Intelligence is doing? Attacking the, whatever you want to call it, large movement model or large robot model or something.
DYLAN PATEL: What they're actually doing today is like, holy shit, it's so simple to a human. But to models, picking this up is freaking hard. How much do I squeeze my pinky versus this finger versus this finger versus this finger? I don't know. But you pick up a glass of water and you tilt it, and that's impossible for a model today. And think about the level of dexterity: if it was a wine glass and I was swishing it, think about how simple that is. You don't even think about it.
But you instinctually pick up a wine glass and you swish it and it lets the aroma out and you smell it, but it’s like, oh, that little swish is so much tactile feedback and movement and it’s like these models can’t do that shit yet. Nowhere close.
So, I mean, yes. But it doesn't need to be that good. It doesn't need to be able to swish a wine glass, not break it, put it back down, and tilt it perfectly without spilling. It doesn't need to do any of that to be tremendously valuable. What it needs to do to be tremendously valuable is: pick this up and put it down here, after knowing what it is. So there's so much value that can be created just by being really good at getting data.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: Yes. I think the robotics world is huge. I think we're warming up; we haven't even left the dugout. Right. We're nowhere close to the scaling on robotics. The data flywheel needs to get going there.
The Talent Wars and Automating Research
PATRICK O'SHAUGHNESSY: One of the most interesting subplots of this whole world is the talent wars. And a cool idea is that as these things get better, maybe we begin to automate some of the research function that people formerly would have played. Do you see a world where we're squeezing down to a smaller and smaller number of people that really matter, that will have all the impact on where we go in terms of net new research? And does that mean that all this crazy spending that's happening at Meta or elsewhere makes a lot of sense, that maybe even those numbers should be higher, or something like this?
The Value of Top AI Talent
DYLAN PATEL: I think it's tremendously hilarious that people are like, oh my God, this person's getting paid a billion dollars, it's infeasible, how could this person possibly be worth that much?
Well, they're running the experiments on chips that cost $100 billion. Think about every wasted experiment they do: if a third of the compute gets wasted because it went to an idea that was already tried, their ideas and their impact matter enormously. There's so much wasted compute. I call it wasted; really it's trying stuff and failing. But none of us know what to try and what not to try, and these things are so complicated.
There's a group of people just trying different stuff on the existing data. How do you mix it? What order do you feed it into the model? How do you filter it? What's the architecture? There are different people working on long context, different people working on every single aspect of the model. If they come up with the idea that's 5% more efficient, well, fantastic. I just saved not only 5% of my compute time, my training time, I also saved 5% across my entire inference fleet.
And then I do it again and again and again, because we're so far away from these models being anywhere near as efficient as a human brain, and we know they can at least get as efficient as us. Maybe the compute substrate isn't the same, but whatever, right?
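The "again and again" point is an arithmetic one: efficiency wins compound. A quick illustration (the numbers are illustrative, not from the interview):

```python
# Each 5%-more-efficient idea multiplies compute cost by 0.95.
# Ten such wins compound to nearly half the original bill.
compute_cost = 1.0
for _ in range(10):
    compute_cost *= 0.95        # one 5%-more-efficient idea

savings = 1.0 - compute_cost
print(f"after 10 wins: {compute_cost:.3f}x cost, {savings:.1%} saved")
# after 10 wins: 0.599x cost, 40.1% saved
```

And because the same model architecture serves the inference fleet, each win is collected twice: once in training, then continuously at serving time.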
Adding more people to the problem doesn’t make it faster. Right? Because there’s so many things you’re trying. You run these experiments, you learn something and then you implement it. You tweak the knobs in these ways in a hundred different ways, and then you see the trend line and you’re like, oh, so actually I should tweak it this way. Let’s implement that. Right?
There's so much gut feel, so much reading data, understanding, implementing, reimplementing, learning by doing, that if you add people, you're going to slow it down in a sense. A lot of Meta's problems before they did the superintelligence thing was that they just had too many people who weren't led by amazing leadership. And they had a lot of failed experiments and wasted time doing things that didn't matter.
There's a tweet from one of my friends at OpenAI. He's pretty famous on Twitter. His name's Roon. He's like, I get visibly, viscerally angry every time I think about how many H100s Meta's wasting. It's such a funny tweet, because, well, yeah, they were wasting a ton of compute. Maybe they still are, and everyone's wasting compute, right? OpenAI is wasting tons of compute, because what's the Pareto-optimal model architecture? Who knows?
The Global Talent War
PATRICK O'SHAUGHNESSY: Another thing I saw Roon say recently, which was so interesting, was: why don't we just go make even more ridiculous offers to the people in other countries who have process knowledge for things we want here in the US? If we're getting pretty good at the Arizona fab that we've built, and we think we can extract the process knowledge from the people, why don't we go acqui-hire all the best people in Shenzhen, or all the best people in other places in the world?
Do you think it starts to escalate to that level? So much is dependent on the process knowledge of a relatively small group of people. And the talent war shouldn't actually be Meta versus OpenAI. It should be the US, maybe through Meta and OpenAI, going after people from all over the world. Do you think it starts to get that extreme? And should it?
DYLAN PATEL: That's almost a function of why Intel has fallen off a lot, right? You have all these geniuses in nanochemistry, and PhDs in all these random things, whether it be chemistry or physics, all these incredibly smart people. But there's a whole class of incredibly smart people who never went that way, because they're like, oh, those guys are making $200,000, why would I do that? I'm going to go to Google and make $800,000. And now I'm going to go to OpenAI and make $10 million. Or now I'm going to go to Meta and make $100 million. Right?
Any smart 18-year-old is going to be like, fuck that, I'm doing this, right? Why do the smartest doctors, and I don't mean the smartest doctors in a general sense, but there's a really smart population of doctors, want to be dermatologists and anesthesiologists? Is that the most valuable thing for them to do? No, but those are the two professions that give you good working hours and great pay.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: Not to say that the general doctor is not as smart as them. But if you took the population of general family doctors, just random doctors, and you took the population of dermatologists newly out of school, the ones becoming dermatologists and anesthesiologists are way smarter, or at least scored better, and were able to get into the coveted field.
And so yeah, the talent war. Truly, we’ve been through this process where it’s always been human capital and capital goods, those two vying with each other. For a long time, with mechanization and industrialization, the value of human capital decreased as industrial capital increased. That got to a point, especially in the 70s, where it really started to tank as the ability to globalize took hold. All these things started to really hit the US, and that’s why you have a lot of the population-level dynamics and income inequality that we have today. That is very bad for the psyche of the US and the stability of it.
But then you have now, we’re in such an age of, well, actually manufacturing things is pretty commodity. Most of the value doesn’t come from the manufacturing of it. It comes from the creation of the idea.
One thing Jensen told me, which I thought was amazing: he’s like, Dylan, people have it all wrong about why America is rich. The reason we’re rich is because we’ve exported all the labor but kept all the value. And that’s what Nvidia does. They’ve exported the labor of making their chips. And Apple, everyone; it’s all done in Asia. And those companies make money.
PATRICK O’SHAUGHNESSY: Not as much money as Nvidia and Apple, right?
DYLAN PATEL: All the gross profits are going to them and then they’re either reinvesting it or buying back stock or whatever. However they allocate the capital is a different concern. If, as you said, the process knowledge is so valuable, why aren’t we doing this? That’s a great idea.
PATRICK O’SHAUGHNESSY: Rune’s idea, not mine.
The Challenge of Identifying Talent
DYLAN PATEL: Yeah, I mean, I think the challenge is how to choose people. Really difficult. For some roles, someone who can talk the talk, people just automatically assume they’re great because they can talk the talk. But do you know how many people suck at talking and are really freaking good at doing? But then you don’t know, right? Because there are people who talk about being able to do better than the person who’s actually doing. And the tests are never as good, right?
Work trials, how do you select? And this was a big challenge for Meta. Some of the criticism is that they didn’t get all of the best people, that they actually got a lot of bad people. That’s the cope from OpenAI and Anthropic. These companies are like, no, no, no, they didn’t get our best people. That’s what Sam said, right? He’s like, they didn’t get our best people, it’s okay. Meanwhile, he did have to do counteroffers internally, right? So it’s like, you know.
So as far as the process knowledge, I think ML researchers are an extreme example of how much value one person can add. But my favorite analogy that I came up with recently is that ML research is exactly like semiconductor manufacturing. There are a ton of jobs in semiconductor manufacturing that don’t exist in ML research, but both are a ton of tuning a thousand different knobs, right?
Oh, you put the wafer in this tool, you change the pressure of the chamber when you’re doing the deposition, or you change the mix of the chemicals flowing in, or which chemicals you’re putting in, or what speed you do it at. Do you do it for 30 minutes? 31 minutes? And obviously it’s way more granular than that. There are so many knobs on every single tool.
PATRICK O’SHAUGHNESSY: And you have a thousand input and process knobs, right?
DYLAN PATEL: Process knobs on each tool, plus the sequence of them all. So you frankly cannot test everything. It’s impossible; it’s too large a search space. Just like designing a chip is too large a search space. You have 100 trillion transistors. How can you possibly try every single thing? Impossible, right?
You just have to have enough intuition. Pick that point, pick that point, pick that point. See the data. Oh, okay, I think the answer is here. And then just YOLO, right? And obviously once you think the answer is here, you test here, and you’re like, okay, here. But a different person might have seen those three points and said, okay, the answer is actually here, not here. And the data is fuzzy; it’s somewhere in the middle. But this is the whole idea of ML research: you spend a lot of training compute doing what are effectively useless things, besides teaching yourself what’s the right thing to do and what’s the wrong thing to do.
And semiconductor manufacturing is the same way. And actually all process manufacturing is the same way. If you’re iterating super fast and you’re trying to get better and better and better, or you’re optimizing a process on a chemistry or whatever it is, you try, you fail, you learn, you do.
In semiconductor manufacturing, maybe it’s just running tens of thousands of wafers. And so your R&D cost at an Intel, the cost of the main fab that is running the R&D, is very, very high, and it’s producing zero economic value besides teaching you how to do the next node, which you can then deploy at volume. And that is what actually makes the money.
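The loop described above, probe a few scattered points, read the noisy data, commit to a region, then refine, is essentially coarse random search followed by local hill climbing over a space too large to enumerate. A toy sketch in Python; the two-knob objective and every number in it are invented for illustration:

```python
import random

# Hypothetical two-knob process objective (say, chamber pressure and
# deposition minutes). The real thing is noisy, expensive, and has thousands
# of dimensions, so exhaustive search is impossible; these numbers are made up.
def process_quality(pressure, minutes):
    return -((pressure - 2.7) ** 2 + (minutes - 31.0) ** 2)

random.seed(0)

# Phase 1: probe a handful of scattered points ("pick that point, pick that point").
probes = [(random.uniform(0, 10), random.uniform(0, 60)) for _ in range(8)]
best = max(probes, key=lambda p: process_quality(*p))

# Phase 2: commit to the most promising region and refine locally
# ("okay, I think the answer is here").
for _ in range(200):
    candidate = (best[0] + random.gauss(0, 0.3), best[1] + random.gauss(0, 1.0))
    if process_quality(*candidate) > process_quality(*best):
        best = candidate

print(best)  # should land near the true optimum (2.7, 31.0)
```

The point of the sketch is the shape of the search, not the numbers: nearly every evaluation is "wasted" except for what it teaches you about where the optimum lives, which is exactly the R&D-wafer dynamic described above.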
Power Dynamics in the AI Ecosystem
PATRICK O’SHAUGHNESSY: I want to go back to where, all the way to where we started and ask about what I’ll call the wellspring or the fountain of power in this whole ecosystem. So I want to understand how you think about who has the power and how to keep or generate power as a business. I mean, because it seems like talent, maybe talent is like the very beginning of the chain and he who has the talent, on a long enough timeline, has the power or something like that.
But also there’s structural stuff like just the scale, the industrial scale of some of these things. It just takes forever to build or whatever. How do you think about even smaller zoomed in examples like, okay, Cursor is unbelievably popular. The revenue is insane. So much of it goes back to Anthropic. Who is the power in that relationship? How does that dynamic change over time?
It just seems like the power dynamics are so, so fascinating in this world. And I’m curious where you think it comes from in the first place. Where it exists today and where it will go in the future.
DYLAN PATEL: When we think about the power structures, right, you mentioned a really interesting one. Does Anthropic hold all the cards in this Cursor relationship? Cursor has nearly a billion dollars of revenue now if you do current month times 12. Yeah, that’s a ton. But again, their margins are what they are and they’re sending most of it back to Anthropic. Some people say their margins may be negative. I think they’re slightly positive. But regardless, they’re sending most of it back to Anthropic.
PATRICK O’SHAUGHNESSY: The gross profit dollars are at Anthropic right now.
DYLAN PATEL: But then Anthropic is taking all the gross profit dollars and putting them into compute.
PATRICK O’SHAUGHNESSY: Yep.
DYLAN PATEL: For training.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: So then all those gross profit dollars.
PATRICK O’SHAUGHNESSY: Are going to Jensen laughing hilariously.
The Power Dynamics of AI
DYLAN PATEL: Well, maybe Jensen, or maybe Amazon, who’s then sending it to Google, who’s sending it to Broadcom. The gross profit dollars are going to the hardware layer from all of this, for sure. But does Anthropic have all the power? The common view from a lot of people is yes.
But then it’s like, well, Anthropic only makes the model that generates the code. There’s a lot more in the system, right? Cursor gets all of the data, all of the users, how they interact with it. Anthropic doesn’t get that. They get a prompt, they send a response. Prompt, response.
Now Anthropic has Claude Code, which is taking share and is very different from Cursor. But anyway, they get prompt, response, and then Cursor is like: oh well, I’m training embedding models on your code database, and there are actually multiple models that I’ve made. I’ve made the embedding model, I’ve made the autocomplete model, and I can switch the Anthropic model to an OpenAI model whenever I want to. I’m only using the Anthropic model because it’s the best one.
Oh, and because I have all this data, maybe I can train a model, not better than you for everything, but better than you for this segment. So the power dynamics are weird. They’re frenemies, right? Everyone’s a frenemy.
Same as OpenAI and Microsoft. The craziest power dynamic going on in the world, where they signed an MoU that said they had an understanding of what the deal would actually be for converting to for-profit. What is going on here? It sounds like the most non-announcement announcement ever. The power dynamics of it all. It’s the most fascinating soap opera ever, right?
One of my friends was telling me about K Pop Demon Hunters. I don’t know if you’ve heard of this.
PATRICK O’SHAUGHNESSY: I have a nine year old daughter, so it’s all I hear about.
DYLAN PATEL: You’ve seen it a lot. I just heard about it, and they’re like, oh, let’s watch it. I’m like, what? Whatever. There’s drama there, but this real-world power drama is way cooler. At least for you and I.
PATRICK O’SHAUGHNESSY: Which parts of the drama interest you personally the most? Like what? Where do you think the stakes are the highest in the various subplots?
The Microsoft-OpenAI Dynamic
DYLAN PATEL: The Microsoft-OpenAI one is absurdly interesting, because at one point, like 2023, it was like, Microsoft’s going to own the world. Yeah, right? 2024, a lot of it, too. And then H2 2024, Microsoft backed down a lot. They pulled back because Amy Hood and whoever else at Microsoft, maybe Satya, whoever, were like, maybe we don’t need to be on the hook for $300 billion. We’re not going to build out $300 billion worth of compute for OpenAI. They can’t pay for it, right? At least that had to go through their heads when they cut back.
And so they paused a bunch of data centers, and they said, oh, you know, we don’t need to be the exclusive compute provider, you can go to Oracle, it’s fine. And they relinquished this power, right? Now Oracle has that deal, but then OpenAI sends like 20% of their revenue to Microsoft, or API revenue or something like that.
And then Microsoft has this 49% capped-profit structure on OpenAI. And then there’s this whole IP-sharing deal. It’s really hard to understand the mechanics of the OpenAI-Microsoft deal, even. So you have this whole power dynamic, and they’re trying to renegotiate it.
OpenAI doesn’t want that. And part of the deal is: once we have AGI, you no longer have API rights or IP rights. And it’s like, what does that even mean? If you asked someone 20 years ago and put them in front of ChatGPT, they’d say this is AGI. It knows everything and it can have a conversation. I can’t tell it’s not a human. Actually, I can tell, it’s way smarter than a human. Yeah.
But now it’s like, ah, whatever, it can’t do XYZ. The bar always moves, no matter what the level of intelligence is. And for me, it’s not going to stop until the thing basically puts its hand in its mouth and says, yeah, this is me, I’m a human. That’s the sentience, the consciousness of it all, right? That’s one power dynamic that’s crazy, what’s going on there.
Nvidia’s Dominance and Strategy
Another power dynamic is the one around Nvidia and the hyperscalers. Nvidia is the king. Pretty much all of the gross profit is going to them today. Sure, TSMC makes some; sure, SK Hynix makes some, but they have to invest a ton in capex. Sure, Broadcom makes a ton of gross profit off of these companies.
But Nvidia makes by far the most gross profit in the industry, and it’s not even close. So going back to the analogy: they’re king, they want to continue to be king, and they want to make sure GPUs continue to be the most used. But also, they can’t buy anything. They can’t buy any companies. They weren’t even allowed to buy ARM when they were, in the grand scheme of things, pretty much a nobody.
And they weren’t allowed to buy ARM in 2020 or 2021, whatever the time frame was. They totally could not buy any major companies. They’ll buy smart startups; they recently bought a startup that I was a seed investor and an advisor in. But they can’t buy a real company, so what do they do with all this cash flow? And sorry, but you’re a loser if you just do buybacks.
PATRICK O’SHAUGHNESSY: Yep.
DYLAN PATEL: That’s admitting you can’t get higher returns on your capital. Which is fine for mature companies; Meta and Apple were mature companies for a while. Guess what? Those companies are not going to do buybacks, not never again, but for a while, because they think there’s better ROI for their capital now.
And Nvidia like, you know, if you look at Jensen, he’s like, he’s always flirted with buybacks but mostly he’s been like reinvesting in the business.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: But you can’t reinvest that much into the business. Yeah.
PATRICK O’SHAUGHNESSY: So now he’s doing demand guarantees. He’s doing all this crazy stuff now.
Using the Balance Sheet as a Weapon
DYLAN PATEL: Yeah, right. He’s using his balance sheet to try and win more. Which is an interesting dynamic. I don’t know if there’s ever been anything like this, in terms of the non-anticompetitive nature of it, where you backstop clusters. CoreWeave recently got a deal with Nvidia where Nvidia backstopped a cluster. CoreWeave would never have built this cluster otherwise, because it’s for short-term demand.
And renting GPUs short term is a terrible business model, right? You want to do long-term contracts, and you want to do long-term contracts with people with balance sheets; that’s the golden goose. But that doesn’t exist so much. So you do long-term contracts with people who don’t have a balance sheet, like OpenAI. And if you can’t do that, then you’ll do short contracts with people who do have a balance sheet. Right?
There’s this whole matrix of who you rent GPUs to. But from Nvidia’s perspective, you know what they really love? When venture capitalists fund a company and then 70% of the round is spent on compute. They love that. And that’s what’s happening with all these companies. Whether it’s Physical Intelligence, who are spending a lot on robot arms and stuff too, but also spending a lot on compute, or any other startup that’s racing Cursor or whoever, right?
And even if it’s not directly, it’s indirectly going to GPUs. They love when people spend their entire round on GPUs. What would be really good for Nvidia is if it wasn’t a two-year or three-year deal for that compute. If it was: oh yeah, you can spend 70% of your round on one training run. You start a company with these ideas, gather the data, do the training run, and then you have a product, and you show how good the model is and you try to raise again.
That’s what would be really great for Nvidia. But no one wants to build a cluster predicated on that as a business model. That’s crazy. So Nvidia has to backstop a cluster to do that. Or, hey, OpenAI might go to their own chip. They might go to some ASIC from another company. They might even buy TPUs. They might even go to Amazon, right? They don’t really care. They’re not beholden to Microsoft anymore.
PATRICK O’SHAUGHNESSY: Or serve a product to a customer. Yeah.
DYLAN PATEL: And they want to build digital God and they want to serve a product, make revenue. So they don’t have to go to Nvidia. Nvidia is the best option, but you know what would be really, really helpful? Going back to the earlier part of this discussion: if for the first year I get the compute up front and I don’t have to pay for it. Like I was mentioning, you know, the $10 billion for the.
So it’d be really good if I could do that, because then for a full year I can do training, I can subsidize inference, I can do all these things that build up a user base, and then I can actually pay for it. I have a year of a gigawatt to figure out a business model, right?
Whether that’s serving free tokens and then implementing purchasing for the free user, where a lot of that is almost no fee initially and then slowly raising the fee over time. Or it’s, hey, I have to serve this model at worse gross margins or negative gross margins initially, but eventually I can serve it at positive gross margins because the models keep getting cheaper. Or it’s, I train the next-generation model that’s so much better than everyone else’s, and then I’ll win all the business for that level of intelligence, because I’m the only one with an 18-year-old and you guys all have 14-year-olds working for you.
So they can do whatever they want with this allocation. It’s not an allocation of capital per se, it’s an allocation of compute. They get to decide what they allocate that compute to, and Nvidia is helping them by effectively front-loading it, if they can find the capital. And then it’s like, oh yeah, Nvidia’s backing this too. It’s much more reasonable for someone to say, yeah, I’ll pay the capex, because the first year is already going to be paid thanks to that investment from Nvidia. What about the next four years?
The Risk of Overbuilding
PATRICK O’SHAUGHNESSY: If you ask a bunch of investors who are students of economic cycles through history, like Carlota Perez type stuff, they’ll say that the concern is that every shortage is followed by a glut and we always overbuild on long lead time, big capex projects and you’ve got multi gigawatt, you know, power being installed, you’ve got all this crazy stuff in semiconductors and at some point, like it just gets overbuilt.
DYLAN PATEL: All the stuff we talked about earlier.
PATRICK O’SHAUGHNESSY: Feels like we’re not really close to that. Like, there’s so much demand.
DYLAN PATEL: If the models don’t improve, yes, we will overbuild. It’s pretty simple. Yes, there will be supply chain things, switches from one supplier to another, and that’s a lot of the nitty-gritty stuff we focus on.
At the end of the day, if the models don’t improve, we’re absolutely screwed. And in fact, if this lasts another year and then it happens, the US economy will go into recession, straight up, because of this. And probably Taiwan as well, and probably Korea as well, because there’s so much buildup and revenue flowing through to us for this.
But you know, when you look at these other things, like the bubbles of the past, some of them were just silly nonsense, right? Like tulips. Silly nonsense, Right. Crypto. Complete Ponzi scheme. Right. But then there’s other stuff that’s like, no, this was real, right? Like the UK spent like some absurd percentage of their GDP on railroads for like a decade.
PATRICK O’SHAUGHNESSY: 6% or something crazy.
The Scale of AI Investment and Historical Parallels
DYLAN PATEL: Yeah, we’re nowhere close to 6% of our GDP. Like, holy shit. But that was tangible. And yes, we did overbuild, because how many goods are there to transport? But also, you must build these railroads to reduce the cost of transport so much, because you have no clue when the demand stops and you’ve overbuilt. And because there are 10 people trying to do it at once, you’re obviously going to overbuild at some point. Same thing with fiber.
A lot of the argument against this is: well, no, this time it’s the strongest balance sheets in the world, the world’s most profitable companies, and they can all pull the plug at any point. Yeah, Microsoft pulled the plug at one point before. Then they’re like, oh shit, no, no, plug it back in. They recently plugged it back in. They’re like, oh wait, we’re restarting this, we’re going out into the market, we’re signing deals with Nebius for GPUs. I don’t remember exactly how big the deal was; it’s like $19 billion for Nebius.
So if they had just not pulled the plug on their data centers, they wouldn’t have had to do that, and they wouldn’t have to pay those gross profit dollars to Nebius. But Nebius made the bet that the demand is there, and they were right. And so when you think about this, it’s: what is the level of demand where it stops, if scaling laws continue? Of course there’s an adoption curve, there’s pace, there are realities with capital and supply chains. Things take time.
But if you boil it down, your demand for 30-year-old senior engineers at Google who know how to make and program anything is effectively, I don’t want to say infinite, but it’s $2 trillion of value. If I could have an intelligence as smart as a Google senior engineer, that’s $2 trillion of software value, because that’s roughly how much the world pays software engineers today.
And you just go down the list of every other use case. If you have a simple physical-intelligence robot that can recognize a headphone versus a water bottle versus a phone, pick up the right thing, manipulate it properly, put it in the right spot and sort it, what’s that worth to the distribution supply chain? I don’t know, but a lot.
So we don’t need to get digital God for there to be immense value. But the interesting thing here is the human capital versus capital goods dynamic. All of these other revolutions have been capital goods that reduce the amount of human capital you need, whereas this is just creating human capital. In a sense, right?
So I sort of get everyone bulled up, right? And we’re on this podcast. I don’t know if you’ve heard of the curse: if you talk about a stock on this podcast, it goes down, right?
PATRICK O’SHAUGHNESSY: I’ve heard word of it.
DYLAN PATEL: We’re popping the bubble right now because the limit of AI is infinite.
PATRICK O’SHAUGHNESSY: For the record, we went and did the math one time because I was sick of hearing about this curse. And it’s just market performance. It’s not.
DYLAN PATEL: Oh really? Last time it wasn’t this podcast, it was your other podcast. I talked about Applied Materials and the stock was up like 70% in the six months after.
PATRICK O’SHAUGHNESSY: There you go.
DYLAN PATEL: Yeah, I broke the curse. I was like, hell yeah.
The Middle Layer: Neo Clouds and Inference Providers
PATRICK O’SHAUGHNESSY: What do you think about all the companies in the middle? We talked a lot about Nvidia, and then the people at the end serving applications. What about these companies like Together and Baseten and Fireworks, and you mentioned Nebius, all these interesting middle-layer players? Are there amazing businesses to be built there, do you think? Or are they temporary patchwork to make the system work and serve demand? What do you think about this middle layer?
DYLAN PATEL: The cloud business model, or let’s say the neo cloud business model. You mentioned inference providers and neo clouds. The neo cloud business model is absolutely amazing or terrible depending on how you do it. It’s terrible if you sign short-term contracts and just hope and pray you have short-term contracts forever. Although initially your short-term contracts have amazing cash flows, because of what you’re selling at versus what you paid for the GPU, the data center, the power, and all that.
The cost per hour over a six-year period for Blackwell is $2. Let’s just call it $2 for simplicity’s sake; it’s not exactly that. But if I sold it for six months, I could get north of $3.50 or $4. Holy shit, that margin’s insane. But what happens two years from now, three years from now, when I’m still selling six-month or one-month contracts and the next generation of Nvidia chip is out and it’s 10x faster for 3x the cost? Naturally, the price of this should tank.
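Dylan’s napkin math here can be made explicit. A sketch using only his illustrative numbers (a roughly $2/hour all-in cost for a Blackwell GPU amortized over six years, a $3.50 short-term price, and a hypothetical next-gen chip at 10x the speed for 3x the cost):

```python
# Illustrative numbers from the conversation, not real pricing.
cost_per_hour = 2.00     # all-in cost: GPU capex, data center, power, amortized over 6 years
short_term_price = 3.50  # what a neo cloud can charge on a short-term contract today

gross_margin = (short_term_price - cost_per_hour) / short_term_price
print(f"{gross_margin:.0%}")  # 43%, great while it lasts

# The risk: a next-gen chip that is 10x faster for 3x the cost is about 3.3x
# better on price/performance, so the market-clearing price for old-GPU
# compute should fall roughly in proportion.
implied_price = short_term_price * (3 / 10)
print(f"${implied_price:.2f}/hr")  # $1.05, below the $2.00 cost
```

On these assumptions, the same GPU flips from a roughly 43% gross margin to renting below cost once the next generation lands, which is the case for long-term contracts.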
The other way to do it is: hey, I actually have a long-term contract. I’m selling to OpenAI, I’m selling to Microsoft. The other end of the spectrum is what Nebius just signed: $19 billion with Microsoft.
PATRICK O’SHAUGHNESSY: They will pay me no matter what.
DYLAN PATEL: Yeah. The market literally believes Microsoft will pay its obligations before the US government will, because Microsoft literally has a cheaper bond rate. Which is insane to me, but whatever. This $19 billion has a huge gross profit, because of the price per hour; it’s not exactly $3, and the cost isn’t exactly $2, but the margins here are really good. Nebius is going to make at least $6 billion of gross profit off of this, and then obviously they have their operational costs, but $6 billion of gross profit off of this deal is insane.
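A quick sanity check on the Nebius figures as Dylan states them; these are his round numbers, not reported financials:

```python
# Dylan's round numbers for the Nebius-Microsoft deal, not reported financials.
contract_value = 19e9  # total contracted revenue, dollars
gross_profit = 6e9     # his "at least" gross profit estimate

gross_margin = gross_profit / contract_value
print(f"{gross_margin:.0%}")  # prints "32%"
```

A roughly 32% gross margin on a contract backed by a counterparty the market treats as safer than Treasuries is why he says he "would do that all day."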
So I would do that all day. And CoreWeave did, until Microsoft stopped going to CoreWeave. But CoreWeave turned around and found other customers, selling to Google and selling to OpenAI. But OpenAI, you can’t rely on their balance sheet. Okay, I still have amazing margins when I sell to OpenAI, but they don’t have a balance sheet, so how can I be sure they’re actually going to pay for the thing they’ve signed up for?
So I know in theory this contract is worth a ton of money. And CoreWeave’s book today, all the contracts they’ve signed, is mostly Microsoft, mostly money in the bank. But the OpenAI contracts, what if they can’t afford to pay for this? Now there’s a bigger risk, and there’s a longer and longer tail of these businesses.
So yeah, you absolutely can make a ton of money. There have been more recent deals with crypto miners, from Google and Fluidstack, because Google’s really short on data center capacity. People want to use more TPUs, and Google can’t serve them all themselves. So they’re going to sell TPU systems to providers. They’re backstopping the deals; TeraWulf is one of the companies. I can’t remember the other one, but there are two companies they’ve signed deals with where Google is backstopping the data center, plus selling the TPUs physically to another company. Then the TPUs get deployed, they get rented, and Google still makes all the money. That’s great as well.
But then there’s a long tail of: is the enterprise demand there? Who’s taking the risk? OpenAI is taking the risk, because they’re betting their entire company; they could go bankrupt if the demand doesn’t come. Oracle is taking a risk because they’re signing up for $300 billion of contract, $200 billion of hardware spend across data centers and chips, and for that they’re going to have to go get debt. So they’re on the hook, and they’ll probably be able to pay for it if it happens, but their EV will tank if OpenAI can’t pay for all the hardware that they bought. And luckily for them, it phases in over time.
But then you go to the inference providers and it’s like, yeah, there’s a business to be made here too, right? I’m serving models. Maybe Roblox comes to me and they want to put an LLM in their game, right? Because of XYZ reason. Okay. Roblox is a good customer. Like, hey, you know, another company like Shopify wants to put an LLM for customer service. And you know, yes, they could do it themselves, but actually inference is a hard thing, especially as you get to larger and larger models and more complicated models and all the things.
Or there are all these different use cases where people want to serve models. Maybe it’s just open-source models, and maybe it’s fine-tuning of those open-source models, which those companies can help you do, or you can do, and they can serve for you with scalable, reliable capacity. There are businesses to be made here. But there’s also: YOLO, I’m selling tokens to random people who are trying to build SaaS apps in SF, and maybe they run out of runway. And okay, that funding doesn’t directly go to Nvidia, but you go through some steps and it’s going to Nvidia after some value chain, and Nvidia is holding no risk. Everyone in the middle has got a lot of risk.
AI Applications: Beyond Skeuomorphic Design
PATRICK O’SHAUGHNESSY: I’d love to hear your thoughts, going back to the other side of the equation, the app side, on the stuff we’re going to use these models to do, and the significance of this switch from deterministic code to a much different thing. It seems like what we’re doing is the thing we always do. Apple used to call this the skeuomorphic era, where you basically use the new technology to do the old thing you used to do. So we’re making engineers better; that would be an obvious current example. But it seems like we haven’t yet gotten into the world where we start using this technology to do things that we couldn’t do before with deterministic code. I’m curious how you think about that side of pushing the envelope.
DYLAN PATEL: Why is that? I feel like that’s exactly what we do with it. The cost to develop things is so high that you can’t do it, or the cost to have someone go buy stuff for you. Okay, great, you might have an executive assistant you can tell to do this, but the vast majority of people don’t. And now GPT is on the cusp of doing that: go do this, go buy this for me, and it’ll find the best thing and buy it. And you just trust it enough, right? It takes time to trust it.
But these things are proliferating across tech. Tech is the most deflationary thing in the world ever, in terms of quality of life. It gets cheaper way faster than the revenues go up, but the revenues still go up. That’s sort of the fundamental basis of semiconductors, of tech, everything, right?
Are we doing things that we can’t do before with tech? With AI? Sure. I mean, the COVID vaccine was created with AI. It was AI drug discovery. There’s entire briefs about how it was done with AI. And guess what, if another pandemic happened, I bet it’d be even faster to discover the vaccine if there’s a vaccine for it or whatever, right? There’s all these protein folding things, there’s all these optimization things. There’s AI for material science and AI for all these other aspects of society. There’s optimization. Maybe it’s not in your face, right? It’s like, oh my God, I just made this drug, right? It’s like, no, I mean, I worked with the researchers who made the COVID vaccine and so we didn’t have to all be stuck inside forever or whatever, right?
PATRICK O’SHAUGHNESSY: Point being, it’s already happening.
DYLAN PATEL: And the whole “use the new thing to make the old thing faster.” It’s like, sure. But if I go back three years, how many people would it have taken to deploy an image recognition model that looks at every data center in the world and looks at what’s the pace of construction and what equipment they have?
PATRICK O’SHAUGHNESSY: And assuming this is something you—
DYLAN PATEL: This is something we do, right? It’s like how many people would that have taken? I don’t think it would have been possible. My business model—this is the second highest revenue product for us—would not have been possible if it weren’t for AI, like vibe coding, being able to dig through permits and regulatory filings, being able to run image recognition on satellite photos. This would not be possible. This business is not possible without AI.
Am I using it directly? Sure, I’m scraping through the regulatory filings and permits with LLMs and then manually reviewing it with people. Or doing the same with the satellite images. There’s a lot of stuff that the image recognition model does. We just also look at them a lot. And then it’s compiling it all and selling a spreadsheet where you get bi-weekly reports on all the data centers and what’s changed. Or like, “Hey, actually this Amazon data center, the fans are starting to spin. So actually there’s revenue coming from this Amazon data center.” So we can forecast Amazon’s revenue, right?
I don’t think this would have been possible just a few years ago, at least not like it is right now. And especially, there’s demand for it because everyone wants to track this and it’s so important—it begets each other. And in my daily life, I don’t think I could have taken that step from where I was, a business which was still a research provider. That is a monumental jump. And being able to do it with three people out of the gate versus like 50 or 100—I don’t know how many people it would have taken, but I don’t think it’s possible.
Mainframe migration is something people have always wanted to do. Amazon leaving Oracle took 20 years, right? And they wanted to do it 20 years ago. And after EC2, their next four highest revenue products at AWS were database products. And yet they still used Oracle’s database, because it’s hard. Now mainframe migration can be way faster. Or migration from one tech stack to another can be way faster. You can make your business more efficient, you add more automation.
Yes, the tech exists. Go to all the businesses around the world and it’s like they aren’t using the leading edge of what they could. They aren’t using what a 2020 company could have done without AI, right? No one is doing that. And if they did, they’d be so much more efficient, right? But all of these things just take too long to build. They’re too expensive to build. You have your existing processes. How do you hand them over? How do you switch them over? How do you teach people to do this? AIs can help you with all of this, right? So it’s sort of like you can take the pessimistic view of like, “Oh, we’re just doing the same things.” But it’s like the value here is humongous.
The Power Challenge
PATRICK O’SHAUGHNESSY: If it’s tokens on one end, we haven’t talked much about watts at the very beginning and power. What are your thoughts on what is going on here and how humanity is responding to this crazy new demand for just raw power?
DYLAN PATEL: The first approximation is that we’re being a bunch of pansies and it’s not that much power yet, right? Data centers are like 3 to 4% of US power—not of the economy, just power, period. Of that, like two points is regular data centers and two is AI data centers. That’s nothing, dude. That’s literally nothing.
It’s just we haven’t built power in like 40 years, right? Or we’ve transitioned from coal to natural gas more and more over 40 years. So mostly we just don’t know how to. And there’s these regulations, and there’s not enough labor, and the supply chains for turbines—the dual combined-cycle gas turbines—are not there yet. And same for Mitsubishi. This random UV curing process for transformer coils—there’s only this much capacity and it takes two years to build them. It’s just a supply chain thing. It’s a lack of labor thing. It’s not that it’s actually that much yet.
But at the end of the day, it’s like, “Okay, wait, you’re telling me OpenAI is making a data center with 2 gigawatts and that’s like the entirety of the power consumption of Philadelphia?” That’s insane. That’s insane, right? But we used to get excited about finding a couple hundred megawatts of new data center. Now it’s like, if it’s not a gigawatt—I remember the guy who leads that team, he was like, “Oh, it’s just 500 megawatts. Whatever.” And I agreed immediately. Then afterwards I was like, “Wait a second, dude, that’s a lot of power.” Wait, 500 megawatts is $25 billion of capex once you put in the GPUs and everything, right? That’s a ton of money. But snore, because there are so many of them happening, right?
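The arithmetic behind that 500-megawatt figure is easy to sanity-check. A minimal sketch, assuming an all-in cost of roughly $50 per watt (the dollars-per-watt number is inferred from the figures quoted in the conversation, not stated in it):

```python
# Back-of-envelope check of the "500 MW is $25 billion of capex" figure.
# The ~$50/W all-in cost (GPUs, networking, building, power gear) is an
# assumption inferred from the quoted numbers, not a stated spec.
MW = 1_000_000  # watts per megawatt

def ai_datacenter_capex(capacity_watts: float, dollars_per_watt: float = 50.0) -> float:
    """Rough all-in capex for an AI data center at a given IT power capacity."""
    return capacity_watts * dollars_per_watt

capex = ai_datacenter_capex(500 * MW)
print(f"${capex / 1e9:.0f}B")  # 500 MW * $50/W = $25B
```

The same ratio implies the 2-gigawatt OpenAI site discussed above would be on the order of $100 billion, which matches the scale of the deals discussed earlier in the conversation.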
Rebuilding Power Infrastructure
We’re learning how to build power again, right? We’re getting the supply chains to do it again. We’re reshaping the grid. There’s all these challenges with these AI data centers with regards to demand response and making grids unstable, right? AI workloads because they change so much so fast, especially training, you can just blow—you can just cause brownouts or blackouts. Especially if the grid doesn’t have enough inertia or if you’re not putting enough things to dampen it in between the workload and the grid.
And even if it’s not destroying the grid—the grid runs at like 60 hertz, right? If you skew it up and down too much with these transient power responses, your refrigerator will break down sooner. The motors in it will wear out, and you might not even know it’s because the data center is nearby. So there’s all these things. There’s so many third-order effects here with AI data centers.
But the funnest one is just that we’re building power, right? And it’s like, whether it’s gas—which is a lot of it—whether it’s through efficient combined-cycle turbines, or random generators that are single-cycle and not nearly as efficient, or even worse, diesel generators. There’s a company that’s putting a bunch of truck engines in parallel—diesel truck engines. Because the capacity, the industrial capacity for diesel truck engines is huge. And no one’s tapped it yet. So why don’t we just put a ton of them in parallel and create this power generation thing right here, right? And then you’re generating power with a bunch of diesel truck engines in parallel and then you’re able to power a data center, right?
It’s like, “Okay, great, because I can’t get turbines,” right? There’s all these crazy things people are doing. Elon buying some power equipment from Poland and shipping it to America because he needed that power equipment. But whatever, couldn’t get it here because the supply chains were weird. “I’ll just get it over there.” Any lack of capacity in the supply chain is being eaten up immediately. And then everyone’s like, “Okay, let’s invest.” So GE’s like, “I’m going to double my turbine production.” Holy crap. Okay, that’s awesome. And Mitsubishi is doing the same thing.
You move, you go down the list. It’s like my transformer supply chain is expanding like crazy. And they’re fully sold out. So I’m going to go to the Korean guys and that’s fully sold out. So I’m going to figure out how to get the Chinese stuff in, even though it’s not exactly what people want to do, right? There’s all these weird things.
Labor Shortages and Supply Chain Quirks
Electrician wages have doubled for mobile electricians—or rather contractors—that can work on data center stuff. If you’re down to move to West Texas, it’s like 2015 again, being a fracking guy, right? You don’t need to be super duper skilled. You can go to West Texas and make a ton of money off of fracking, but there’s not enough of those people. That’s why, right? If there were enough electricians in West Texas, if there were enough electricians in America, we could build these data centers faster.
So there’s all these little supply chain quirks and oddities. And everyone’s supply chain is different, because the way Google makes their data centers is different from the way Vantage makes their data centers, which is different from the way EdgeConneX makes their data centers, which is different from the way QTS makes their data centers, which is different from the way Amazon makes their data centers. So their supply chains are not exactly the same. And so you get all these weirdnesses in all these different supply chains.
No one really knows it, because of who tracked the supply chain or knew it—go talk to power people. It’s like, on one end of the spectrum is Dario. And then you take a few steps and it’s the average ML researcher, then it’s me, and then it’s you, in terms of how bullish we are on AI. And the guy at the power utility is way over on the other end, right? There’s a few more people in between. There’s the standard New York stock investor, the semis investor. Then there’s the New York not-semis investor. And then there’s the Sequoia guy who thinks AI has been a bubble since 2023. And then there’s this utility guy, right?
This utility guy is like, “I’m not building power. Power doesn’t go up.” And then you have the regulations around it. It’s like, “How can I build a data center in this density?” Because, “Okay, well then, great. I’ll build the data center in this density. I’ll have all this backup generators.” Great. Now all of a sudden the grid’s like, “Yeah.”
Grid Regulations and Demand Response
So this has happened in Texas, ERCOT, and it’s happening in PJM, which is the main northeast kind of area. These two grids are putting these rules where, “Hey, we’re going to actually say, hey, big loads. We can tell you 24 hours or 72 hours beforehand. We’re going to cut off half your power so that we can—”
PATRICK O’SHAUGHNESSY: Because we need it for something.
DYLAN PATEL: Yeah, people need to have their homes powered. We’re not like Taiwan, where if we’re in a drought, we limit people’s water usage and not the fab’s. Right. Which is a real story, right? I think it was like 2021, 2022-ish. There were multiple cities where they were like, “Okay, yeah, we’re going to limit the showers you can take to three a day or three a week.” Which is fine because they’re East Asian and they—they don’t have the smelly gene. If you did this in India, it’d be cooked. It’s already cooked. But they’ll limit the water to these people before they’ll limit the water to TSMC, and it makes sense. The economic value of TSMC is way above the economic value of people showering three times a week.
But the US grid is not going to work that way. We’re not that authoritarian or people have more say. So anyways, in Texas and in PJM, you can cut half the power if you give them a notice. And if you do that, then you need to turn on the generators that are there on the site. It’s often diesel generators. Maybe it’s gas, maybe it’s hydrogen stuff. There’s all sorts of weird stuff people try to do just to ramp up power for that period of time.
But then all of a sudden, “Oh crap, the density of my generators means that I fail the air permit if I run the generators for more than eight hours a month. So now what do I do?” Right? There’s all these weird regulations. Even if it’s Texas, it’s really—
PATRICK O’SHAUGHNESSY: Fun that we get to watch it.
DYLAN PATEL: You get to watch it, and then watch the supply chain and, at least from my perspective, provide the data so people can trade on it, or provide the data so people can adjust their supply chains, industry-wise, right? People in your audience, they can trade on it, or they can invest and make money and allocate capital more efficiently, right?
US vs. China: The AI Stack Comparison
PATRICK O’SHAUGHNESSY: If I were to line up all the stages of this between the US and China—so you know, power, semis, models, applications, etc.—where do you think the most interesting differences are? What are the storylines between US and China at those various layers of the AI stack that are the most interesting to you?
China’s Strategic Position and the AI Race
DYLAN PATEL: When you look at China, they are a very formidable competitor. I think if we didn’t have the AI boom, the US probably would be behind China and no longer the world hegemon by the end of the decade, if not sooner. And a world where the US is not the hegemon is a bad one, for Americans at least. You know, I’m sort of a bald-eagle-carrying American.
Without AI, we’re definitely just going to lose, right? Our supply chains are slower, they cost too much, we’re sliding. Our debt is unsustainable. Our economy is not growing fast enough to maintain the level of debt. We’re over-consuming relative to what we produce. The financialization. There’s all this social instability in the US, partially because of income inequality, but also largely because of the visual nature of income inequality and the tendency of people to flaunt their wealth more because of social media and how that hacks people’s brains.
And then also, because the algorithm serves people different content, we’re drifting further and further apart in culture, right? Monoculture of everyone watching the same movies in the 50s and 40s and 30s versus now. You and I are pretty similar and our feeds are completely different. So think about someone who’s not in our world. Their feed is insanely different.
The Critical Role of AI in US Economic Growth
I think the US would literally fall apart if we don’t do something. And by do something, I mean AI has to dramatically accelerate GDP growth. Once you start talking about dividing the pie, you’re screwed, right? It has to be growing the pie. The US really, really needs AI.
China’s view is a little bit different, right? They don’t necessarily need AI to win. They’ve always played this long game. They did it with steel, they’ve done it with rare earth minerals, they’ve done it with solar panels, they’ve done it for producing phones, they’ve done it for PCBs, they’ve done it for so many industries incrementally. They’re just going to continue to do that and they’re going to win because they work harder and they’re on average smarter.
If we don’t have super powerful AI systems, we’ll run out of easily accessible nickel and cobalt and oil and natural gas, and we won’t be able to make solar panels efficiently and fast enough, and everything will start to get more expensive and the pie will shrink, and we’ll tear each other apart in that way too. So I have a very pessimistic view: if we don’t accelerate, we die. If that’s your worldview, then we really need to win AI.
China’s Approach to AI Dominance
China’s worldview is they want to be the world hegemon. I mean, who doesn’t want to be the world hegemon? But there’s only two countries in the world that can legitimately do it and legitimately are trying, right? The US and China.
The way the Chinese AI ecosystem thinks about this is, well, we don’t necessarily need to have the biggest compute cluster. OpenAI is trying to make a 2 gigawatt data center full of GB200s and GB300s and all these different chips. And those chips are way faster than the chips we’ll sell China, or the chips China can make themselves. And China’s deploying fewer of them. The dearth of compute is huge.
We’re kind of doing what China’s done historically: just dumping tons of capital into something until the market becomes interesting. And the payoff is, if OpenAI has 800 million users today, when they have 3, 4, or 5 billion users of ChatGPT and whatever applications they come up with across the world—which is possible—then they’re on our system and they can start to make money, right? It’s sort of like YouTube lost money forever, but now it’s the platform for watching videos across the world. And ChatGPT will be the same thing. So there’s that aggregation theory.
China’s Semiconductor Investment Strategy
China doesn’t necessarily think of it the same way, but they are still incredibly pilled on, “Well, we want to be able to make everything ourselves.” Make all of the chips ourselves. We, on the other hand, don’t actually care that much about making all the chips ourselves. Sure, Trump’s doing the tariffs. Sure, we have the CHIPS Act. Those were drops in the bucket compared to how much money China’s been releasing into the semiconductor ecosystem for the last 10 years.
They’ve dumped at least like four or five hundred billion dollars into the ecosystem through SOEs.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: Through certain tax policies, through certain land grants, through provincial governments, through the Big Fund, which is almost like a government venture fund. So they’ve dumped so much more capital into semiconductors than we have, in an unprofitable way, because they want to build that ecosystem.
And over time, if you take any country in isolation, China is the one that has everything at the highest level on average. Right. Sure, they’re like 30 years behind on jet engines, but they don’t actually need to go outside of China for any of the materials besides raw materials. Whereas the US needs titanium from here and blah, blah, blah from there. Right. And the same applies to their semiconductor ecosystem.
Sure, the US and Taiwan and Korea are way ahead, and they also have the accumulated capital base of all of the existing equipment and all of the existing fabs. But then they need to import from all these different places, because it’s a global supply chain.
China’s Focus on Supply Chain Independence
And so China is much more concerned today about being insular than about being the best or doing this sort of aggregation theory. But they’re so talented and they have an insular supply chain—yes, they purchase some stuff from the foreign world, they rent stuff. They have ByteDance, which I think is the third largest user of GPUs in the world after OpenAI and probably Meta—or maybe second, ByteDance may even be bigger than Meta.
They have all the other major Chinese tech companies. They have all of these amazing graduates. They don’t have a talent war. Companies don’t poach from each other. Right. DeepSeek engineers make a lot more than other engineers, but they’re not making ten million dollars, even though it may be worth it. There’s this real big perception difference.
And China could build way faster than us. If they wanted to build a 2 gigawatt or 5 gigawatt data center, they could probably smuggle a lot of chips—it’s not purely contingent on them wanting to smuggle loads of chips. Hey, if they wanted to build a 10 gigawatt data center, I bet they could build it in a few years. Whereas the US is not going to build a single 10 gigawatt data center for a while. The total capacity of an OpenAI will be like 10 gigawatts in a few years.
Obviously, they don’t have the best chips. Huawei’s speeding up, trying to get better and better and faster and faster. They don’t have the best memory. They’re trying to get better and faster there. But they do have the most power. They can build stuff way faster. We’re impressed at how fast Elon does stuff. Elon’s slow compared to China. And I think he knows that, which is why he’s maybe the one who’s actually using the Chinese ecosystem more, in terms of the battery facilities he’s building in China and all these things. He probably recognizes it too.
Contrasting Capital Allocation Strategies
So there’s these major differences in viewpoint and approach, because China wants an insular supply chain. They want to have supply chain security. We talk about wanting that, but we don’t actually put the money behind it. Where’s the slot machine where American capital is being allocated? It’s building the biggest data centers, it’s training the best models. Whereas in China, the capital is being allocated into growing the EV supply chain, growing the semiconductor supply chain, catching up in all these areas. And the US, sure, we want to catch up, but actually we’re just going to give like three…
PATRICK O’SHAUGHNESSY: Maybe Jensen was right that what you want to own is the end customer thing and export to…
DYLAN PATEL: Production and import to the product.
PATRICK O’SHAUGHNESSY: They’re doing the same thing they’ve done forever, which is prepare at the base level and be behind at the customer side and the value happens close to the customer.
The Taiwan Risk Factor
DYLAN PATEL: But then you get to the point of, okay, well what happens in three, four years? Even if the US AI is amazing, we have no… the doomsday scenario of China decides to blockade Taiwan or even invade it or create political instability. People talk about Cambridge Analytica and Russian trolls, whatever. China could do a billion times that into Taiwan, especially with AI, with how good AI is now and somehow subvert it or coup or blockade or whatever and we no longer have Taiwan. US economy kind of free falls, right? Because we can’t make refrigerators without Taiwanese chips. We can’t make cars. We can’t make AI data centers. We can’t grow any of the cloud. We can’t. That means we can’t deploy any more SaaS applications. What the hell can we do?
PATRICK O’SHAUGHNESSY: Back to going to acquire all the talent, get them over here, do that.
DYLAN PATEL: Right? I think that’s sort of the catch-22 of this all. If you push China too hard, they totally will. They’re going to start swinging. They have the talent. They could go crazy. They could, if we no longer have Taiwan, actually China could build a way bigger cluster than us. And if compute is all that matters, right? They could do all of these things and they own the means of production for everything, right?
So there’s this challenging aspect of geopolitical risk. That’s why people don’t want to invest in TSMC. But it’s almost like you can’t invest in Amazon or Apple or Google or Microsoft if you believe Taiwan has risk. And so it’s like YOLO, invest in TSMC. I know a lot of people’s PMs are like, “Oh, you can’t invest in TSMC because geopolitical risk.” And it’s like, no, dude, you can’t invest in Apple.
Perspectives on AI Skepticism
PATRICK O’SHAUGHNESSY: Who is your favorite AI bear? Someone that is far distant from you on just their perspective on the direction of this whole thing that you nonetheless like and respect.
DYLAN PATEL: There’s some of the AI researcher gods like Yann LeCun and these kind of people who are AI bears. I respect them. I like their ideas. I think they’re completely wrong.
PATRICK O’SHAUGHNESSY: And their argument is what? If you had the…
DYLAN PATEL: The ways we’re doing this won’t work. Right.
PATRICK O’SHAUGHNESSY: LLMs won’t scale or…
DYLAN PATEL: And it’s like, okay, yeah, autoregressive pre-training on the Internet doesn’t work to get you to AGI. So he’s completely right on that. But then he’ll turn around and be like, “Well, no, no, no, but RL systems and all these things are not the right way either.” Right. It’s sort of the “no buts.”
I think there’s also some investors that I know who think this is bullshit, but they’re just making tons of money on it anyways.
PATRICK O’SHAUGHNESSY: I think it’s bullshit in what sense?
DYLAN PATEL: Like, it’s an overspend and this is a dumb way to do it. And yet: “Yes, I bought Oracle before earnings because we saw these deals happening,” whether through our data or some other means. They saw these deals happening with OpenAI, but they don’t think OpenAI can pay for it. But they know the market’s perception will be this, and therefore the stock will go up, and so they’ll own it. Right.
And so, I guess I would respect them in some sense, but more and more, given the level of evidence that this stuff is going to get super powerful, it’s hard to keep saying, “No, really, this AI bubble is going to pop,” on a podcast.
PATRICK O’SHAUGHNESSY: I assure you it’s just the market return. It’s a coin toss. What startups interest you the most?
The Promise of AI in Materials Science
DYLAN PATEL: So one of the startups is the most recent investment I’ve made. It’s called Periodic Labs. It’s mostly OpenAI people, a Google guy, and a couple material scientists. The area of AI that we’ve all been talking about is large-scale web training, RL, all text—all digital God, right? We want to make digital God.
But you know what would drive a shitload of value for the economy besides automating programming and everything? If we just came up with a battery chemistry that was like 25% more efficient. Like, holy shit. The main cap against us all having face glasses, things like that, is that batteries are not good enough, right? And the power dissipation—the batteries are terrible. So you have to make all these compromises.
But if I could have the processing power of a laptop on my face, we’d be way further ahead. And then if we all had these super powerful machines attached to our face, we could do inference on things and recognize and interact with the AI at much higher speed and velocity. And that would dramatically improve our productivity, right? Things like this are so gated by hard tech moving faster.
And so what Periodic is trying to do is they’re taking this RL paradigm, but they’re trying to do it with the real world, right? Test chemistry for something. Here’s a chemistry, here’s an optimization, here’s something that the model spit out, but then you also want to test it in the real world and then feed that back into the model. And so you do this chain of circles, right?
But instead of purely being digital. RL is really hard because you need to generate a bunch of responses, test them, and then train the model—but digitally, the flywheel is so freaking fast. The flywheel in the physical world is so slow, right? Oh, you mean I need to make a chemistry, I need to try this, I need to test the thing, I need to input it back in and keep calibrating and keep doing this? It’s so much more expensive, it’s so much harder to do. But actually there’s a ton of low-hanging fruit there, I bet.
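The flywheel described above can be sketched as a loop: the model proposes candidates, a slow physical lab evaluates them, and the measurements feed back into the model. A toy illustration, assuming nothing about Periodic Labs' actual system—every function name here is hypothetical, and a cheap scoring function stands in for a real experiment:

```python
# Illustrative sketch of an RL-style loop grounded in physical experiments.
# All names are hypothetical; run_lab_experiment() is a cheap stand-in for
# a slow, expensive real-world test (days or weeks per batch in practice).
import random

random.seed(0)  # deterministic for illustration

def propose_candidates(policy: dict, n: int) -> list:
    """Model proposes n candidate formulations near its current best guess."""
    return [policy["best"] + random.gauss(0, policy["spread"]) for _ in range(n)]

def run_lab_experiment(candidate: float) -> float:
    """Stand-in for a physical test; returns a measured score."""
    return -(candidate - 0.7) ** 2  # pretend the true optimum sits at 0.7

def update_policy(policy: dict, results: list) -> dict:
    """Shift the policy toward the best-scoring candidate (a crude update)."""
    best_candidate, _ = max(results, key=lambda r: r[1])
    return {"best": best_candidate, "spread": policy["spread"] * 0.9}

policy = {"best": 0.0, "spread": 0.5}
for _ in range(20):  # each iteration here would be a real lab campaign
    candidates = propose_candidates(policy, n=8)
    results = [(c, run_lab_experiment(c)) for c in candidates]
    policy = update_policy(policy, results)

print(round(policy["best"], 2))  # should land near the true optimum
```

The point of the sketch is the bottleneck: in a digital RL setup this loop runs thousands of times a day, while with a physical lab each pass through `run_lab_experiment` is the slow, expensive step.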
PATRICK O’SHAUGHNESSY: What about in the hardware world? Just in the pure hardware space, attacking some other interesting bottleneck.
Hardware Innovation and Supply Chain Opportunities
DYLAN PATEL: When you talk about where we are in tech, right? Semiconductor manufacturing is super space age. These are the most complicated tools we make in the world, including tools that cost like half a billion dollars, and they’re super amazing feats of engineering. But the software behind them all is really shit, right? So you could accelerate all that.
But really, in the hardware world, the biggest challenge is that I’m not really a big bull on the accelerator companies. I’ve never been a bull on companies competing with Nvidia—with TPUs, with Trainium, with AMD. Not a big bull on those kinds of companies.
PATRICK O’SHAUGHNESSY: Because it’s too hard?
DYLAN PATEL: There’s just too many things to do. It’s too capital intensive. There’s not enough of a revolutionary leap. There’s too many predicated things. I wish it could happen. It’d be fun. Maybe it does happen, but it would take a hell of a badass thing.
But I think there’s a lot of individual parts of the supply chain which are not space age. Is Nvidia space age? Yes, it’s the biggest value owner today. But their supply chain has so much old shit. Whether it’s their supply chain or the hyperscaler supply chain, transformers have not changed in like 50, 100 years.
PATRICK O’SHAUGHNESSY: There’s a guy building one.
DYLAN PATEL: Yeah, there’s a company in that space doing solid-state transformers. Things like this. There are so many interesting companies in that space, because there’s so much innovation to be done and there wasn’t much need to do innovation before.
Networking and Memory Challenges
Another area is networking between chips because as we extend context length, the memory requirements become bigger and bigger. And yes, new memory technologies would be awesome, but DRAM as an industry has so much invested capital goods, so much existing factories, it’s really hard to attack. But networking is less so. And there’s more breakthroughs that can be done in networking that, okay, maybe you don’t have better memory technologies, but you’ve tied the chips closer together so you can use each other’s memory on the problem.
There’s so much more that you can do in the optics space, bridging the gap between electrical connectivity and optical connectivity. When Nvidia created Blackwell, they had a ton of manufacturing problems and challenges with it across their supply chain. Balance sheets went up for various companies in the supply chain that were building servers and stuff, because they were trying to figure it out. AI server and AI data center deployments were slowed because of these challenges.
There’s reliability challenges because these things are connecting to each other at absurd bandwidths. Every chip in the rack can connect to every other chip in the rack at 1.8 terabytes a second. If you think about how much data that is, the amount of bandwidth is so high for connecting these chips together, you can’t fathom what a terabyte a second is. You can’t fathom what a gigabyte a second is.
It’s like, okay, a gigabyte a second is like a video, right? Or less than a video. Or like a megabyte a second—but actually that’s eight million bits of information. What’s a kilobyte a second? A byte a second? Okay, you can understand what a byte a second is, because that’s eight bits. Okay, I’m transmitting eight bits to you back and forth every second. That’s pretty fast. That’s what used to exist. And where we are now, there’s still tons of innovation left to be done there.
Data Sharing and World Models
I think part of the reason Intel is behind is also that data sharing internally was terrible. Even within the fab, the lithography team doesn’t want to share their data with the etch team, and that data can’t leave the fab and go to an AWS data center to run correlations and all these other things. So you don’t learn from the experiments you do fast enough.
Now, TSMC is not perfect here either. They won’t send their data to a cloud either. But this cycle—experiment, analyze the data, figure out the new experiments—is slow, and how you break that is partially by changing these companies’ culture, which I think Lip-Bu Tan is trying to do. But a lot of it is also building better simulators and simulating the world more accurately.
So world models generally are like, hey, I’m going to simulate the world I’m going to walk around in. The common one I think of is Genie 2, which Google made, where you can walk around in a game state—you walk around the world and you can see cars driving. But a world model could also just be simulating molecules, but not through classical methods. It’s not computational fluid dynamics; it’s the model experiencing this enough, training a model on physics, and then feeding that back through—doing it through an AI method instead of a classical one.
And so world models can be doing any sorts of things, right? You can make a world model to train robots how to pick up cups, or you can make a world model that is simulating some chemistry and a chemical reaction, or a fire. You could do all sorts of different things.
So there are a lot of world model companies out there, and some of them are really interesting, especially when they’re targeting the physics and reality of the world. But most of the cool innovation is just happening at big, already existing companies. That’s just the nature of it all. TSMC is doing the most cool innovation, and Nvidia is, and Amphenol is. All these companies are doing cool innovation.
Speed Round: Company Impressions
PATRICK O’SHAUGHNESSY: Could we do a quick speed round where I say some company and you just give me a sentence or two on your impression of them? Just how you feel about them in this moment.
DYLAN PATEL: Yeah.
PATRICK O’SHAUGHNESSY: Start with OpenAI.
DYLAN PATEL: Oh yeah, super awesome.
PATRICK O’SHAUGHNESSY: That’s it?
DYLAN PATEL: I mean we’ve talked about them all day.
PATRICK O’SHAUGHNESSY: Anthropic.
DYLAN PATEL: I’m actually more optimistic on Anthropic than I am on OpenAI.
PATRICK O’SHAUGHNESSY: Why?
DYLAN PATEL: Their revenue is accelerating way faster because what they’re focused on is more relevant to that $2 trillion software market versus OpenAI is split between. Yeah, they’re going to do that, but they’re also going to do these other things. But they’re also going to target AI for science and they’re going to also target AI for the consumer app and doing the take rate thing which all of these businesses could be amazing and OpenAI maybe executes on all of them, but Anthropic is definitely executing on the software side better.
PATRICK O’SHAUGHNESSY: AMD.

DYLAN PATEL: I love them, but they’re pretty mid.
PATRICK O’SHAUGHNESSY: Why do you love them if they’re mid?
DYLAN PATEL: When you grow up building computers and liking computers and AMD is innovating and they’ve always fostered this underdog mentality against Intel and against Nvidia. Evil Intel and evil Nvidia, and AMD is the nice company that’s the underdog and they’ve always got the “oh, they’re going to take share from them” thesis. It’s hard not to love them.
And I know so many people there, and I like all of these major hardware companies. There’s not one that I don’t like in terms of the people, but AMD’s got a soft spot because I think it was my first multibagger as well. I can’t own stocks anymore because of compliance. Sorry for the rant, but I fucking love AMD. I also love Nvidia, but they’re in a real…
PATRICK O’SHAUGHNESSY: Mid, but xAI?
The Capital Challenge and Business Model Innovation
DYLAN PATEL: Danger of not being able to raise capital. Elon’s the best. Of course everyone’s going to give Elon capital, but it’s the scale of capital required for him to keep up. He can get the next bet. He can get to Colossus 2, right? This mega data center that he’s building, the largest data center in the world when he built it. 300,000 Blackwells, 500,000 Blackwells. It’s going to be really great.
But if he doesn’t figure out a business model besides the porn bot, which is what Ani is, which I also think he’s monetizing the wrong way. I think he could monetize it so much better. You’ve captured the zeitgeist with a cute anime girl that talks to you in a cute voice and will rizz you up. And you’ve got these users who actually fall for it, and it’s not realistic enough yet, but it will slowly get more realistic.
And you’re selling outfits for the same price. You should make it a random, “hey, you have a chance to buy the outfit that is actually her being nude.” Or “hey, you have the chance to buy the outfit of her looking like this one anime girl from this one anime.” Or “hey, you have the chance to buy this outfit that’s her in a nun suit.”
I mean, obviously people at xAI hate this, and a lot of them, many of them, some of them have left. But I think he has to figure out some business model beyond just this, although I think this could be a big business, right? He should partner with OnlyFans and make manifestations of the OnlyFans creators that are Ani. And then he subsumes the OnlyFans platform into X, the everything app, and be like, “XXX.” You could just Trojan-horse OnlyFans away.
Because the discovery mechanism for OnlyFans is Instagram and Twitter, as far as I understand, and you own one of them. And you could partner with the biggest OnlyFans creators to get them over, right? And then they don’t have to respond to all the losers; they can just train a model that acts and looks and talks like them. There are all these different monetization methods.
I don’t think that’s all he should focus on, to be clear. xAI can get to the next stage of compute. They won’t have more compute than OpenAI; they don’t have more compute than Google, Meta, et cetera individually. But they will have the biggest individual data center, and it comes down to what they do with that. They have a very focused team, and they have to do something really big with it, otherwise they will fall behind in the race.
And Elon will not let that happen. He doesn’t want that to happen. He can subsidize and fund this round, but he can’t get to a 3-gigawatt data center unless he gets capital, which he can’t do unless he gets revenue and fundraising.
Oracle’s Position in the AI Race
DYLAN PATEL: Oracle’s going to make so much money, if you believe OpenAI’s successful. But if you think OpenAI is going to be successful enough to pay them $300 billion, then how many users do they have and what’s that IP worth? Maybe. And there are also reasons you shouldn’t own OpenAI, like the Microsoft stuff and the risks around Anthropic and all these things. But in most worlds where Oracle gets paid $300 billion by OpenAI, OpenAI is a $10 trillion or $5 trillion company or something crazy.
PATRICK O’SHAUGHNESSY: We’ll end with the OGs, the old last generation, best two business models, first being Meta.
Meta’s Full Stack Advantage
DYLAN PATEL: I think Meta’s got the cards to potentially own it all. I don’t know if you’ve seen these new glasses that came out with the screen. As we go through the history of computing: initially it was punch cards, then it was DOS terminals, right? Then it was GUIs and mice and keyboards. Then you had touch.
And the next paradigm of human-computer interface is that we don’t actually have to touch it at all. We tell the AI what we want and the AI translates that into reality, right? Whether it’s “hey, send an email to this person, send a text to this person,” that’s basic stuff you can already do with Siri or whatever, right? But “oh, go buy this.” We’re so close to all of these things.
The input method into a computer is changing entirely. And Meta is the only company in the world that has the full stack: good hardware, which is what Meta just showed with their glasses with the screen, plus the good models, plus the capacity to serve them, plus the knowledge and know-how around recommendation systems to know what content to put in front of the user.
Because it’s not just generating the content, it’s not just interpreting the user’s words and taking actions, it’s also putting the right content in front of the user. It’s all four of these that you need, plus the capital. I think Meta is so close to being the only company that can do that, but there are a lot of risks there too, right?
PATRICK O’SHAUGHNESSY: So I like Meta a lot. Google to finish it off.
Google’s Awakening
DYLAN PATEL: I was pretty bearish on Google two years ago, but now I’m super bullish on Google.
PATRICK O’SHAUGHNESSY: Why the change?
DYLAN PATEL: They’re waking up on every front. They’re taking the TPUs, they’re selling them externally, they’re taking their models and they’re actually competitive on them and they’re training much better and better and better. They’re being aggressive on infrastructure investments. There’s still a lot of dysfunction throughout the company.
But they do have the hardware business that they can pivot into this. They won’t be as far ahead as Meta is; they won’t be as good as Apple is. But they do have Android, they do have YouTube, they do have all these IPs, and they have search, which can all come together when we turn to that next consumer interface.
But they can also potentially dominate the professional sense too, whereas Meta, I don’t think, can dominate the professional sense, only the consumer sense. And I think Google is well positioned to capture both markets, or a meaningful share of both.
PATRICK O’SHAUGHNESSY: I feel like we’ve covered an incredible amount of ground. Is there anything that we haven’t talked about that you feel is really critical to what happens in the future that we didn’t cover?
The Changing Economics of Software
DYLAN PATEL: I think the question I constantly get asked by everyone is, “Okay Dylan, you’re lucky. Your obsession is that you loved hardware, and you followed it, and you followed the supply chain, and you built this business on it. But you really don’t follow the software side nearly as much, and all the value is going to get created there, right? When is that coin going to flip?”
But I think the thing most people don’t realize is that software is not the same as it was five, ten years ago. You’ve had dramatic changes in software, and the business model is going to change as well. If we go back five years, three years, whatever, SaaS was the darling. November ’21, I remember, SaaS started tanking, and at the time it was mostly that they were burning cash and all these other things. Doesn’t matter.
The interesting thing about the business model is that it is such a good business model. Your R&D stays mostly flat, right? You grow a little bit, but R&D really doesn’t flex that much. Your COGS are super low. The flip side is that in a SaaS business your customer acquisition cost is quite high.
PATRICK O’SHAUGHNESSY: Yeah.
DYLAN PATEL: And so when you look at what certain companies have done when they’ve acquired a business, they’ve just crushed the customer acquisition cost, or crushed SG&A, and made the business amazing. Whether it’s Broadcom with VMware and stuff, it’s not really customer acquisition; they just had a bunch of wasted SG&A.
But this SG&A, this customer acquisition, was most of your cost; R&D was small but not crazy. And then once you hit critical mass, you just print cash, money, money, money. But software changes a lot when the cost to build that software tanks like crazy.
Look at non-US markets and the prevalence of SaaS, and it’s very different. I’ll bring up China as a counterpoint. China doesn’t have that much of a SaaS business, actually. Their cloud business is pretty small too, right? Relative to the US, despite them importing tons of CPUs and storage historically. Most people just did stuff on-prem and designed their own software, because the cost of developing software in China was so much lower than in America that the SaaS business model didn’t work as well.
People could just build rather than rent or buy. And that creates inefficiency in the market; I’m sure those aren’t always the best-of-breed solutions. Anyway, that’s what the software development cost gap looked like: software developers in China in 2015 were getting paid maybe one-fifth of the US, and they were maybe twice as good, or something like that. A 10x lower cost of software. I’m making up numbers, right?
They had a 10x lower cost of software, and so SaaS never happened and cloud never happened, at least not in as big a way as in the US and around the world, for all the companies that share that same economic reality. And that’s despite the outsourcing, right? To India and Eastern Europe and South America, et cetera.
AI changes all of this with software development, right? And AI SaaS products generally, not just AI software development. So there are two sides here. One: AI software development tanks the cost of building a competing software stack. Do you now move to a world where you can just build instead of buying or renting?
Two: if you are a SaaS business, your customer acquisition cost remains the same. And most businesses, in AI and in SaaS, are going to keep having a high customer acquisition cost. Sales is hard. Breaking into a competency is hard. But now you add this AI part of it, and you’ve added humongous COGS, right? Your cost of goods sold in any AI software is really big.
And this is partially why I think Google also has an advantage: they have the lowest cost of goods sold per token of any company, because they have their own vertical stack on TPUs. But coming back to this: because you have this high customer acquisition cost, and you have this high COGS, and then anyone can develop it themselves, or competitors can enter the market, you’re going to have a very fragmented SaaS market, or customers are just going to build it themselves. Therefore you never hit the escape velocity where your customer acquisition cost and your R&D get amortized.
And because you have such a high COGS, even past that amortization point your net profitability is actually much worse. And so I think the era of software-only businesses is really, really tough in the age of AI.
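The escape-velocity argument above can be sketched with toy unit economics. All numbers here are hypothetical illustrations, not figures from the conversation; the point is only that a large AI inference COGS stretches out the payback on customer acquisition.

```python
# Toy comparison of classic SaaS vs AI-era SaaS unit economics.
# All inputs are hypothetical round numbers chosen to illustrate the argument.

def payback_months(monthly_revenue: float, cogs_fraction: float, cac: float) -> float:
    """Months of gross profit needed to recover customer acquisition cost (CAC)."""
    gross_profit = monthly_revenue * (1 - cogs_fraction)
    return cac / gross_profit

# Classic SaaS: low COGS, so gross margins in the 80-90% range are typical.
classic = payback_months(monthly_revenue=100, cogs_fraction=0.15, cac=1200)

# AI SaaS: inference compute turns COGS into a large line item; CAC is unchanged.
ai_saas = payback_months(monthly_revenue=100, cogs_fraction=0.60, cac=1200)

print(f"classic SaaS payback: {classic:.1f} months")
print(f"AI SaaS payback:      {ai_saas:.1f} months")
```

With the same revenue and the same CAC, quadrupling COGS roughly doubles the payback period, which is the sense in which high COGS plus unchanged acquisition costs keeps a business from ever amortizing its way to escape velocity.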
Now, already-scaled businesses can do great, right? I think YouTube is going to have its glory days, and I’m sure it’ll always be amazing. With the cost of generating content falling and falling, whoever controls the platform is going to win and win and win.
But the functionality you build within Salesforce is actually going to be way less than what you could build on your own, or whatever it is. I’m not saying it’s a take on Salesforce specifically, but I think many software businesses will have a reckoning with the fact that their COGS is going to soar, their customer acquisition cost isn’t going to fall, and they’ll have a lot more competitors, so they won’t hit that escape velocity.
And I think that’s the thing about software I’ve sort of thought through. There are a couple of people at my company; Doug O’Laughlin is the one whose idea this actually is.
PATRICK O’SHAUGHNESSY: This has been incredibly fun. I love, love learning from you and listening to you and reading what you put out. I think you’re just one of the most energetic and awesome thinkers in this whole space right now. So thank you for all the work you’ve done. When I do these, I ask the same traditional closing question. What’s the kindest thing that anyone’s ever done for you?
The Kindest Thing
DYLAN PATEL: Done for me? I mean, it’d have to be my brother, everything he’s done in my life. I’ve been a shithead my whole life, and I still am a shithead. Every time, he pulls me back on path, he corrects me, he loves me unconditionally. My brother has probably done the kindest things for me, right?
And I’ve been an asshole so much of my life, right? Inconsiderate and everything. He’s just always been there for me.
PATRICK O’SHAUGHNESSY: Yeah, if you’re aware of it, it makes it interesting.
DYLAN PATEL: And maybe this is the mode of who I am and maybe that’s why I’m a good thinker, but I vibe really hard and I’m in the moment really hard and I digest tons of information. But I’m very bad at task orientation, remembering to do specific things. I’m very bad at those things.
And thankfully I’ve been able to surround myself in my life, whether through birth or not, with people who help me with the things I’m bad at. Because I’m very bad at a lot of things, I think, as far as a radar plot of how good I am at things goes.
And so when I don’t call people or be considerate of what they’re thinking because I’m just vibing and I’m doing whatever, I’m focused in on this path and that path ends up hurting someone else, right? Whether it’s I didn’t call someone or I didn’t think about their feelings when I did an action or when I said something. But that makes me an asshole, right?
And yes, I should be more conscious of this. And I try to be but it’s just one of the things I’m going to wrestle with in my life forever. And a lot of times, I don’t even realize I’m being a freaking idiot until my brother’s like, “you’re a freaking idiot.” And so that’s the kindest thing anyone’s ever done for me, is my brother through my whole life.
PATRICK O’SHAUGHNESSY: I love it. Wonderful place to close. Thanks so much for your time.
DYLAN PATEL: Thank you so much.