Full Transcript of Oculus CTO John Carmack’s keynote at Oculus Connect 2014 where he discusses the Gear VR and shares development stories…
Operator: Ladies and gentlemen, please welcome Oculus CTO John Carmack.
John Carmack – Oculus CTO
All right. So I don’t actually have a presentation but I can stand up here and talk about interesting things till they run me off the stage.
So mostly that’s going to be about Gear VR because that’s what I’ve spent most of my effort this last year on. So for the too-long into listening crowd, we will start with where we are today and then we will go into the history and the path it took us there which offers some insight for the current state of things.
So I believe pretty strongly in being very frank and open about flaws and the limitations. So this is kind of where I go off message a little bit from the standard PR plan and talk very frankly about things.
So the current killer limitations on Gear VR are the fact that it's a 60 Hz low persistence display, which has flicker problems for a lot of people, and that it has no positional tracking, which is one of the critical aspects of DK2 and future products.
So there are plans and mitigation strategies for what we can do around that now and how we want to improve it in the future. The 60 Hz low persistence turns out not to be as tragic as a lot of people were expecting. There were a lot of people saying that's completely unusable, and certainly Oculus has been talking about minimum frequencies for low persistence displays. And a lot of people were surprised that it wasn't as bad as they thought it was going to be.
The screens have a quirk where they don't blur so much, but there is a two-frame rise problem. Many of you have probably seen this: it's a type of ghosting, but it's not smearing like a full persistence display. If you look at, especially, a dark tree limb in Tuscany and you move your head very rapidly, you will see one ghost of it, offset by an amount proportional to your head movement speed. And that's a problem that gets worse as the color palette gets dimmer; the very dark colors have more of a problem with smear.
This is something that we dearly want Samsung to fix in future displays. It hasn't been their top priority to address, but it matters in VR. We have mitigation strategies for it where you can de-ghost with what is essentially a software overdrive: if you know what color the pixel was and what color you're going to, and the target is higher, you can actually drive it higher than what you want it to be to compensate for the slow rise.
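A minimal sketch of that kind of software overdrive, with made-up panel response numbers rather than anything from the actual Oculus shaders, might look like this:

```cpp
// Sketch of the software overdrive ("de-ghosting") idea described above.
// The real thing runs in the distortion/time warp shader; this is a CPU
// illustration with an assumed response model, not Oculus's actual code.
#include <algorithm>
#include <cstdint>

// Hypothetical tuning factor: how much of the remaining transition the panel
// completes in one frame (dark transitions are assumed to be the slowest).
static float riseFraction(float from, float to) {
    return (to > from) ? 0.6f : 0.9f;   // made-up numbers for illustration
}

// Given the value sent last frame and the value we want the eye to see,
// return the value to actually send so the panel lands closer to the target.
uint8_t overdrive(uint8_t prevSent, uint8_t desired) {
    float prev = prevSent / 255.0f;
    float want = desired  / 255.0f;
    float k = riseFraction(prev, want);
    // Panel reaches roughly prev + k * (sent - prev); solve for 'sent'.
    float sent = prev + (want - prev) / k;
    return static_cast<uint8_t>(std::clamp(sent, 0.0f, 1.0f) * 255.0f + 0.5f);
}
```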
And we’ve done that on both PC and mobile but it’s really another extra expense that’s hard to justify on mobile relative to all of our other problems and things that we want to spend our resources on. But still in general, darker games are an improvement. But unlike the problem with PCs and console games in the classic Doom 3 problem where if you make a really dark game, at least in VR when you can block out almost all the outside light that can be a much much more pragmatic workable solution than it is for a typical AAA game, where you have to worry about ambient lighting in the living room and so on.
So that’s one of the takeaways for Gear VR is the super bright pastel colored worlds. While they play well in VR from at a higher refresh rate, they have a drawback that you have to deal with and work around a little bit at the 60 Hz side of things.
The big one that is harder to mitigate, though, is the lack of position tracking. We do all know, we've been talking about presence and how important it is to get that sub-millimeter tracking accuracy, and we just don't have it, or any analog of it, on mobile right now. There are things that we've taken early steps on: well, can't you use the outward-facing camera, like the AR applications do? And I have done integrations with Qualcomm's Vuforia and tried different things with that.
But one of the things that people miss, if you pay careful attention to those slides, is that one of the bullet points is sub-millimeter accuracy with no jitter. And the current things that people can do with outward-facing cameras for absolute positioning are really not close to that.
So we have — there’s not a great mitigation strategy for this other than all the things that people would do on DK1 to just try to not have the problem by not having things in the real near field where your body is not interacting with them the way you want them to. But there’s no really killer strategy to avoid that unless you’ve just got everything in the distance field, no stereoscopy and no near field effects. So that’s something that we just kind of have to grit our teeth and live with for now.
Now, I have a scheme for both of these. I have things that I think are workable solutions, involving changes to the architecture, that may address these in the future. One of the things about hitching our wagon to Samsung here is that Samsung's technology ticks twice a year. They have big product rollouts two times a year, and we expect Gear VR to kind of follow that path. So it's not like it's going to be years between updates; there are going to be hardware changes and updates, and we can look at rolling out major new features as it goes on.
So here are the paths that I think can address these problems. For the low persistence display: right now we achieve 75 Hz on DK2. And DK2, if you've ever taken one apart, is basically a Note 3 screen, while Gear VR is a Note 4 screen. So they are very similar.
And the question might be asked: Well, why can’t we just run the Note 4 screen at 75 Hz like DK2?
So there’s couple aspects to that. There’s two things that we do to get the refresh rate on DK2. One of them is pulling out all of the blanking lines at the end of the blank. It’s a weird archaeology of technology thing that even LCD and OLED panels they still have vertical blanking lines as if they are CRT waiting for the raster to go back up to the top. And this is just, as display technologies have evolved, they have kept these historical artifacts. So there’s a little bit of margin where you can take all of those out and we would’ve been able to – and we did have one of the Galaxy — one of our earlier prototypes we could run it at 70Hz.
So the question was: is it okay to run at 70 Hz? Is that worth the improvement? The problem is that a lot of the things we were looking at are media related, where you want to be able to play back something that was captured with cameras, which is usually going to be a 60 Hz input.
In fact, one of the big achievements over the last year, I think, was getting a lot of the work in panoramic capture focused on 60 Hz instead of just 30 Hz. A 70 Hz display would probably have been a net negative for a lot of these things: it would add beat frequencies to anything played back at 60 Hz, and it would make all the normal games about 15% harder to hit their target frame rate, and that was a significant concern for us.
And then the last five frames per second that we get out of DK2 comes from kind of overclocking it beyond what Samsung wants to do to the millions and millions of Note 4s that are going to ship out there. So I don't think that even in the coming years we're going to see these standard displays where you can just turn the clock up to 90 or 120 Hz.
And in fact, while we’re talking about 90 Hz as the level where most people don’t see any flicker, a lot of people can still tell on DK2 that it’s flickering especially if you do the bad things, you put white at the outside edge of the screen and that’s where a lot of people in their peripheral vision will still be able to see it.
90 Hz is where probably 95 to 99% of people really don't see it. We still run across a few people that can perceive it in the Valve room or on one of our 90 Hz displays, but that mostly solves it.
But I would argue that there’s still a strong reason to go beyond that all the way to 120 Hz and the reason for that 120 Hz gives you the even divisions where you can run 60 Hz content with double frames. You can 24 Hz content and even – or you can run full 120 Hz new frames.
But there is very little likelihood that we're going to see these small displays actually running at 120 or even 90 Hz at the sizes they're designed for. There's a lot of inertia in the cell phone display industry, and they don't currently see a real need for doing that. Although I would argue, and I am making this case to Samsung, that they could probably get more perceptible benefit for users right now by doing that instead of revving the resolution even more. If you put a 1440 phone next to a 1080 phone, you have to look kind of carefully, or interact with it for a long time, to tell the difference in quality. And it's at least arguable that if you went to a low persistence 90 Hz display, then on the scrolling test that everybody does when they get a new cell phone (how fast and smooth does it scroll, is the Apple or the Android system better), they could get more of a perceptible win. So that's one tack.
I mean, we are at least saying there are benefits to doing this at the higher refresh rates. But the scheme that I am proposing, that I think is doable, that I think there's a real chance we can get Samsung or some other display manufacturer to step up on, is the return of interlaced scanning. And programmers are probably groaning to themselves right now about this.
But interlaced scanning was invented for a reason, back in the 40s and 50s when TV was developed, because we knew that we needed at least 60 fields per second to avoid flicker, but they could really only get acceptable resolution at about half that. So the idea is, instead of scanning all the lines down one after another, which at 30 frames per second would flicker horribly, you scan every other line, and then the other set of lines on the next pass. And it's kind of like the low persistence situation; CRTs are even lower persistence than OLEDs.
They have nanoseconds of actual illumination, depending on the phosphor. But what that does is give you the same kind of fooling effect as low persistence: it lets you imagine the motion that's there, and by alternating the lines, your brain will imagine that it's a solid screen even though you're only updating half of it at a time.
Now, this is something where I am hand-waving at Samsung, saying: I don't think this is going to be that hard. You probably still have little crufty bits in your LCD and OLED controllers that were made for driving interlaced NTSC displays. I can't believe this is going to be that hard. So if we can get that, then we can run a 60 Hz display with 120 fields per second. This is not like NTSC, where you may remember the flicker; that was 60 fields per second, and it's not great. But if you double that up and you are running 120 fields per second, I think that can work.
And I think that's a winnable battle. I think that can be done. There are limits to how much we can influence Samsung on the display side, because obviously they care about those hundreds of millions of cell phone displays, and however successful we expect VR to be in the next couple of years, it's not at that level of units.
So we might be able to get them to do a few minor things for it. But what I've learned from dealing with Samsung on the software side, through all the Gear VR work, is that there is an icebreaker moment: they push back and push back until something breaks through, they do something that works out really well based on what I've suggested, and after that the doors are open and we can get all sorts of things.
Beyond what I think will be that icebreaker moment with two-way interlaced displays, there are two steps further that I think can provide really significant benefits.
The next one would be going to a deeper interlace. Instead of going every other line, go every eighth line so that you have eight times as many fields, or every sixteenth line for sixteen times as many fields. That would give us the kilohertz display that we would really like to see, where it would scan down and you would have a kind of venetian blind effect of one image there, and then 1 ms later, at a kilohertz rate, the next one, and the next one, probably going in some kind of interlaced pattern rather than marching straight across.
Then we could have every IMU update updating the entire visual field of view. We can do something like that now with sliced rendering, sliced time warps, where we divide the screen up into, currently, eight bands, and we render each band just in front of the raster. That's how we can get down to sub-4-millisecond motion-to-photons latency. But that has the issue that, while it is motion-to-photons, you're only seeing it in a band of your view, and that band moves around. And we still have questions about some little things, like when your eye does a fast saccade from one side to the other: if it starts or ends in an area where there's actually nothing lit on the display, is that causing some subliminal issues for us? I think there would be a real benefit to having the entire screen always illuminated with some set of bands that's just kind of moving around as it goes.
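A rough sketch of that sliced, raster-chasing time warp loop, with stand-in platform hooks and illustrative timing margins rather than the real SDK calls:

```cpp
// Sketch of "racing the raster" with sliced time warp: the frame is split
// into bands, and each band is warped with the latest head pose just before
// the scanout beam reaches it. Timing values are illustrative assumptions.
struct Pose { float qx, qy, qz, qw; };

// Assumed platform hooks; stand-ins, not a real SDK API.
Pose   sampleLatestHeadPose();
double vsyncTimeSeconds();
void   sleepUntil(double t);
void   warpBandToFrontBuffer(int band, int bandCount, const Pose& pose);

void warpFrame(int bandCount /* currently eight on Gear VR, per the talk */) {
    const double frameTime = 1.0 / 60.0;
    const double bandTime  = frameTime / bandCount;
    const double margin    = 0.002;          // finish ~2 ms ahead of the beam
    const double frameStart = vsyncTimeSeconds();
    for (int band = 0; band < bandCount; ++band) {
        // Wait until just before scanout reaches this band, then sample the
        // freshest IMU-predicted pose and warp only these rows.
        sleepUntil(frameStart + band * bandTime - margin);
        Pose pose = sampleLatestHeadPose();
        warpBandToFrontBuffer(band, bandCount, pose);
    }
}
```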
So that becomes Step 2. A little more technology there, but it probably still doesn't cost them a dime in terms of adding extra hardware; it's just going to be changes in the driver chip and the software that runs it. I don't have insight into their exact processes, but I am guessing that it's not that bad.
The third step from there, which should start delivering some really interesting benefits for us, would be this: instead of having maybe a programmable mode register that says interlace by 1 through 16 lines, make it programmable so that the next line that gets scanned out on the display is actually determined by the first pixel in the scanline. At that point I could determine in software whether it's a linear scan or whether it interlaces in different ways, and I could simulate both of those modes.
One more step beyond that would be adding the ability to say which scanline you turn off. Right now, the rolling low persistence means the display is always scanning in new lines, and then some amount back, programmable by Samsung, it turns those lines off, so there's a bright band that moves through the screen. It's always something like: whatever scanline you're on, minus 200, is turning off.
If we make that programmable too, so we have the ability to say we're scanning in line 50 and turning off line 20, then that opens up the possibility of dynamically changing it every frame based on the content we're putting in there. So in a compute shader, as we do the time warp, we could figure out what we're going to put into each scanline, histogram all of that, find at least the minimum and maximum (or do some fancier stuff), get the dynamic range of what's going into that scanline, and then normalize it to the full 0 to 255 range, so it uses the full available precision of the bytes going out.
And then we determine how long that line will stay visible. So it would be variable persistence based on the dynamic range of the content. This would give us two benefits. It would give us increased precision at the low end, far more precision than the human visual system could even ascertain, because we could have a 0 to 255 range for a line that might be turned off one scanline later, which would be 1/2000th of the frame; crazy low values. So you get incredibly high precision there.
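One plausible reading of that per-scanline scheme, purely conceptual since the programmable turn-off line doesn't exist in current panels:

```cpp
// Sketch of per-scanline normalization with variable persistence: find the
// dynamic range of each warped scanline, rescale it to use the full 0-255
// output range, and choose how long the line stays lit based on how bright
// the source content was. One plausible mapping, not a real display mode.
#include <algorithm>
#include <cstdint>
#include <vector>

struct ScanlineOut {
    std::vector<uint8_t> pixels;   // normalized values sent to the panel
    float persistence;             // fraction of the frame the line stays lit
};

ScanlineOut encodeScanline(const std::vector<float>& linear /* 0..1 values */) {
    float lo = 1.0f, hi = 0.0f;
    for (float v : linear) { lo = std::min(lo, v); hi = std::max(hi, v); }
    float range = std::max(hi - lo, 1.0f / 4096.0f);   // avoid divide by zero

    ScanlineOut out;
    out.pixels.reserve(linear.size());
    for (float v : linear)
        out.pixels.push_back(static_cast<uint8_t>(255.0f * (v - lo) / range + 0.5f));

    // Dark content gets a very short flash (finer effective precision);
    // bright content stretches persistence back up toward the full frame.
    // Perceived brightness is roughly normalized value times persistence.
    out.persistence = std::clamp(hi, 1.0f / 2000.0f, 1.0f);
    return out;
}
```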
But then you could also stretch the persistence all the way back up to the full level. Now, full persistence is a blurry, smeary mess like DK1. But if you're adapted to a dark room inside the head-mounted display, moving around, and then a line comes on at full persistence, it's blinding. It's like you really do go out into the sun. You could dynamically tone map everything down to the right exposure, but you could get that sense of walking out of the dark room into the sunlight, where it's all blinding and then you adapt back down.
High dynamic range displays would be a wonderful thing even for traditional displays. I have argued for a long time that high dynamic range is a chicken-and-egg problem for content right now: we don't have high dynamic range film capture and movie rendering, so there's not much of a call for high dynamic range home theater. But it would actually make more difference for people than a lot of what's gone into higher frame rates and higher and higher resolution. Higher dynamic range is pretty powerful.
Now this wouldn’t be perfect because if I’m looking at this and I got these bright lights here, every strip that has a spotlight there would be normalized up to really high persistence and then everything else would be low persistence. And I am waving my hands about this but I think that’s doable with no additional cost to the hardware and it would give us an element of both solving all of the other prior problems, giving us the interlaced scans that we can do whatever we want and giving us some element of higher dynamic range and higher precision on the low end.
So that’s my plan. I hope Samsung or anybody else listening to this can go and make that hardware happen. I don’t think it’s that hard, and I think you can make a difference for us.
Position Tracking
Now, position tracking really is the harder problem; I think the display stuff can be solved just like that. Oh, and the other great thing about that display solution, because it requires no additional bandwidth, is that we can follow the train to 4K displays without any interruption. Right now, every time they get a little more bandwidth to charge the pixels faster, they want to use it to make the display higher resolution, because those are the numbers that people look at.
And when we talk about wanting 90 Hz displays, to some degree we have to make a choice between resolution and using smaller chunks of screen that scan at a higher rate. But I think the interlaced display may be the possibility that gets us state-of-the-art displays at all times, without having to wait for them in some way, while also solving a lot of these other issues and avoiding the bandwidth problems.
Now, I think we can prototype this with displays where we can reduce the line count and overclock a bit. So what I'm trying to get some people internally to find time to do is take some of our displays and make a 120 Hz display: take something we can clock all the way up there, whether by cutting lines off of it or by actually driving the clocks higher on hand-selected parts. And if we've got that, then I can simulate exactly what an interlaced display will look like by sending it 120 fields per second that actually have black in every other row. And I think if we can put that together, put it in a head-mounted display, and say: here's what 60 Hz regular looks like, here's what 90 Hz or even 120 Hz full refresh looks like, and then here's what the interlaced display looks like. I'm betting that it's going to be at least as good as the 90 Hz. It's obviously always better to have all the pixels all the time and get 120.
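The prototype experiment itself is simple to express; a sketch of the field masking, assuming a tightly packed frame buffer:

```cpp
// Sketch of the prototype described above: on a panel overclocked to 120
// full frames per second, emulate a 120-fields-per-second interlaced display
// by blacking out every other row, alternating which set each field.
#include <cstdint>
#include <cstring>

void maskToField(uint8_t* frame, int width, int height, int bytesPerPixel,
                 int fieldIndex /* 0 = even rows lit, 1 = odd rows lit */) {
    for (int y = 0; y < height; ++y) {
        if ((y & 1) != (fieldIndex & 1)) {
            // This row belongs to the other field: black it out for this scan.
            std::memset(frame + size_t(y) * width * bytesPerPixel, 0,
                        size_t(width) * bytesPerPixel);
        }
    }
}
```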
But if we can have 120, solve all the field division problems and the pull-down judder issues, and get it with no additional bandwidth, it would be a huge win. So there's some effort underway to get that work done, because again, I think that will be really pretty important.
For the position tracking, I do think that's the harder problem. And I've got a scheme for this too, but it's even more hand-wavy; it may or may not work, whereas I am confident that the interlaced stuff will work out.
So for position tracking, as the people that have been doing this know, like the Valve team and [Gavin]'s team here at Oculus, it is a significantly challenging undertaking. It's not like every little demo you've seen of somebody looking around with tracking dots on the world or whatever; doing it with sub-millimeter accuracy and without any jitter is hard.
The phones today really are powerful enough that we could run similar calculations; it would reduce battery life and runtime, and cause thermal problems and all that, but it could in theory be run there if you had the same external camera. But really, what we all want from a mobile VR system is the thing on your head that you can look around in without being tied to anything: no cameras, no external cables, anything like that. You want something done with inside-out tracking.
Now, probably everybody has seen demos of the SLAM work, where people have an inside-out camera and it looks like it's working as they move around. But none of us have yet seen something that really works well in a completely unstructured environment at the refresh rates and precision that we need for this. It may be that somebody's done it, but we keep getting pointed to things that people think will work, and they're just not quite there in terms of everything that we need.
So, like with so many things, when you're presented with a really hard problem, the way to address it is often to figure out how to make it not such a hard problem: basically, how to cheat. And that's not always looked at kindly. We have some of this at Oculus, where the fact that we know how to do it really right, that we have the pure way of doing it, means that arguing for something that's not quite as pure is sometimes a bit of a battle. Should we just wait until we can do the purely right thing?
But I think there is a middle ground on this that can provide many of the benefits. The hard problem to be solved right now is a six-degree-of-freedom problem: finding where you are relative to a specific point, with sub-millimeter precision. It's really hard. You wind up coasting on your IMU integration for everything in between the camera updates, but you have this continuous absolute sense of where you are relative to the camera, and then relative to the world around you, and that's a wonderful thing.
But what we need out of position tracking is really not so much an absolute reference to the world. What we need is for what you see to respond closely enough to what your head is doing, from the headset's point of view, not so much to be registered to the rest of the world.
Now what we’ve been doing with sensor fusion and the basic IMU stuff since DK1 is taking relative sensors, these great gyros — they don’t tell you which way you’re pointing. They tell you how fast you’re turning. And we integrate these and we do all the sensor fusion stuff to basically say, well, you’re not telling us where we’re pointed, which is what we really want. But we keep track of it, all the turns that we think that we’ve made, and we fuse it with some other sensor data. And we wind up with this estimate of where we’re pointing. And we can — we use accelerometers to correct tilt up to this for gravity. We use the magnetometer to correct york relative to this. But there is still approximations and you’re not within a couple degrees of get solid accurate with this. But it still works pretty well.
And I think it’s possible to do a similar thing with the position tracking where — if we don’t care about absolute knowledge saying that I am 57.2 mm from the camera and I’ve translated 2.7 mm since the previous update. And if we give up on that and say all that we really need to know is relative velocity, basically camera space relative velocity, that’s turning a 6-degree of freedom problem into a 3-degree of freedom problem. And it winds up being something that is potentially a whole lot smoother. So this is where I have a little bit of effort ongoing to this where I think that I can make this relative motion sensor and make that work conceivably on the mobile systems within our sort of power and graphics budgets and the things that we need to do there. But that’s a lot of hand waving.
I'm hoping to have some kind of an existence proof on that within the next couple of months, as soon as I can pull away from the crunch of actually shipping the first-generation product and start working on what goes into the generations after that. I have no guarantees that it's going to work out. There are fallback plans with other sensor technologies that can be used, and we have Qualcomm hard at work trying to make their technologies work better for this.
Early on, I did use Qualcomm's Vuforia augmented reality platform to do a basic sort of motion tracking on mobile. That's something that lots of people will probably try; you can go get it, and it's kind of fun to integrate. But it's not close to working well enough, both because of the accuracy and because there is jitter involved. And the basic way it works, the standard Android camera interface, gives you a 30 Hz camera feed with lots of smear and blur, so that was never going to work out well.
But one of the things that Samsung has done for us is given us a path to get 120 Hz camera updates, which does wonders for anything you're trying to do vision-wise. 30 Hz is almost impossible with the amount of smear that it has, but at 120 Hz you're looking at pretty good images that are pretty reasonable for analysis. So if Qualcomm or somebody else comes up with some magic that lets us work with that, there might be intermediate things, like having one marker in front of you. It could even be a fold-out placemat or something that provides an absolute frame of reference, so if you move around and come back to it, it might work out okay.
So those are the two Achilles' heels of mobile right now: the 60 Hz low persistence video can cause flicker, and the lack of position tracking can cause discomfort, depending on how you move around.
But that brings us to the things that are actually really awesome about the mobile experience. Some of these are things that you really don't appreciate until you've experienced them; you might not be thinking about why it makes it so great. But like Brendan mentioned, there's the ability to have the headset and just hand it around to people, to say, here, try this, rather than bringing someone into your VR lair and hooking them up with all the wires and everything. It's a different experience. You can just carry it around with you and show it to people.
There's a power to that: the kind of casualness of it, treating VR like you treat a tablet rather than like an experience that you set out to go undertake. So there's an important aspect to things like that.
One of the obvious ones is the fact that, with no wires on you, you can turn completely around 360 degrees and keep turning. This sounds like a really trivial thing, but consider being able to play a game in a swivel chair; like Brendan says, we never talk about standing, but you can sit there and turn. And you laugh, but this is a bigger deal than it probably sounds like, because one of the core comfort problems we've got with navigation, even with all the stuff Brendan talked about, all the magic that we've got now and everything we know how to do in the future, is that it doesn't solve the yaw navigation problem. If you're sitting there and you use stick yaw, and the whole world spins around you, it doesn't matter if we nail everything else; that is still sickness inducing. It is still one of the hardest things.
Few people can sit and play Half-Life 2 for a couple of hours like that, because it is hard. And yes, we will find some people that can steel their stomachs and push themselves through it. But that is clearly not the comfortable experience we've been talking about, that we want people to get in VR.
But if you want to explore the world, you can do the turning in the real world, whether by literally turning your body or by spinning around in your chair, and all you do in the game is go forward and backward. Yes, every change in velocity is a little lurch, a little punch in the gut, but the continuous yawing is the really bad part. So you turn in the real world, and all the controller does is move you forward and backward. And, as should be drilled into every developer by now from all of our best practices, you don't want to ramp up the speed; you want to jump to the speed, be at that speed, and then jump out of it.
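In code, the difference is just snapping to the target speed instead of easing toward it; a minimal sketch, not an SDK sample:

```cpp
// Per the best-practices point above: snap to the target speed rather than
// ramping, so there is one brief velocity change instead of a sustained
// acceleration, which is what drives the discomfort.
struct MoveState { float speed = 0.0f; };

void updateLocomotion(MoveState& s, bool forwardHeld, float walkSpeed) {
    // Discrete jump between 0 and walkSpeed; no smoothing or easing curve,
    // which would stretch the vection-inducing acceleration over time.
    s.speed = forwardHeld ? walkSpeed : 0.0f;
}
```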
With that, you can explore the virtual world the way people have always wanted to. We've been building all these things as seated experiences, things where you really don't navigate around and you have neat stuff around you, and that's all great; there's a lot we can do there. But after the 50th turret shooter game, I think that genre is going to get kind of tapped out, and we're going to want to do different things.
Even in things like the theater, the magic of VR is seeing this place, believing it, and then wanting to explore it. In our theater models, when the lights come up and you see down there, I want to go down and see what's around the corner. I want to explore the virtual world; I don't want it to just be a shell around me. So I am very excited about the possibility of people building games that involve navigation and take advantage of this, saying: well, okay, we will still have stick yaw, so if you are sitting in your recliner and can't rotate around, then you'll have to do it that way and it's going to be uncomfortable.
But if you can sit in a swivel chair and move around, you can probably play for an extended period of time, explore the world, and do the things people have always really wanted to do in VR. So it sounds like a trivial thing, but I think swivel-chair VR is going to be kind of a big deal for a lot of those experiences that people want to see.
Resolution of the Panel
So Gear VR, as it turned out, is pretty good. I hope everybody gets a chance to take a look at what we're showing at the demo. Some of the things I want to point out in particular: the resolution of the panel is the big thing to crow about on Gear VR, where it is a 2560 x 1440 display on the Note 4. This is significantly higher resolution than DK2. For everybody that complains about resolution, it's still not solved; like Michael said, we're at the very, very low end of the resolution curve, and we need orders of magnitude more to really get to a retina-class display. But doubling the resolution is a damn good thing. It helps a whole lot.
Most of the stuff on mobile, we still render at a fairly conservative eye texture target size. The default, if nobody changes it, is only 1K by 1K, so we render one megapixel per eye, which is less than the number of pixels on the display. They get stretched up and distorted onto it, but that's usually the right trade-off. You can render more, but mobile runs out of juice on the GPU, or runs out of power earlier, so it's a good trade-off.
And in fact, one of the tragic mistakes that happens to a lot of Rift demos and games comes from mis-calibration of the eye relief and blindly using what the SDK returns to you. A lot of the demos we see that have tragic performance are rendering at some ridiculous resolution, 1800 or 2K on an eye, because if your eye is jammed that close, that's what matching the pixel density at the center of the distortion would ask for. And that's never the right thing for anybody. I've argued that we should have conservative defaults on the PC as well, just to prevent that all-too-common mistake from happening.
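A sketch of what such a conservative default could look like; the function and the cap value here are hypothetical, not the actual Oculus SDK API:

```cpp
// Hypothetical helper: take whatever "ideal" eye texture size the distortion
// math suggests and cap it at something sane, so a mis-calibrated eye relief
// can't blow the render target up to 1800-2K per eye.
#include <algorithm>

struct EyeBufferSize { int width; int height; };

EyeBufferSize chooseEyeBuffer(EyeBufferSize suggestedByDistortion) {
    const int kMax = 1536;   // assumed cap; the mobile default mentioned above is 1024
    return { std::min(suggestedByDistortion.width,  kMax),
             std::min(suggestedByDistortion.height, kMax) };
}
```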
So the 1440 display doesn't actually help all that much if you don't change the size of the eye buffer. When we migrated from the 1080p to the 1440 display, the thing people commented on was better pixel fill: even with no new pixels being sent to it, it looked better just because the pixels were smaller and they filled out the area a little bit more.
But we do have some applications that take full advantage of the display resolution. The way this happens is, instead of rendering to the eye buffer and then distorting that eye buffer onto the screen, there is a certain class of things I call overlay effects in the mobile SDK. We can have the normal view rendered at, say, 1K and stretched around, and then we can sample some analytically specified things on top; currently it's either a quad placed anywhere in the world in full perspective, or a cube map surrounding you. We sample those textures directly, which makes them completely independent of the resolution used for the normal eye buffer.
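Conceptually, the compositing works something like the following per-pixel logic, written here as CPU pseudocode with stand-in sampling helpers rather than the real shader:

```cpp
// Conceptual sketch of the overlay idea: the eye buffer goes through the
// usual distortion resampling, but an overlay cube map (or quad) is sampled
// analytically per display pixel, so its sharpness does not depend on the
// eye buffer resolution. All helpers below are assumed stand-ins.
struct Vec2 { float u, v; };
struct Vec3 { float x, y, z; };
struct Color { float r, g, b, a; };

Vec2  distortionLookup(int displayX, int displayY);       // display pixel -> eye buffer UV
Vec3  displayPixelToViewRay(int displayX, int displayY);  // through the lens, time-warped
Color sampleEyeBuffer(Vec2 uv);                           // 1K eye buffer, stretched
Color sampleOverlayCubeMap(Vec3 dir);                     // full-resolution pano content
Color blend(Color dst, Color src);                        // source-over composite

Color shadePixel(int x, int y) {
    Color scene   = sampleEyeBuffer(distortionLookup(x, y));
    Color overlay = sampleOverlayCubeMap(displayPixelToViewRay(x, y));
    return blend(scene, overlay);   // the overlay keeps its own detail
}
```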
So this means that the movie theater screens are rendered at the full resolution of the display, which is still not what you want, because it's 2560 cut in half, so each eye is 1280 x 1440, and you don't usually watch a movie theater screen filling the full 90 degree field of view; it's pulled in. So your movie screen there might still only be 960 by something, which doesn't sound good relative to high-end home theater, but it's still a lot better than what you've seen in VR so far.
But the more exciting application for me was the panoramic stills; I think we call it pano photos or whatever it is in the current setup. We have 200 photos that we licensed from 360 Cities, and one of the real eye-opening things for me this last year was seeing how much really amazingly beautiful panoramic photography there is out there.
Now, everybody has looked at a pano photo on their computer or tablet, and it's just not a good experience. You can swipe around and zoom in, and it's a gimmick that's a little bit interesting to look at, but it doesn't really deliver anything. Looking at them in VR is a completely different thing. I think it's going to totally legitimize that entire field of photography in a way that it hasn't been before.
So for these particular images, we curated 200 images, I resampled them to exactly the right resolution for this, and then they are displayed with this time-warp overlay shader, which means they are drawn at just about the best possible display presentation that you could get from this device.
So the resolution that I wound up using was 1536 x 1536 cube map faces, which comes from me rounding to a multiple of a power of two for probably no good reason; you could make the faces 1500 or 1600 on a side. And that's based on making about the middle 40% or so of the view not have any mip-mapping, and then fading to the mip map as it goes out toward the edges of the distortion.
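Some back-of-the-envelope arithmetic for why a face size in that range lines up with the panel, using assumed numbers (including the center-density factor):

```cpp
// Rough, assumed arithmetic for why a cube face around 1500-1600 pixels is
// the right neighborhood: match the pixel density at the center of the view,
// where the lens distortion packs display pixels most densely.
#include <cstdio>

int main() {
    const float panelPixelsPerEye = 1280.0f;   // 2560 split across two eyes
    const float fovDegrees        = 90.0f;     // rough horizontal field of view
    // Distortion concentrates resolution in the center; assume ~1.2x average.
    const float centerDensity = (panelPixelsPerEye / fovDegrees) * 1.2f;  // px/degree
    // A cube map face spans 90 degrees, so to be ~1:1 in the center:
    std::printf("face size ~= %.0f pixels\n", centerDensity * 90.0f);
}
```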
So in the center of the screen you've got very close to a one-to-one mapping from that image to what you're displaying, and they look really good. A lot of times we've seen people get into that and just look through all the photos, and it is such a dramatically different experience than looking at those same photos on your computer. It's the best demonstration of the resolution of this device that we've got, so I hope everybody can take a little look at that and see what the peak display quality is.
The other thing I am doing there is that it's all gamma-correct on the refiltering; it's set up as sRGB, which still costs you a little bit on mobile and usually isn't worth that much. You kind of have to search for the best cases that show why there is any value there. But in showing static photos, it does have a little bit of extra value, and I wanted to nail this just about as perfectly as possible, as the example of: here is what we can deliver on the display.
What you then want to imagine from that is the panoramic videos, the stereoscopic panoramic videos, which are at much lower resolutions for a number of reasons I'll get into; where we want to be is to have the full quality of the static images, but stereoscopic and in motion. One of the things that really made me think we'd crossed an important threshold: Seneca, one of our fantastic artists who works on all these things, is usually down on all this stuff. He's not one of the magic-of-VR people. He'll be complaining about the blur and the quality and the filtering, and he'd looked at a bunch of the panoramas and thought they were kind of blurry, not that good-looking.
And finally, when he saw the final version of this overlay shader with the pre-resampled images, he said: yeah, that looks pretty good. I think a threshold had been crossed where it's considered acceptable even by somebody that's really picky, and it's probably past any necessary threshold for the general public.
The other thing you need to check out on Gear VR is the panoramic stereoscopic videos. This was one of those things where there was significant debate over whether such things should even be condoned internally, let alone promoted. And the reason is that stereoscopic panoramas, whether stills or videos, are absolutely a hack. We know what right is, and this isn't right.
What they wind up doing is taking slices from multiple cameras, so straight ahead it's a proper stereo pair, and over here it's proper for that direction. But that means that if you're looking out of the corner of your eye at something that was captured correctly for a different direction, it's definitely not right; it's not the right disparity for the eyes.
And even worse, if you tilt your head sideways, it gets all kinds of bad, because it's set up just for eyes level and looking straight ahead. So this was an interesting thing: we know in some ways this can be poisonous, this can be a really bad experience, if people spend a lot of time slouched over or with their head tilted.
But the argument is that people do watch 3D movies in theaters, which have the same problem; if you turn your head sideways in a 3D movie, you're seeing the wrong thing. It's worse in VR, where it consumes your entire field of view, but there's still the argument that people do accept these partial steps. And this is a struggle inside Oculus, because again, we know what right is. We know how to do right, and we are working on doing right. Is there value in taking intermediate steps and accepting something that we know isn't right?
And the final conclusion was: yes, this is one of the things that people react very powerfully to. We started off with some early panoramic videos that we had gotten from different places, and one of my big successes this last year was getting more people to do 60 frames per second panoramic captures rather than 30. A lot of that content was based on GoPro rigs, and there are all sorts of desync issues between the different cameras. It's the wild west right now in panoramic capture, and I think it's going to be exciting to see what happens over the next year, because this is going to be a pretty big thing.
But once you get to 60 frames per second, you can have moving tours through things, you can have helicopter rides, and they look really good. The motion is great, low persistence and everything works well there. But a normal panorama, a normal panoramic photo, has everything at infinity; there's no sense of near-field objects. There are a lot of wonderful, artistic, good-looking things like that, but it doesn't give you that sense of presence directly.
The stereoscopic panoramic videos, under the right conditions (you still have no sense of translation, you'd better not tilt your head sideways, and it's best if you're focusing straight ahead), can deliver that sense that something's right there. I was extremely impressed by the videos that Felix and Paul have done, like Strangers with Patrick Watson. So Oculus and Samsung contracted them to do the out-of-box experience video for Gear VR. It has scenes that were shot in a few places, and they worked with another company to do some CG scenes set up for this stereo panoramic video.
It turns out to be really pretty impressive, and that's what we want everybody to experience the first time they put it on their head. It's got scenes coming from the real world and from synthetic things, and they give you that sense that you really are kind of there. The real world is still more complicated than anything we can synthesize in real time, even on the most high-end systems. So there are going to be large sets of things that, even on these crazy high-end PC systems, we still cannot convincingly simulate. This room, with all these hundreds of people here, is not something we can make a plausible real-world-quality simulation of, even on the highest-end GPU.
So there are all these things that people would like to be recording. One of the high points of this project for me was seeing Felix and Paul have Cirque du Soleil take some of the props from one of their shows and build a little VR-specific skit. They had performers coming out to your side, something going on in front of you, and they're talking to you with positional audio, so you wind up looking over at them. This is showing the very beginnings, the hints, of the artistry that's going to be brought to bear.
So I do think that panoramic video is going to be a big part of Gear VR, and eventually VR in general, because it can do a lot of things that we still can't simulate.
So if we go back: this all started a little over a year ago, when I first decided that I was going to go full time at Oculus. When I was first coming down to talk about this, we had the HD dev kit at the time; DK2 was not really even a gleam in terms of what it was going to be, because we didn't have the deal with Samsung going yet for the displays.
But one of the things that most excited me at that first meeting was learning this: Brendan was going to Samsung all the time saying, we want OLED displays. Everybody knew we wanted OLED displays, initially just because of the super-fast pixel switching, no blurry switching. And then, as everybody grew to understand the importance of low persistence from Valve's work, that became the display technology we wanted.
The way Brendan told the story to me when I was first getting there was: he's going there trying to get buy-in on selling screens to Oculus, and he's not exactly getting everywhere he wants to go with Samsung. But then one of the Samsung people pulled him aside and said, well, we'd also like to talk to you about this. And they pulled out the first of those little 3D-printed phone holders, something they had been working on, and they said they wanted to work with Oculus on it.
So I got to see that first little demo that Samsung had put together. And it was what you'd expect: a phone holder, like Cardboard, or like FOV2GO before that, or like a bunch of these other projects that have been done. And it was a quick spark for me, because I remembered, a couple of years previously when I was first starting all this VR stuff, one of the things that I got was a little Hasbro drop-in iPhone holder. It's a little View-Master-looking thing, you stick your phone up in there, and of course it was terrible.
But when I looked at that, I saw that it doesn't have to be terrible. There are all these straightforward things to fix, and I almost drafted up a pitch to Hasbro at the time about the things you would need to do to fix the experience; it could be really cool. I was doing that sort of thing with all sorts of companies at the time.
So when I saw the Samsung project, I looked at it, and of course their first demo showed why they came to Oculus about it: it was just not a compelling experience in any way, and it had all the problems that I'd seen in these efforts before it. But it doesn't have to be like that, because what we're looking at with the Rift is really just a display that can be driven, whether by a PC or by an Android GPU, and the sensors in the phones are similar to what we've got. It should be able to be much better.
I looked at that and said: this is a challenge where I know what to do, and I think it can be really important. That was really one of the large motivators for me wanting to go all-in with Oculus. All the stuff at the high end is interesting, but a lot of the action happens at the lower end of things.
Historically, through all sorts of technologies, it isn't the bespoke super-high-end thing that winds up changing the world; it's the thing that everybody carries around with them, that everybody winds up having. And I'm dead serious about wanting to see a billion people with VR, and I think a lot of that's going to come from raiding the low-cost parts bins, from cell phone technology. It's great to have a range; it actually saddens me that there's not a greater range in computing now.
People talk about how the greatest phone that you can buy is one that most people here probably already have, irrespective of money; you can't spend ten times as much and get a significantly better device. It's great that we've got high-end, triple-GeForce, crazy platforms that we can run these incredible high-end experiences on. I wish there was even more that you could get; you can't go out and buy a Silicon Graphics InfiniteReality and spend a quarter million dollars like you could back in the 90s. The best stuff you can buy is gaming rigs. And that's great.
But at the low end, it's still possible to do a lot of these things, because the mobile phones are really quite powerful now. The Note 4 is at least arguably the most powerful cell phone; I'm sure I could write a benchmark where something else wins, but they are all incredibly powerful, and they are much more powerful than video game systems from not that long ago. Yes, you'll keep hearing the marketing hype about when they're as good as the current generation or previous generation of consoles, and that's usually overblown for some specific technical reasons that I'll get into a bit later. But in any case, they can run some pretty good experiences, and it is an insult to the games of fifteen years ago to say that you can't do anything great unless you've got all the power of a modern PC. It's just not true.
So I was excited about this project, and we undertook to go work on it. But there was, at the beginning, this sense that, okay, we are hedging our bets a little bit on this experience: we can't do really great games, so maybe it needs to be a more media-focused experience that we target.
The other logic behind that was: any time you're launching a new platform, you need to think about what people are going to be able to do after they play your out-of-the-box demo and whatever you include with it. What do they do on day 2 and day 3? You need to have enough bulk in the catalog.
So there was a lot of focus early on about movie viewing and other ways we could take advantage of existing data sets. We had a music visualizer, but we've got a couple of other partners now that are working on music visualizers, so we kind of dropped that work. We have a great arcade emulator that we struggled through a whole bunch of licensing stuff on, but I think it's all worked out now, so work is ongoing on that. Ways to use existing things while the VR ecosystem develops the from-scratch games.
So there was, from the beginning, this sense that it was going to be a little more media-focused than what we've done on the PC, and that drove us into a lot of research on video playback and different things like that. But the core thing was taking this not-very-great initial experience, where really Google Cardboard is better than the first cut of the stuff from Samsung was, and making it good. And by the way, I think Google Cardboard is a wonderful thing. I think it's been great for Oculus and for VR that Google did that, and I think it was a very wise positioning choice to actually make it out of cardboard, so people will not judge it too harshly.
We fret about this all the time at Oculus: when can we stop calling things developer kits or innovator editions, because people will judge them as consumer products? It's a common topic. I thought Cardboard was a brilliant way to sidestep all of that and not have people judge Google too harshly for what it was. But I think it's a great thing, and it shows what you can do just dropping a cell phone in there, using the cell phone sensors and going through the standard Android systems.
So the work was identifying the key reasons why that is not as good as what you see on DK2, or even DK1 for that matter. That's what I really spent the last year on: identifying what those problems are, then figuring out how to fix them. It was me alone in my carriage house, doing all of this work in my home office, and then finally, six months later, we pulled more people onto the project and started kind of professionalizing the effort.
One thing Oculus had done before I got there: they made a little printed housing and glued a DK1 sensor onto it, going through the USB subsystem. The win there is the calibration, because internally some cell phones use essentially the exact same sensor, it's just really not calibrated. I don't know exactly whether Samsung's phones had those particular parts, but generally in Android you're limited to a 100 Hz update rate, which adds jitter of up to 10 ms between an impulse happening and you getting the sample. The better calibration keeps it from drifting off too much, but that still wasn't really the killer issue.
The biggest thing came once I started looking at the problem and poking at Android a little bit. This was all brand new to me; I had taken a brief look at Android maybe three or four years ago, when I was doing iOS development, and thought, oh, that's going to be a mess with all the fragmentation, and I backed away from it. So this was my first real work with Android.
And I've got a lot of negative things to say about some of the Android development experience, but some of the problems are fundamental: having a market that is as large and fragmented as theirs, and having to support all of it, makes things hard. This project was kind of nice because we had a tight relationship with Samsung; it might as well be a console in terms of targeting one specific device. So that's a really good thing.
But after writing my first rotating triangle on Android and starting to make timings of it, it became clear that the biggest latency problem was that Android by default triple-buffers all the graphics. So whatever you render winds up being pushed out multiple frames behind, even if you're rendering something really simple. That was a huge problem.
We've got this 20 ms bogey that we want to be under; we want to be under 20 ms motion-to-photons. And here you are, where it can take up to 48 ms before your pixel even gets to the display scanout, with more time on either side of that.
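The frame-count arithmetic behind those numbers, as a quick sketch:

```cpp
// Rough latency budget at 60 Hz: why triple buffering blows the 20 ms
// motion-to-photons target and front-buffer rendering gets back under it.
// This is just the simple frame-count arithmetic, not measured values.
#include <cstdio>

int main() {
    const double frameMs = 1000.0 / 60.0;   // ~16.7 ms per frame
    std::printf("triple buffered : up to ~%.0f ms queued before scanout\n", 3 * frameMs);
    std::printf("double buffered : up to ~%.0f ms\n", 2 * frameMs);
    std::printf("front buffer    : a few ms, limited by how close you race scanout\n");
}
```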
So Samsung's suggestion was: well, we can probably hack things to turn that into double buffering, and that saves one frame. But that's still not good enough.
What I was pushing for was: give me the front buffer. Don't swap at all; just tell me this is the buffer that's being scanned out. One of the reasons this is workable is the hardware composer layer, which allows you to have an image that might be changing in memory composited with other things, so rendering to the front buffer doesn't completely break the experience and make all of Android unusable.
So I was arguing: I want this front buffer support, and I will do the timing myself. All of these cell phone displays are portrait-mode panels and we turn them sideways for VR, so unfortunately Gear VR scans left to right and DK2 scans right to left, which I'm sure will get a knowing reaction from people who were racing the beam to change their color palettes back in the 80s; this is an old mode of game programming coming back. I kept making the case: this is why the problem exists, this is why this solution will solve the problem, this is why your proposed solution will not solve it as well, can we please do this? And it was a brick wall, over and over.
And when we finally got Max on board, he heard me going on about this: why can't we get this? He knew how some of these things worked, and I tried a few really, really ugly hacks that allowed me to demonstrate it. Then I had a device in hand where I could say: here, look at this. Nothing helps more, when you're trying to convince somebody to do something they really don't want to do, than a working demonstration. The ice was broken, and Samsung made a nice interface for it that worked across all the other phones.
And after that, Samsung got me crazy things, things I would never have thought I'd have gotten out of Samsung before that. There's an interesting aspect to this, though: that hack that I did didn't work on any of the other phones; it only worked on that one phone. So I've got this little creeping fear: what would have happened if I hadn't been using that phone, if I had never been able to make that ugly hack that convinced them? I think Gear VR would have been a very different product right now, because all of these cool things that I got might not have happened at all.
So I think I've got that same situation now with the Samsung hardware side, which is a completely separate division. The software side now believes in my judgment on these things and will go and do useful things; the hardware team doesn't yet. So I'm hoping the icebreaker there is going to be that interlaced display: if I can get them to give me two-way interlacing, it's going to win big.
And the other argument that I make to them is: okay, for all the people who don't care anything about VR, the other thing you can do with that is this: all of your cameras are capturing now at 120 or 240 frames per second, and right now you just use that for slow motion. But if you make an interlaced low persistence display, you can play back 120 Hz video for hyper-real playback that you can't get on any other platform. That's one of my other wedges that I'm trying to get in there. And I think it's going to be the same thing.
I get the one breakthrough, and then I can convince them to do the other crazy stuff with variable persistence and dynamic scanning and all that. So that's been my experience working with Samsung through this: there was a feeling-out period with a lot of skepticism and reticence to change some of these things, because they have to maintain this across all their phones, regardless of whether they are VR-related, and there are reasons why they don't want to go do crazy different things in there.
But once the ice was broken, everybody believed in the product. Both inside Oculus and inside Samsung there was a period where there were skeptics who didn't think this was ever going to work out well, people who really weren't behind the project. But as it all came together and it became clear that, yeah, it actually is going to work out, it actually is going to be good, then it's a positive cycle: more people get buy-in, you get more stuff done, and that makes it even better. So in the end I think it turned out to be a really pretty good thing.
So the next major challenge, and the next big part of my education process with Android, was the CPU clocks and power management. So I started off, I got the front buffer stuff, I've got low latency here, and now I'm trying to make some applications for this.
So I start off working with basically Tuscany, the Oculus World demo, and I'm going through, making it faster, doing all the normal stuff I've been doing for decades: here's how you make graphics faster, you set up the batches, you operate more efficiently, all of that. And I was making some panoramic video players and all sorts of different stuff.
But what I was seeing when I'm syncing this stuff to the front buffer myself is that it would still be kind of glitchy, it would still have these regular glitches, and it just wouldn't be as stable as I wanted. And it was so frustrating, because I would go and optimize the hell out of something, I'd make it a lot faster, I'd cut out triangles, cull things; the CPU and GPU and everything should get faster, and it still winds up hitching occasionally. And I'd be looking at some of these videos that were playing back at 57 frames per second, thinking, what's the problem with this?
And then I found by accident one time that if I was resting my finger on the display screen, it played back perfectly smoothly. Like, what the hell? And there was even a point where we had a little discussion about how we could make a good demo out of that. Palmer mentioned that if you put a little lithium-ion battery cell on a capacitive touchscreen, it registers as a touch. So, like, maybe we will make one of our little drop-in headsets with a battery cell right there so it always registers a touch.
But finally I got someone at Google to educate me a little bit about what's going on here. And this is all the dynamic frequency and power management stuff that goes on with Android. Power is a big, big deal in mobile; it determines your battery life, and that's one of the big things that people make phone buying decisions based on.
So they go to great lengths to try to minimize the amount of power that they burn, and one of the tools that they have, I mean, the biggest tool in saving power is always to do less work, to write more efficient code. But there are millions and millions of lines of code, and a lot of it's Java, going through all the Android systems, so it's hard to just say, optimize all of this stuff.
But one of the things that they can do is monitor how much CPU you're using, and they also do this on the GPU. And if it looks like you're not using most of it, they can turn the clock down. One of the aspects of the way all these systems-on-chips work is that they've got a maximum clock rate, and the lower you can clock them for the same amount of work, the more power efficient they will be. If you've got something which takes 1 billion operations a second and you clock it at 1 GHz, it will use about 25% less power than if it was clocked at 2 GHz and slept half the time.
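As a rough back-of-the-envelope for that figure, under the standard CMOS dynamic-power model (the specific 15% voltage difference below is an illustrative assumption, not a measured number):

```latex
P_{\text{dyn}} \approx C\,V^{2}f
\qquad\Rightarrow\qquad
E = P_{\text{dyn}}\,t \approx C\,V^{2}\,(f\,t)
```

For a fixed amount of work the cycle count is constant, so energy scales with the square of the supply voltage. If the 2 GHz operating point needs roughly 15% more voltage than the 1 GHz point, then

```latex
\frac{E_{1\,\text{GHz}}}{E_{2\,\text{GHz}}}
\approx \left(\frac{V_{1}}{1.15\,V_{1}}\right)^{2} \approx 0.76
```

which lands in the "about 25% less power" ballpark for running slower continuously rather than racing at 2 GHz and sleeping half the time.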
I mean, in an ideal world you have your maximum clock rate, you sleep when you're not busy, and it all works out. But the reality is that you can save power by clocking down, because going fast costs power; that's why we've got the big fans and heatsinks on the high-end PC rigs.
So what was happening is that I'm timing everything now for drawing to the front buffer, which means I've worked out what the right timing is and I sleep until a particular point right before wherever it is on the screen that I want to draw. So I would be doing some work and then I would sleep for a while. And the system is watching that activity and saying, well, you're sleeping X percent of the time, that means I can turn the clock down a little bit.
But what that means is, as soon as you miss a little bit: I'm going 60 frames per second, they are turning the clock down, something hiccups, and now I drop to 30 frames per second, which means I'm sleeping a whole lot more. So then it says, oh, you're using even less of the CPU, so I can turn the clock down even more, and it winds up spiraling into slower and slower processing. And that was a big problem, because how do you optimize for this, when the better you optimize, the more it hamstrings you?
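The render loop pattern being described looks roughly like the sketch below. This is a minimal illustration, not Gear VR code; wait_for_vsync_ns() and render_half_screen() are hypothetical stand-ins for whatever the platform actually provides.

```c
#include <stdint.h>
#include <time.h>

#define FRAME_NS      16666667LL          /* ~60 Hz refresh period        */
#define HALF_FRAME_NS (FRAME_NS / 2)      /* ~8 ms per half of the screen */

/* Hypothetical platform hooks. */
extern int64_t wait_for_vsync_ns(void);   /* blocks until vsync, returns its time */
extern void    render_half_screen(int half);

static void sleep_until_ns(int64_t when_ns) {
    struct timespec ts;
    ts.tv_sec  = (time_t)(when_ns / 1000000000LL);
    ts.tv_nsec = (long)(when_ns % 1000000000LL);
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
}

void front_buffer_loop(void) {
    for (;;) {
        int64_t vsync = wait_for_vsync_ns();
        /* The raster spends the first ~8 ms scanning the first half of the
           panel, so that window is used to finish drawing the second half. */
        render_half_screen(1);
        sleep_until_ns(vsync + HALF_FRAME_NS);
        /* The raster is now in the second half; draw the first half so it
           is fresh when scan-out wraps back to it at the next vsync. */
        render_half_screen(0);
        /* All of this sleeping looks like idle time to the frequency
           governor, which is exactly what caused the clocks to ramp down. */
    }
}
```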
So one of the other really big concessions from Samsung was the ability to let us lock clock rates on Gear VR. Now there were subtleties to this. The first cut that we had on it was basically setting a minimum clock rate, so you could say, well, I want a minimum of 1.2 GHz, but they would still dynamically watch this and ramp up.
Now, when we got that out to some developers and we'd say, okay, you want to try to make yourself work within the lower clock rates, it was hard to tell whether you were really fitting there. Because if you had a Unity app that's using pretty much all of two cores to run all this stuff, and you say, well, I'm going to try running at 1 GHz, it starts out at 1 GHz, it hitches for a little bit, and then the system has brought the clocks all the way back up to 2 GHz or something, and you're burning all that power.
What we finally settled on in the latest builds, and most of the developers that have been working on this don't even have it yet, is a real locked clock rate, where you set it and it's just like having an old computer at a fixed, specific clock rate.
But the other aspect of this that was not obvious at the time was the importance of thermal management. On PCs, again, you've got fans and heatsinks and you can burn hundreds of watts doing all of this. On mobile you're burning just a couple of watts, but a lot of that energy winds up being waste heat that stays in the handset.
Now there are a couple aspects to this. If you get things hot enough, and you've probably felt a phone that's kind of warm to the touch, there are all the policy questions about how hot is too hot. You're not actually going to burn yourself, but they can get pretty warm. And there are internal battles at Samsung about exactly what is acceptable there: the health and safety people will say some small number, the VR people will say we need some much higher number, because if you pick that small number you're just going to clock down almost immediately.
Because what happens is, when you use one of these processors and you actually load it, you draw a lot of graphics, you use a lot of CPU, it's soaking up these watts and it's heating up relatively quickly. And it will eventually hit a thermal limiter where it starts reducing the clocks.
So if you've built your game and optimized it for a specific clock rate and the system starts clocking down, you can expect that the VR experience is going to get terrible, more so than in a traditional game, where dropping a frame is not that big of a deal. In VR it means we wind up with judder and all the problems that come with it, and it's a real big deal.
So what we have now is the ability to choose a balance. You get to pick your balance between CPU and GPU power. This is actually pretty nice: you have a 4 x 4 matrix of values where you can say, I need a lot of CPU but not necessarily much GPU. And there's about a factor of two difference across the range; it runs from a little under 1 GHz to 1.8 GHz on the CPU, and I think we go from 240 to 600 MHz on the GPU. And you're not even allowed to choose max on both.
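A hedged sketch of what picking that balance looks like from the application side. The level tables, the intermediate frequencies, and vr_set_clock_levels() are illustrative placeholders, not the actual Oculus mobile SDK interface; only the endpoints (roughly 1 to 1.8 GHz CPU, 240 to 600 MHz GPU) come from the talk.

```c
#include <stdio.h>

/* Four fixed levels per unit; the intermediate values are assumed. */
static const int cpu_mhz[4] = { 900, 1200, 1500, 1800 };
static const int gpu_mhz[4] = { 240,  360,  480,  600 };

typedef struct { int cpu_level; int gpu_level; } ClockLevels;

/* Placeholder for the real SDK call that locks the clocks. */
static int vr_set_clock_levels(ClockLevels lv) {
    if (lv.cpu_level == 3 && lv.gpu_level == 3)
        return -1;   /* max on both is disallowed: it would overheat quickly */
    printf("locking CPU at %d MHz, GPU at %d MHz\n",
           cpu_mhz[lv.cpu_level], gpu_mhz[lv.gpu_level]);
    return 0;
}

int main(void) {
    /* A CPU-heavy, GPU-light app: lots of simulation, simple scene. */
    ClockLevels lv = { .cpu_level = 3, .gpu_level = 1 };
    return vr_set_clock_levels(lv) == 0 ? 0 : 1;
}
```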
Now, the device can run that way, it just won't run that way for long. If you look at the specs on these devices, it's like a 2.5 GHz four-core processor with a 600 MHz GPU. Clock for clock they are not as good as an Intel processor, and some of the other ARM cores are significantly worse than that, but still, it's pretty powerful.
But if you spin up a four-thread program with the GPU running flat out, and you turn on the camera and Wi-Fi and decode a video in the background, you're overheating in like 20 seconds or something. It can suck all that power, and it overheats and shuts things down. So it's a real issue, and we're still not exactly sure what our messaging is going to be on this. Some applications, Cinema obviously, have to be able to run steady-state for two hours. You need to be able to watch a full-length movie in it, which means it uses hardly any CPU; it's on the minimum CPU setting. It's not calculating anything, it's decoding a video that's going up there.
But it's got some reasonably heavyweight graphics going on. It's using the overlay plane technology to get the super resolution on the video, and it's doing dynamic lighting on all the world around it. And that becomes a little dicey; we find that, okay, we can just barely fit within the 380 MHz GPU level there.
But we do now have a floating screen option, where you can just have the screen floating in black, which saves a ton of power but also allows you to do things like watch a video on the ceiling, or lie down sideways. It's worth noting that some people say, can we just rotate the theater or something, and that was one of those clear misunderstandings about the whole VR experience: if the up vector in your virtual world is not also your physical up vector, that's breaking what we want to accomplish in VR.
And so this balance is going to be an important thing for developers to be finding, and in some cases it may mean making concessions on your graphics. But there's probably going to be a space for games that are really pushing it, that will hit the limit after 10 minutes or something. We don't know the messaging on this exactly yet; Samsung really doesn't want us to say things like thermal throttle or thermal overload, and I want a little cell-phone-on-fire icon or something. But we're not sure exactly how we're going to message it. One of the things we can do is give you the option of continuing in a degraded fashion, or exiting VR and going on to something else.
So something that we have the option of doing here on mobile that we don't on PC yet, and I'm pushing hard on this because it's probably the most important result coming out of the mobile environment, is the asynchronous time warp notion. It was one of the very first things that I tried when I started on this project. The idea is that if we could decouple the refreshing of the screen, which we know has to be at 60 or 90 or 120 or one of these very high numbers, from the rendering of the eye buffers that go into it, that would be a huge benefit to developers. It would allow you to, in theory, have a 30 Hz game that gets updated to the screen 60 times a second.
So we know the basic time warp principle: you can take an image and warp it to another place in space, and for attitude-only adjustment that works wonderfully. There were lots of arguments internally, and again, you make your case by showing an existence proof. There were a lot of arguments internally at Oculus about what the downsides of this were going to be, but I just went ahead and did it all on mobile, and we can look at it and say, yes, this turned out to be a good thing. And so we have interpolated time warp on the PC now. But the important point here is working towards asynchronous time warp.
So I tried it first on mobile the obvious way: you spin up a second thread, you pass off buffers between them. And it didn't work, for the equally obvious and expected reason that GPUs are all about throughput right now. They really are not optimized for latency; they are optimized for soaking up an immense amount of work and then processing it at some point in the future.
So the batching that happens there is not very well controlled. To have this asynchronous stuff work, I've got this window: half of a screen, while it's going out of the front buffer, is 8 ms. So I need to know that I can draw that half of the screen within 8 ms of the time that I start. And there are a bunch of things that go into eating up that cushion. There's the scheduling on the CPU, where if the CPU wakes up 10 milliseconds late, it doesn't matter if your GPU is infinitely fast.
But it also means that when I put in that little bit of drawing, you know, draw these 1,000 triangles for a slice of it or whatever, that needs to happen before the window runs out. And the way most GPUs work now, they love to have 20 or 30 ms of work buffered up. That's the way they get their best benchmark scores and the way things look really good; if you want to render out to 3 or 4K displays or whatever, you want the bandwidth optimized for those situations.
But in VR we have these very, very tight latency requirements. So what needed to happen was a way to prioritize the work and say, this is much more important than that. GPU, when this comes along, you need to stop doing that, do this instead, and then come back to it.
Now, the first phone that I had this working on was the Imagination Technologies phone, and they actually have an extension for this, IMG context priority, and I'm thinking, well, this is exactly what I want. The minor fault was that in their particular implementation you couldn't have multiple contexts in the same process with different priorities. So I would have had to do something really gross, like spinning up another Android process and somehow passing the images as external images between the processes. It would have been basically like Android's compositor, SurfaceFlinger, and I did not have nearly enough confidence in my Android skills at the time to want to spin that up.
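For reference, that extension is exposed through EGL context-creation attributes. A minimal sketch, assuming the display and config have already been set up the usual way and with error handling omitted:

```c
#include <EGL/egl.h>
#include <EGL/eglext.h>

/* Create the warp thread's GL context at high priority; only honored if
   the driver actually exports EGL_IMG_context_priority. */
EGLContext create_priority_context(EGLDisplay dpy, EGLConfig config,
                                   EGLContext share) {
    const EGLint attribs[] = {
        EGL_CONTEXT_CLIENT_VERSION, 3,
        EGL_CONTEXT_PRIORITY_LEVEL_IMG, EGL_CONTEXT_PRIORITY_HIGH_IMG,
        EGL_NONE
    };
    return eglCreateContext(dpy, config, share, attribs);
}
```

The warp context gets the high priority while the application's rendering context stays at the default, which is exactly the two-priorities-in-one-process arrangement that the Imagination implementation at the time didn't allow.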
But as we were going through this whole development process, by this time we had some good-looking stuff. We had the locked clocks. We had SCHED_FIFO real-time scheduling, which was another huge concession from Samsung to get in there. We had our kernel driver for the Oculus sensor, rather than winding its way through the standard USB input stack. And these were giving us really low latency; we were well under 20 ms motion-to-photons, and it was looking great.
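The real-time scheduling piece is plain POSIX; a minimal sketch of putting a warp thread into the SCHED_FIFO class (the priority value is an arbitrary example, and an ordinary Android app can't do this on its own, which is why it took platform support):

```c
#include <pthread.h>
#include <sched.h>

int make_thread_realtime(pthread_t thread) {
    struct sched_param param;
    param.sched_priority = 50;   /* example value within the FIFO range */
    /* Fails with EPERM for unprivileged apps; needs platform support. */
    return pthread_setschedparam(thread, SCHED_FIFO, &param);
}
```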
But the problem was, everybody was looking at this and saying, okay, John, your three applications run 60 frames per second reliably like this. What about all the rest of our developers? What about the 90% of developers that are Unity developers? And Unity was just not working that well under these conditions, because when you're rendering to the front buffer on a low persistence display, any little glitch, a garbage collection, anything that happens to load at slightly the wrong time, shows up. And in a multimillion-line codebase like that, there's just a lot that can go wrong.
And unfortunately, here's the way the SDK rollout on this stuff had gone with developers. I had reached the point where I had the few native applications that me and a couple of the other guys had been working on, and we made those available as source code. I don't even want to call it an SDK release, but the first batch of developers that got some code from me got sort of our sample code for these things. They could do a few things, but the Unity integration wasn't ready yet.
And we really kind of mismanaged that first rollout to developers; it was still a matter of the whole company not really having gotten on board with all this. So we wound up with a dozen or so of our most hard-core, Oculus Connect sorts of people doing the development for it, and they were largely Unity-based indies.
And so that first rev kind of went out, and a lot of them were like, well, we're not doing native code. So it kind of just sat there for a while until I got the Unity integration working well enough. And again, the latency worked out fine there; it was the variability that was the problem. We could make some Unity-based experiences that were almost consistently 60 frames per second, but it was hard to make them absolutely reliable.
And there was again real doubt about, well, this may not be a viable platform for gaming. We can do our videos. We can do our hand-coded native things. But it’s probably not a Unity gaming platform.
So here's the thing that allowed that to actually come to fruition and become a decent place to develop: most of the games that we are showing here are Unity games, there's only a handful of them that are native experiences, and this really does owe itself largely to this asynchronous time warp ability.
And Qualcomm was the company that really stepped up and implemented this. We had talked about it early on; as I said, there are really two things that I wanted from the GPU vendors. I wanted this context priority so I could do asynchronous time warp, and then I wanted a way to replicate effort between the two eyes, to do stereoscopy with one pass through the driver. I'm already past my time, but maybe I will get to that later.
So Qualcomm basically went in and implemented the Imagination Technologies extension, but without the limitation of needing to be in multiple processes. That allowed me to run the time warp thread on its own GL context, synced to this 120-times-a-second left eye, right eye cadence, going back and forth, running at the highest scheduling priority of any of the threads there. So it gets in within about one millisecond of scheduling, it batches up the graphics for that, and at some level in their driver it interrupts something else and gets in there.
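Put together, the warp thread looks roughly like the skeleton below. This is a simplified sketch, not the real implementation; the pose, buffer, and timing functions are hypothetical placeholders.

```c
#include <stdbool.h>

typedef struct { float x, y, z, w; } quatf_t;   /* head orientation       */
typedef struct eye_buffer eye_buffer_t;         /* opaque app eye buffer  */

/* Hypothetical platform hooks. */
extern void          wait_until_half_frame(int eye);  /* sleep to the warp window */
extern quatf_t       latest_head_pose(void);
extern eye_buffer_t *latest_completed_eye_buffer(int eye);
extern void          warp_to_front_buffer(eye_buffer_t *src, quatf_t pose, int eye);

extern volatile bool vr_mode_active;

/* Runs on its own high-priority GL context at real-time scheduling priority,
   waking 120 times a second at a 60 Hz refresh: left eye, right eye. */
void timewarp_thread(void) {
    int eye = 0;                              /* 0 = left, 1 = right */
    while (vr_mode_active) {
        wait_until_half_frame(eye);           /* just before the raster gets there */
        quatf_t pose = latest_head_pose();    /* sampled as late as possible */
        /* Warp the most recent eye buffer the game has finished, even if it
           is a frame or more old; that is what makes a dropped app frame
           tolerable instead of producing full-on judder. */
        eye_buffer_t *src = latest_completed_eye_buffer(eye);
        warp_to_front_buffer(src, pose, eye); /* small, high-priority draw */
        eye ^= 1;
    }
}
```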
Now, in the ideal world that would be perfect and we would never, ever see a tear or judder. But there are a couple things that make this not quite perfect. There are a few things we know about and a few things we still don't completely understand.
But the basic thing is, it does work; it is a win. People hear about this and they think, oh, it won't matter what frame rate I render at. And that's not true. The lower your frame rate, the more things near you will feel jerky. But it's the jerky feel of a computer game that needs a GPU upgrade. It's not the unbearable hell of judder on DK2 with the low persistence display, where everything is just killing you with it.
And the truth is, a lot of things can drop a frame and you won't even notice it happened. So I try to tell developers: this is frame rate insurance. This is not magic where you don't have to worry about performance; work hard, try to get to 60 frames per second, try to do your best to stay there. But if you slip up a little bit, or if something else in the system glitches a little bit, then we can handle that and make a new frame kind of magically appear. And it does still work if you say, this is a 30 Hz experience and we're going to double every other frame. It's definitely not as good; for certain things it can be okay, and that's one of the things we're still debating. When you run into the thermal limits, do we offer the option of just knocking the experience down to 30 frames per second, which should run steady-state at the lowest clock settings? We need more test cases to see whether that works out well or poorly for the different things. But still, it was a huge, huge win to get that asynchronous time warp working, and that's, I think, one of the big achievements of the last year.
So we're trying to get that onto the PC, because as we start looking at 90 Hz displays, and 75 Hz for DK2, these are hard to hit even on the PC with all that horsepower. And I've said it many times before and people don't like to hear it, but the PC costs you about a factor of two in the sort of overhead that you wind up incurring when you're dealing with latency. Not with bandwidth; raw throughput, if you queue it all up, you've got scads of that. But if you're trying to do these very tight timing things, like rendering to the front buffer, never dropping a frame, switching between them, there are challenges on the PC.
So we are deeply engaged with Nvidia, and there is a whole matrix that we're going to have to work through: Nvidia, AMD, Intel, and then PC, Mac, Linux. That's a lot of work to get some of this really twitchy stuff working. The work is ongoing right now, and there are varying levels of optimism and despair among different people in the company about how well this is actually going to work out on the PC.
But I keep thinking it's the same thing: we need the existence proof. What we're probably going to wind up doing, if the magic doesn't happen in the coming days, is something that I actually tried on the cell phone first, which is doing the asynchronous time warp on a CPU core. You burn an entire CPU core, which is just doing texture mapping like a horribly inefficient GPU, but it's scheduled at the CPU level, where we don't have this massive backlog of buffers. And that actually worked on the cell phone, but it was one of those things where in a minute it overheated the phone; it was just totally, totally overdriving everything.
But on a PC we could actually do that. There are plenty of quad-core systems that really do have a core idle, and you can't then do the really fancy stuff like multiple composited overlay planes with deghosting going on, but you can do a chromatic aberration corrected time warp that renders right ahead of the raster.
And once you start doing that, instead of just left eye, right eye, what we can do, and we can turn this on even on mobile, is go to sliced rendering, where you cut each eye up into four slices. And then you say, well, I'm going to render just the slice ahead of the raster instead of the whole eye. So I've had Gear VR running actual applications with 4 ms motion-to-photons time, where you've got a two millisecond window plus USB overhead and some scan-out latency. But that doesn't work quite as well, because the difference between theory and reality here is that having the high priority context extension doesn't mean the work happens right when we submit it. There are still all these layers of buffering that have to happen. Some of them are kind of Qualcomm specific, and we have painful choices, like how we can turn one of those down but it doubles the CPU overhead in command processing. And that's where we're at right now: if you spend a lot of time in the applications that we've got for demos, you'll see some of them occasionally have a little glitch, a judder or a tear line on the screen.
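The sliced mode is the same racing-the-raster idea at a finer granularity. A rough sketch of the timing arithmetic, with eight warp windows per frame (four slices per eye, two eyes), each about 2 ms at 60 Hz:

```c
#include <stdint.h>

#define FRAME_NS   16666667LL   /* ~60 Hz refresh period    */
#define NUM_SLICES 8            /* 4 slices per eye, 2 eyes */

/* Given the current time and the last vsync time, pick the slice just
   ahead of where the raster currently is; that is the one to warp next. */
static int slice_ahead_of_raster(int64_t now_ns, int64_t vsync_ns) {
    int64_t into_frame   = (now_ns - vsync_ns) % FRAME_NS;
    int     raster_slice = (int)(into_frame * NUM_SLICES / FRAME_NS);
    return (raster_slice + 1) % NUM_SLICES;
}
```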
And in fact, there are two different types of judder that you can notice if you're watching carefully. If you see a tear line mid-eye, that's because we somehow overshot our graphics cushion, the schedule, which is a combination of both the CPU cushion and the GPU cushion. And the problem with that is, unlike Imagination or Mali technology, where there are really tiny tiles like 16 x 16 or something, the Qualcomm Adreno has fairly large tiles. It's a pretty good-sized block of memory on chip, and a tile might be a ninth or a twelfth of the eye buffer or something. So it can be a decent sized chunk, and they have to have at least a couple of these in flight to have decent efficiency. And then there's the question of how often they pass them off from the client application to their driver.
So there are all these trade-off questions. You can wind up with some things, like some Unity applications that have pretty heavyweight graphics rendering, and if they really throw a lot at it and it all happens to wind up in one tile, we can end up some number of tiles behind. And if that turns out to be more than double the amount of time we'd like to have in there, we can still wind up blowing past that particular margin, and that happens on some apps right now.
The other one is an interesting thing that we only got to the bottom of recently. And this is something that people that aren't really involved with embedded display technology are probably not even aware of. One of the new tricks that these ever-persistent power-saving people on cell phones are doing is that the displays now are often smart, meaning that they are not just blindly streaming pixels out of memory and putting them into the charge cells or whatever on the display; they often now have a complete embedded frame buffer on the display.
So the idea is that a lot of the time you're looking at your cell phone, there's really no animation going on. And if they know from their compositor that, hey, we changed something, they push one frame down and then don't touch it again until seconds later, because going to main memory is pretty expensive there. So that's an issue.
But what that means is that even though I'm rendering to this front buffer, the display might not be told every frame that, hey, you still need to fetch this again. There's one aspect of the Android system software, which normally is managing the compositor and doing all this complicated stuff, that for the VR apps needs to wake up at least once per frame and say, hey, you need to get the next frame.
And if it ever doesn't get that, then we miss an entire frame and you see kind of like a DK2 missed-swap judder on things. And this is really unfortunate, because we have our applications at these high real-time scheduling priorities, but Samsung is really scared of that; it's probably not a good idea to put some core part of Android at a real-time scheduling priority, and you worry about priority inversion problems and different things like that. So we really want them to switch the display back into a video mode, but they're all like, we're locked for shipping, this is a scary change, we don't want to do this right now. So it may be something that we live with for a while. But we all understand the issue now, and I am confident that in future versions that problem will go away.
But there are still a few random unexplained things that we can see, some things I think dealing with audio, that can cause us to glitch up. And this is something where we wish we had better visibility into the Android systems, but it's still a learning process. A year ago I went into this knowing nothing about Android, and I am still far from an expert, but we know our way around a lot of this now, and we're getting done what we need to accomplish. So I have shot pretty far over my time here. There's lots more to talk about, but I've got at least a little bit of time for some questions. And I'm around all day, and I expect to completely lose my voice today.
I've done it a couple times already; just catch me anywhere, we can form a little crowd and we can go over whatever topic you want. But if we can go ahead, I've got a few minutes, we can probably push the schedule a little bit longer, so we've got a little bit of time for questions if you want to start lining up at the microphone.
Question-and-answer Session
Audience: [Question Inaudible]
John Carmack: So brightness. The question is about whether we want to have some over-brightening factor for the interlace, like you would do to compensate for it on TV sets. Brightness is an interesting question. In fact, let me talk a little bit more about the actual display technologies there. Currently Gear VR has a brightness slider option, and I'm not even sure that's really a good idea. One of the aspects is that our low persistence value is tied to our brightness level. When you're below a certain brightness, you're at a 20% duty cycle, and when you're above that brightness, you're at a 30% duty cycle. And I wish we were even lower, because a 20% duty cycle of 16 ms is 3.2 ms, and DK2 runs lower than that. When you get all the way up to 4.8 ms, it's getting to feel smeary. It's still three times better than DK1 or a full persistence display, but it's not great.
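For reference, the persistence figures there are just the duty cycle times the refresh period, using his round 16 ms number:

```latex
\text{persistence} = \text{duty cycle} \times \text{refresh period},
\qquad
0.20 \times 16\,\text{ms} = 3.2\,\text{ms},
\qquad
0.30 \times 16\,\text{ms} = 4.8\,\text{ms}.
```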
So I would argue, especially since you've got almost all outside light blocked out, that I don't think you need a brightness selection. I think everybody sees the same thing and you adapt to it fairly quickly. So I think a fixed display brightness probably would be a good thing, and if we controlled it directly, then we could pump it up however we needed. But I don't think we would actually adjust it for interlace; I think we would pick one of the values there and take whatever we get.
But the other aspect related to displays is the color calibration, where the displays are color calibrated. Okay, I'm going to go on for a while about a few more of these display topics. So one thing, talking about resolution: these panels are diamond-pixel, or PenTile, displays, which is generally not what you want. You want an RGB triad, where you can say, this is my red, this is my green, this is my blue. But on these panels, all the greens are there but only half of the reds and half of the blues. When I first heard about this, I thought, oh, they're just cheating; they want to be able to claim a 1080p or 1440p display without actually having all of those pixels. But it turns out there is another, more significant reason, especially with OLEDs: the blue OLEDs will burn out quicker if they're the same size, so making them larger and driving them lower is a positive thing there.
But this does mean that, for the graphics people who get really obsessive about filter kernels and things like that, you have to know that whatever you push out on red and blue is not going to be what winds up on the display; it's going to be a function of two to four of the surrounding pixels. I was pleasantly surprised to find out they actually are gamma correct. I just assumed it was going to be busted and done in linear space, but it's gamma correct, and the calibration on them is actually better sRGB calibration than on my desktop monitor. There's still a little bit of room for improvement; the top saturates a little bit too much and the bottom values aren't quite right. And in VR the linear ramp at the bottom of sRGB is a terrible thing, and we might be better off not being sRGB, but we've got good quality on a lot of that stuff.
But the other little thing that I would mention to game designers, on the interface side of that, is there is a consequence of the fact that we've got twice as many greens as reds or blues. If you really care about conveying information, something like text, there is a good argument to be made for doing it green on black, kind of a Matrix style, or like an '80s Apple II monitor, where that will actually be the crispest, most legible way of doing things. Our UI guys puked at the idea of green displays, but they at least came up with the clever idea of having a very subtle green outline around things. So it still gives kind of crisply defined outlines, with a little hint of green around them. But just going green on black would be the way to convey the absolute most information.
Audience Male: Hi John, thanks for coming out here. I know it's been a long day. So you said Brendan likes to think about the sit-down VR experience; well, I'm more about the stand-up experience, you know, full-body VR. I know the data you get from this is pretty coarse, but do you think that there's any value in taking data from, say, GPS or a pedometer to track movement, like walking around? Is Oculus even thinking about that?
John Carmack: So the thing that keeps coming up and gets overlooked so much is the sub-millimeter precision for the tracking. So there's the question of whether we can use larger-scale things. Some GPS is amazing technology; some of the real-time kinematic GPS is getting millimeter-level accuracy, usually not in real time, but that is startlingly close. I'm getting the please-wrap-up angry message here, so let's try to clear out the people that are at the microphones. But no, I don't think we've got any near-term plans for that. In the long term, in our world of VR and AR, then yes, that will be used at some point.
Audience Male: Just related to movement tracking. You know, like Google’s Project Tango, have you guys got –
John Carmack: Yeah, we got – we had the Tango demo and everything.
Audience Male: Yeah, so like how accurate was that in terms of –
John Carmack: So it was better than a lot of people expected, and yeah, you could look around, and it's not sub-millimeter, but it's a heck of a lot better than what most AR-toolkit-type things are doing. And they have the added benefit of their structured light sensor that they could use for extra fixes, though I understand they're still doing most of it through the standard camera. So it's great to see all of this work coming along, and if somebody else nails this problem through cell phone cameras, I'll be thrilled. That'd be one less thing that I have to work on.
Audience Male: I do know you talked – what the latency and position –
John Carmack: I don't know what it was, but that's actually not that important. That's something that's not immediately obvious, but all of these VR systems use the IMU for all the up-to-date stuff. The optical system just needs to tell you where you were at some point in the past, not too far in the past; then you can go ahead and say, this is where you really were, and reproject through. So even if it's a hundred milliseconds, ideally less than that, you could have a pretty long latency. As long as you know what time it was for, you can re-correct from there. There's nothing that we've got that delivers everything right now; we want 2 milliseconds of latency, and you're not getting that from an optical system, in consumer gear at least.
Audience Male: You did mention that you’re getting the camera pass-through on the Gear VR.
John Carmack: Yes.
Audience Male: Have you been doing any augmented reality kind of applications, or work with the computer vision coming out of that, not just for position tracking?
John Carmack: Okay. So if anybody's going to do an augmented reality application, you must not stretch the camera image larger than it really is. The camera on the Gear VR covers something like 60 x 45 degrees, which is a small area in VR. You can do an AR experience with that, and we've got the 120 Hz cameras, so it's way better for doing that than any kind of standard AR experience using the 30 Hz camera interface. But I see lots of people then go ahead and take that camera image and say, I want it to be big, and stretch it over everything, which becomes far worse a distortion than anything we've ever created in any of our distortion calibration stuff. You just don't stretch it up like that.
So there's some work that's been done on that, but I haven't seen anything that wins me over. I'm not a big fan of AR cell phone applications. I have seen a lot of demos, and I think they are a technology looking for a product, looking for a reason to exist. I have played with a bunch of them and I am yet to be won over. It may just be a matter of the quality not being good enough; it needs to be 60 Hz with a 120 Hz scanner, updating all the time, not dropping and popping and having all the different problems. Augmented reality with sub-millimeter precision and no jitter, with low persistence, might be a very cool thing, but I've yet to see it.
Audience Male: Hi there. Irrespective of the resolution or the pixel density, the size of the device and the screen: do you feel the current size of the Note 4 is a good one, or would smaller or larger be better, given the trade-offs with weight? How do you feel about that?
John Carmack: It is remarkably close to optimal. It is very close to twice the average IPD for people, which is about where you want it to be relative to the lenses. We waste a little bit of display off the top and bottom with the current lenses. One of the big things that I want to push for is non-circular lenses, where ideally we would have a lens shaped to use the full display; right now we're rendering a square image in software and then throwing away the corners of it because of the shape of things.
So I think it's actually a pretty darn good trade-off. I mean, obviously it would be better if we had separate, interaxially adjustable OLED displays, but if it's got to be one display, it's almost perfect in size.
And I guess unless they literally chase me off the stage, I can stay here answering questions until –
Speaker: Hi John, we actually are going to break for lunch now. If we can just take the rest of the questions offline, that would be great. Thank you.