Apple Vision Pro: Startup Platform Of The Future?

Episode Summary

In the episode titled "Apple Vision Pro: Startup Platform Of The Future?" from The Light Cone podcast, the discussion centers on the implications of Apple's Vision Pro for startups and the broader tech ecosystem. The episode features Diana, a group partner at Y Combinator with a decade of experience in AR and VR, including a pioneering startup in the space, Escher Reality, which was acquired by Niantic. The conversation delves into the technical challenges and innovations behind augmented reality (AR) and virtual reality (VR), highlighting the Vision Pro's hardware and software advances. Apple's approach, a pass-through video feed rather than an optical see-through system for AR, sidesteps some of the hardest technical hurdles and enables more immersive experiences.

The podcast emphasizes the Vision Pro's potential to move beyond gaming-focused VR devices and revolutionize productivity. Apple's focus on productivity, coupled with advanced capabilities such as eye tracking for foveated rendering, positions the Vision Pro as a tool for professional and everyday use. This shift could pave the way for a new ecosystem of applications and interactions, much as the iPhone did for mobile apps. The discussion also touches on the importance of developers embracing the new platform despite early uncertainty about adoption rates and market impact.

Comparisons are drawn between the Vision Pro's launch and historical tech milestones, such as the iPhone and Tesla's product strategy, to speculate on the device's potential as a transformative platform. The conversation concludes with insights into how startups and developers might navigate this emerging space, emphasizing genuine enthusiasm and innovation in building applications that leverage the unique capabilities of AR and VR. The episode suggests that while the path is uncertain, the Vision Pro offers fertile ground for groundbreaking applications and experiences.

Episode Show Notes

In this episode of the Lightcone Podcast, YC Group Partners discuss the launch of the Apple Vision Pro and what this new platform could mean for startups. This is a deep dive into the technical innovations Apple has made for the product, how its launch compares to that of the iPhone, and advice for founders interested in building in this space.

Episode Transcript

SPEAKER_03: How much of the hard, interesting stuff Apple did is with the hardware in the Vision Pro versus the software?

SPEAKER_01: You need to understand the real world in order to augment it. The technology of a self-driving car, but on a headset.

SPEAKER_00: This is maybe where founders should pay attention. Is this a good opportunity for startups? There are all kinds of new interactions that I think we have not figured out yet. What really truly takes advantage of this platform?

SPEAKER_01: The dream has always been to get to something like this.

SPEAKER_00: Welcome back to another episode of The Light Cone. And as you can see, it's not just any other day in tech. There are some new platforms coming up right now. You might have seen reviews in other places. We're not doing reviews today. We're going to talk about what these platforms might mean for founders and people who want to build things for a billion people. We actually have an expert at the table right now, don't we?

SPEAKER_02: We do. Diana, who's a group partner at YC. Before she worked at YC, she'd been working in AR and VR for 10 years, since the dawn of the Oculus, before VR was a mainstream thing. In fact, her grad school research was in computer vision, so she's been interested in this from way before it was a thing other people were following. Diana, do you want to talk about the startup you did, which was a really early, pioneering AR/VR startup?

SPEAKER_01: Yeah, we went through YC with a startup called Escher Reality. What we were building was an augmented reality SDK for game developers, so that they could build multiplayer experiences and AR games and write the code once, so that it would work on any platform, not just across iOS and Android mobile devices. The dream has always been to get to something like this, or that, or that, so that developers would write the code once and it would work across all devices.

SPEAKER_02: And what happened to your startup?

SPEAKER_01: So what happened is this took a lot longer to come to market. That's one thing. The other thing that ended up happening is we got acquired by Niantic, the makers of Pokemon Go. So I ended up heading a lot of the AR platform over there at Niantic, and we shipped a lot of this AR SDK into a lot of games. So millions of players are running our code, which is really cool.

SPEAKER_02: So if you've ever played Pokemon Go, you've literally used code that Diana wrote.

SPEAKER_01: And I'm so excited with this platform coming in, and we can dive deeper into it.

SPEAKER_03: Okay. Should we take the headsets off so we can talk?

SPEAKER_00: So it's been a long road. You've seen this technology evolve over the course of a decade. Why AR? That's one of the big things here. Previous platforms were really focused on VR and the gaming aspect. HoloLens from Microsoft seemed to try to do the AR thing. What's going on with the Apple Vision Pro? Why is this important? Why are we talking about this?

SPEAKER_01: Yeah, I mean, we have to go back in the history of computing. There have actually been attempts at building augmented reality and VR headsets since the beginning of the first computers. The very first one was by this guy called Ivan Sutherland back in the 60s.
So people have been thinking about it. It's kind of one of the dreams, and it's one of those things that really fascinated me. I think so much of it is in our consciousness that we want to make it really happen. But the reason it has not happened, unlike tablets and phones, is that it's just really, really hard to make. So you bring up the Microsoft HoloLens. They had version 1 and version 2, and sadly the latest version got scoped down, and the team kind of got let go, because they tried an optical approach. In that AR approach, you were seeing the actual real world, and the digital content would be rendered just within the eyes, with a very small field of view. It was the same approach that Magic Leap was trying.

What Apple is trying is more of a pass-through, which is a full high-res video feed of the real world. And arguably, a lot of the technical challenges are a lot easier. The hard part of optics is that it's not a problem of Moore's law, of just brute-forcing with more computation and more pixels. It is actually figuring out new physics for photons so that they render properly to the human eye. Because the human eye is incredible. Your field of view is actually 210 degrees. So if you put your hands behind your ears, you can kind of see them. To have a display system that can really render all of that is so hard. And the other part that's really hard, which I want to touch upon a bit more, is that our eyes have an essentially infinite ability to focus. We can look very close or very far. In some sense, you have to find a way to make that discrete for computers to work, right? Because computers just understand ones and zeros. And to get that working in a display is just so hard. Apple has done some clever things with that.

SPEAKER_03: That's different from the optical approach, because the optical approach is actually looking through to the real world? Or what's the difference?

SPEAKER_01: Yeah. If I'm looking at Jared right now, I'm actually seeing Jared, and in an optical system I would only overlay the digital information on top of him. Whereas with the Vision Pro, and the Meta Quest 3 or Meta Quest Pro, which are technically VR headsets, the full video is all digital. Jared is technically pixels when I see him through the Vision Pro.

SPEAKER_03: And so you said the Apple Vision Pro being a video feed actually reduces the technical challenge?

SPEAKER_01: Yeah, because there are a couple of things you can do. You can play a lot with the video feed. And if you're the best in the world at display technology, which Apple is, you can get away with a lot. One of the cool things they've done, and a foundation of what they built, which is actually helpful if you're going to build apps here, is that so much of it is built upon eye tracking. They have foveated rendering: the rendering quality varies with where you focus. They had to get eye tracking working really well for this to work. So in the Vision Pro, wherever you look, the pixels at your focal point render at higher fidelity than where you're not looking. The reason this is important is that to fit it in such a small form factor, without burning through the battery, and with so much heat dissipation from pushing that many pixels, you have to make trade-offs. So they did this thing of rendering at higher resolution where your eye focuses. You can notice it a little bit in the periphery with the Vision Pro, where it's more blurry. It's not quite pixelated, but blurry. And some people do complain online about the foveated view. That's, I think, a bit of an artifact of the lens, but that's a different discussion.
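To make that trade-off concrete, here is a toy Swift sketch of the idea behind foveated rendering: shade tiles near the gaze point at full resolution and drop the rate in the periphery. This is an illustration only, not Apple's pipeline; on the Vision Pro, foveation happens in the system compositor, apps never see raw gaze data, and the thresholds below are invented for the example.

```swift
import simd

// Toy illustration of foveated rendering: pick a shading rate for each
// tile of the display based on how far it is from the gaze point.
// (Illustrative only; the real system does this in the compositor,
// and these thresholds are invented.)
struct FoveatedShadingMap {
    let gazePoint: SIMD2<Float>  // normalized display coordinates in [0, 1]

    /// Fraction of full resolution at which to shade a tile.
    func shadingRate(forTileAt center: SIMD2<Float>) -> Float {
        let distance = simd_distance(center, gazePoint)
        switch distance {
        case ..<0.10: return 1.0   // fovea: full resolution
        case ..<0.25: return 0.5   // near periphery: half resolution
        default:      return 0.25  // far periphery: quarter resolution
        }
    }
}

// Example: the user is looking at the center of the display.
let map = FoveatedShadingMap(gazePoint: SIMD2(0.5, 0.5))
print(map.shadingRate(forTileAt: SIMD2(0.52, 0.50)))  // 1.0, sharp where you look
print(map.shadingRate(forTileAt: SIMD2(0.90, 0.10)))  // 0.25, blurrier periphery
```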
So you can notice a little bit in the periphery with the Vision Pro where it's more blurry or a little bit.It's not quite pixelated, but blurry. And some of the people do complain online with the foveated view.That's, I think, a bit of the artifact with the lens, but that's a different discussion. SPEAKER_03: How much of the hard, interesting stuff Apple did is with the hardware in the Vision Pro versus the software? SPEAKER_01: I think the cool thing about them is both, because the Vision Pro is sort of a culmination of a lot of the ecosystem of what expertise they built in iPhone.They have custom silicon.They have the R1 processor, which is a coprocessor to the M2.The M2 is basically the same processor that runs on the MacBook Pro, so very beefy.But that processor, M2, is for regular, kind of like a CPU, regular workloads. But the challenge for building an AR headset or AR in general, you need to understand the real world in order to augment it.And for that, you need a lot of sensors.So this has over 10 cameras, even has a lighter. It has a true depth camera.It has a bunch of IR cameras inside to track your eyes. So that's a lot of data, a lot of high data bandwidth that it needs to process.And underneath the hardware, I think you're going to get throughput blocked.So the R1 is a custom processor that processes all of the sensor data with very high data channel bandwidth. And I suspect they are even running a real-time operating system along the Vision OS, which is kind of interesting for what it means for developers to process all of this in real time.And it's starting to sound a lot like actually a technology of a self-driving car, but on a headset. SPEAKER_03: Yeah, that's exactly as you were talking about what this is, and that springs to mind, like LiDAR plus a bunch of cameras and processing the video feed. SPEAKER_02: Yeah, can you draw the connection?It's probably not obvious to people what the connection is between VR, AR, and self-driving cars. SPEAKER_01: Yeah, this was one of the jokes with my co-founder when we started Azure Reality with a cohort tech for localizing in the world and knowing where you are.It comes from the world in robotics called SLAM, simultaneous localization and mapping.So you want to find where a robot is in the world based on just visual data. And that is the same thing that self-driving cars look to navigate where they are in the 3D world.So you notice in a car there's 3D lighters, there's radars, there's a bunch of cameras. Same thing here, to know where you are in the world.So it's the same technical challenges, but with so much more hardware complexity, because you don't want to burn people's head.With this, imagine, because the self-driving car, with self-driving cars, you could actually, the actual hardware that runs in self-driving car processing, they put server-grade GPUs and CPUs, which fits in the trunk or underneath, But this is actually pretty cool what they've done.And they built a lot of that because on iPhone, they learned how to build custom processors. They built with the true face, true 3D on the camera, which is like IR for mapping in 3D. And LiDAR, they added on the latest one, latest iPads, and they've been building a lot of the ecosystem one by one. SPEAKER_03: Yeah, it's interesting to hear you talk about how Apple can build on their previous products.So it's like you're saying this is sort of a lot of the technology here is coming out of the iPhone.This sounds like this sets them up to build their car pretty well. 
SPEAKER_00: Same expertise. Let's talk about the use cases a little bit. One of the things that's pretty clear in everything about this launch is that it's focused on productivity. And I kind of like that, because the Oculus devices are much more focused on gaming, on VR, where you're sort of in a totally different place. My guess is one of the reasons VR and AR hadn't been embraced is that it wasn't something a busy person would use every single day. But now it's got the M2. It's the same chip that I have in my MacBook Air. With a keyboard, I could actually do all of my work on it all day if I wanted to. And that's a really big difference in how they're positioning this device.

SPEAKER_01: Which is a big departure from Meta. Meta is so focused on the gaming community; actually, I think there was a bit of an uproar from the VR community that there are no controllers. Apple has gone all in on productivity, which, I mean, this was my dream when we started Escher: that if AR was going to happen, we wouldn't notice it, because it would solve all the very mundane things, and it could replace all screens. This is going after the market cap of every screen that gets sold. There's still a lot to be done. This is still v0, but yeah.

SPEAKER_00: But this motion, this was incredibly natural. Being able to look at things and have them be something you interact with, I was just blown away at how simple, how easy it was to reprogram my brain.

SPEAKER_01: Which is cool. I guess a question for you, Gary: do you remember, when the iPhone came out, Apple had this Human Interface Guidelines document? They had a lot of things about communicating information hierarchy with touch and focus and gestures with your thumb, things like that.

SPEAKER_02: Yeah, it was an incredibly comprehensive document. They basically took all of the learnings they had gotten building the iPhone for years and distilled them into a really thorough document, and then they published it for everyone. I think it taught a whole generation of designers and developers how to build great mobile apps. They would just read that document.

SPEAKER_01: There is a Human Interface Guidelines document for the Vision Pro, and one of the things you notice is that so much of it is about eye tracking and communicating information with depth and space. And maybe this is something for founders to think about if you're building an app in this space: with the Vision Pro, they invested so much in eye tracking to make it work, for many reasons. We talked about needing it just to get the rendering to work; that was a building block. But for the UX, I think this is the moment we saw with capacitive touch, where Apple got it right for the iPhone. The eye tracking is starting to look a lot like that. So I think there are a lot of cool UX things yet to be discovered with just eye tracking. And the funny thing is that the VR community, I think, was very skeptical of this, because eye tracking was actually considered a bad practice: it tires the user too much. And the reason is that the hardware was not good enough.
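A concrete example of that UX model: visionOS never hands apps the raw gaze stream, for privacy reasons. Instead you mark views as interactive, the system highlights whichever one the user is looking at, and a pinch acts as the tap. A minimal SwiftUI sketch of the pattern, using the documented hover-effect modifier:

```swift
import SwiftUI

// Minimal sketch of gaze-driven UI on visionOS. The app never sees
// where the user is looking; the system renders the hover highlight
// itself and only delivers the action on look-plus-pinch.
struct PalettePicker: View {
    let colors: [Color] = [.red, .green, .blue]

    var body: some View {
        HStack(spacing: 24) {
            ForEach(colors, id: \.self) { color in
                Button {
                    print("Picked \(color)")  // fired by look + pinch
                } label: {
                    Circle()
                        .fill(color)
                        .frame(width: 60, height: 60)
                }
                .buttonStyle(.plain)
                .hoverEffect(.highlight)  // glows under the user's gaze
            }
        }
        .padding()
    }
}
```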
SPEAKER_03: I remember the same thing before the iPhone came out. I remember a lot of the conventional wisdom from consultants and experts was that the virtual keyboard wouldn't work, that people wanted a physical keyboard, and that people would never treat it as a serious device to do their email on because it didn't have a real keyboard.

SPEAKER_01: On the phone.

SPEAKER_03: Oh, yeah.

SPEAKER_02: Yeah. That was all the reviews of the iPhone.

SPEAKER_00: Yeah. This is maybe where founders should pay attention. There were still things that Apple had not figured out yet that third-party developers ended up figuring out. If you remember pull-to-refresh, that was something that I think first appeared in a Twitter client. I think that founder ended up selling their Twitter client to Twitter and working at Twitter for a while. There are all kinds of new interactions that I think we have not figured out yet. This sort of pinch-to-move-around is merely the first of a whole bunch of different things that, frankly, developers will figure out.

SPEAKER_03: I'm curious also, Diana, what's the difference for a developer between the Meta SDK and the Apple Vision Pro SDK?

SPEAKER_01: One of the big ones is that Meta comes from the DNA of gaming, so they have very good support for Unity and Unreal. Those are game engines, which are great for building games: 3D environments in a game are literally a constrained 3D world. But for spatial computing, the real world is infinite, so sometimes game engines don't quite fit. And one of the things you'll notice is that building an application that opens a PDF on the Meta platform actually takes a lot of lines of code, whereas building that for visionOS is just a few lines of code.

SPEAKER_03: Interesting.
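For a sense of the scale difference, here is roughly what that can look like on visionOS with SwiftUI, assuming PDFKit behaves as it does on iOS; the bundled manual.pdf is a hypothetical stand-in:

```swift
import SwiftUI
import PDFKit

// Minimal sketch: a visionOS window that opens a bundled PDF.
// Assumes PDFKit works as on iOS; "manual.pdf" is a hypothetical file.
struct PDFViewer: UIViewRepresentable {
    let url: URL

    func makeUIView(context: Context) -> PDFView {
        let view = PDFView()
        view.document = PDFDocument(url: url)
        view.autoScales = true
        return view
    }

    func updateUIView(_ view: PDFView, context: Context) {}
}

@main
struct PDFApp: App {
    var body: some Scene {
        WindowGroup {
            if let url = Bundle.main.url(forResource: "manual",
                                         withExtension: "pdf") {
                PDFViewer(url: url)
            }
        }
    }
}
```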
SPEAKER_01: I guess the other big question that probably a lot of people in the community have: is this an iPhone moment or a Newton moment?

SPEAKER_03: Well, when the iPhone first launched, there wasn't actually an App Store, right? I think that came maybe a year later, something like that. And all of the initial apps that got distribution on the App Store were frivolous apps, right? There was the fart app. There were a bunch of things like that getting really popular. The $1,000 I Am Rich app.

SPEAKER_00: Yeah. It's like an image of a ruby or something. Oh my God.

SPEAKER_03: And if you think about it from our perspective, at least the YC perspective, the iPhone, or mobile, didn't start driving really big companies being started until, I would say, probably 2012. 2012 is the year we had Instacart come through. I actually think mobile was a fairly big component of Coinbase, right? The fact that they just had an easy-to-use mobile app. DoorDash was 2013. And so all of these things started then, and of course you had the rise of Uber, not a YC company, but... It took, you could say, five years from the launch of the iPhone for the actual good companies to even be founded.

SPEAKER_00: So you haven't missed it yet.

SPEAKER_03: Well, when I think about the Vision Pro, I'm not sure where we are. Is this the iPhone moment, in the sense that the iPhone just got launched and it's still going to be a few years? Or is it, hey, actually this device has been around for a while, and this is just the iteration that was needed to unlock the Instacarts and DoorDashes and Ubers that are going to be built on it?

SPEAKER_02: I'll give one argument for why it's probably more like the iPhone moment. We don't know, but when the iPhone came out, people forget smartphones were already an established category, and the iPhone was the new entrant to that established category. A lot of people were skeptical that Apple could actually execute. And as you mentioned, reviewers were very skeptical of the iPhone as the right product to challenge the BlackBerry and the other incumbent smartphones at the time.

SPEAKER_03: There's the famous Steve Ballmer quote about it, Steve Ballmer just making fun of it and saying it would never be a serious device.

SPEAKER_02: Right. Why was it that it took five years for the good iPhone companies to come out?

SPEAKER_00: I think adoption had to happen. So that's why it actually maps very closely. I don't know how many of these Apple has actually sold, but it's probably on the order of hundreds of thousands, which probably mirrors the iPhone; maybe the iPhone broke a million. Even when you look back at the Instacart or DoorDash or Uber moment, those mobile workforces could only happen once 70 to 80 percent of the people in society had these devices. And the reason that was such an important moment was that it was the first time normal, average people had always-on internet connectivity and an app ecosystem that was actually stable enough. Remember, in the ten years before, it was, do we write it in J2ME or in Flash? Gustaf and his Voxer and Heysan experience. The platforms were literally so broken and so fragmented that you couldn't have 80 percent of the population on one platform. Then suddenly all of the platforms coalesced, and it opened up the market.

SPEAKER_01: I guess the question with this device, and in general with VR, is that it will be different from mobile. It depends on the price point, when it gets down to maybe the cost of a phone, but it will take a lot of time before we get that level of mass adoption. What I think could happen is that it captures a lot of the high-end use we talked about earlier: high-information-density, construction, CAD, engineering types of workflows.

SPEAKER_03: So Diana and I were actually doing group office hours yesterday with a group of our companies in the current batch who are all working on hardware and hard tech ideas. And we did this exercise we call the pre-mortem, where you give them different flavors of how companies can die, and you get them to say, this is how I think I'm most likely to die.
And the one that springs to mind here is: we were talking about how Tesla's strategy was very successful, launching the Roadster, a very high-end device, and then bringing out the Model S and the Model 3 and the Model Y. But that wouldn't have worked if they had just stuck with the Roadster. So maybe one failure mode for the Vision Pro is that this is the Tesla Roadster. It's great, it carves out a niche for people who are really into this stuff and are willing to pay for a very high-end device, but Apple can't follow it up with the Model 3.

SPEAKER_01: I think there's a bit of a chicken-and-egg aspect to it, because for this to become the Model 3, let's say, we need an ecosystem of applications and an incentive for developers to work on it. If I were a founder right now looking for a new idea, do I want to put all my eggs in here when there aren't enough users yet? When should I do it? Should I just take a leap of faith? How do we advise founders in this space? Why should they do it?

SPEAKER_03: I definitely think that's relevant to the Instacart and DoorDash thing, for example. Those companies weren't making a bet. Their apps were not specific to iOS or Apple. Everybody had a device. They worked equally well on Android. Frankly, they could have just been a WebView stuffed in an app. So that's a good point.

SPEAKER_02: And they also weren't the first entrants in their categories. Before DoorDash and Instacart, there were many would-be DoorDashes and Instacarts that launched earlier and didn't succeed.

SPEAKER_03: Yeah. Even more extreme: in their case, mobile actually made ideas that seemed very bad into good ones. I actually think it's really cool that Sequoia invested in Instacart, because they'd had the big failure with Webvan. They had all this egg on their face from grocery delivery, this bad idea, and you would expect it to be natural to never want to fund that again. But mobile actually turned it into a good idea.

SPEAKER_02: I did a dinner talk with Max, the co-founder of Instacart, and he said that when Sequoia led the Series A for Instacart, they gave him the Webvan business plan that they had been given in the 90s. But the problem was it was on a floppy disk, and he couldn't find a floppy disk reader, so he never read it.

SPEAKER_00: That's hilarious. I'm sort of taken by the path of consumer social networks, even. Facebook started as the blue app. It was a desktop experience killing MySpace. It literally looked like bank software; if you logged into Facebook or Chase.com, they even had the same color. I remember being at YC when Mark Zuckerberg came to talk about why they bought Oculus. It was very much, from what I could tell, about trying to fight the last war. Facebook had just bought Instagram; I think it had not bought WhatsApp yet. And he felt really scared. Basically, Facebook had this monopoly. It had owned the industry of consumer social. But then they almost lost it, because Instagram easily could have outstripped it. And that was because of a platform shift. So he wanted, very clearly, to own the next platform. And he's right.

SPEAKER_02: Should founders go build on this? Is this a good opportunity for startups?
SPEAKER_00: I just sort of wonder what things could actually fully take advantage of this in a real professional context. Where my head goes, maybe it's too obvious, but: traders with their 20 screens. Wouldn't you rather have something that allowed you to take in the breadth of that information and dive into it very easily, just by going like that? You can imagine that being something people are actually willing to pay not just hundreds of dollars a month for, but maybe thousands of dollars a month.

SPEAKER_01: I think we're going to be in this awkward beginning phase with spatial computing apps for quite some time. Because even with the Apple SDK and Meta's, a lot of things are still flat 2D. And I don't think we know how to develop for full 3D yet.

SPEAKER_00: What really truly takes advantage of this platform? What is unique about it, whether it's the 360-degree view or being able to dive into more data easily? What are the aspects of this new technology that mean it can upend even what seems like an unassailable incumbent, the way Snapchat challenged Facebook?

SPEAKER_03: But would part of you try to talk them out of it? Would part of you be thinking, this is too early, you should work on something else?

SPEAKER_02: I think if you look back at our history, YC has really been pretty good at this. Every time there's a platform shift, whether it's the Facebook platform thing, which didn't go anywhere, or the iOS thing, which did go places, we were reasonably accurate at funding the right stuff. And I think the way we did it is that rather than having a strong thesis on each technology and each platform, we just look at each application from first principles. We talk to the founders, they have some idea, and we try to figure out if the idea makes sense. I think that's what has allowed us to have a pretty good track record at discriminating between people who are just cargo-culting the new thing, jumping on the hype train with some idea that doesn't really make sense, and people who are building something like DoorDash that actually, totally makes sense. Yeah, that's fair.

SPEAKER_01: The other thing I would look at, to Jared's point, is whether there's a strong belief from the founder that they want to make a bet on the space. There's just something about founders when they go all in; they become unstoppable. And it's going to take time. So they have to have faith that this is going to be different from building, let's say, a standard SaaS application or a consumer app or an AI application. If you stick with it long enough, you're going to build a lot of expertise and be world-class by the time the right moment arrives. But it has to be someone who's genuinely excited about it. And the cool thing is that there are a lot of technical challenges here, which I think is going to attract the right kind of founders, because it's actually hard to build something good on this platform right now, because it's so new.
SPEAKER_03: right now because it's so new so this will be the main thing i'll look for when i'm reading applications for people putting vr stuff actually and i feel okay sharing it because it's very hard to fake it's basically what we're saying is if you're the kind of person that just is irrationally compelled to build applications for vr we will happily fund you and like we need some evidence of that because you're just like you just like spend your spare in your free time you are like building vr apps and you have been for a while like yeah we never try and discourage founders from building stuff they just think is cool SPEAKER_00: Well, that's a great place to end.We're out of time, but thank you guys.Another good episode of The Light Cone, guys.See you next time.Catch you guys next.See ya.