Building AI Models Faster And Cheaper Than You Think | Lightcone Podcast

Episode Summary

The Lightcone Podcast episode "Building AI Models Faster And Cheaper Than You Think" delves into the advancements and possibilities of generative AI, with a particular focus on video generation. The hosts explore the capabilities of Sora, a cutting-edge AI model that generates highly realistic videos from prompts. They discuss the significant progress in AI's ability to simulate real-world physics and the improvements in spelling and visual consistency in generated content. The episode highlights how models like Sora combine transformer and diffusion models with a temporal component to achieve remarkable results in video generation.

The podcast also sheds light on how startups, particularly those within the Y Combinator (YC) ecosystem, are building foundation models with limited resources. It showcases examples like Infinity AI, which creates deepfake videos, and SyncLab, which offers real-time lip syncing, demonstrating that it is possible to be at the forefront of AI innovation without massive budgets. The discussion emphasizes the importance of data, compute, and expertise in building these models, and how YC companies have successfully navigated these challenges.

Furthermore, the episode explores the broader implications of AI in fields beyond entertainment, such as weather prediction, biology, and hardware design. It introduces companies like Atmo, which developed a more efficient and accurate weather prediction model, and Diffuse Bio, which focuses on generative AI for proteins. These examples illustrate the potential of AI to revolutionize industries by providing more efficient solutions to complex problems.

The podcast concludes with a message of encouragement for aspiring AI innovators. It underscores that the field of AI is still new and accessible, and that with dedication and the right resources, anyone can contribute to its advancement. The hosts remind listeners that many of today's AI pioneers started with little more than curiosity and determination, suggesting that the next breakthrough in AI could come from anywhere.

Episode Show Notes

If you read articles about companies like OpenAI and Anthropic training foundation models, it would be natural to assume that if you don't have a billion dollars or the resources of a large company, you can't train your own foundation models. But the opposite is true. In this episode of the Lightcone Podcast, we discuss strategies to build a foundation model from scratch in less than three months, with examples of YC companies doing just that. We also get an exclusive look at OpenAI's Sora!

Episode Transcript

SPEAKER_05: A lot of the sci-fi stuff is actually now becoming possible. What happens when you have a model that's capable of simulating real-world physics?
SPEAKER_01: Wouldn't it be cool if this podcast were actually an Infinity AI video?
SPEAKER_08: One thing I noticed is the lip syncing is extremely accurate. It really looks like he's actually speaking Hindi.
SPEAKER_02: How do YC companies build foundation models during the batch with just $500,000?
SPEAKER_08: This is literally built by 21-year-old new college grads. And they built this thing in two months. I think he locked himself in his apartment for a month and just read AI papers.
SPEAKER_04: You can actually be on the cutting edge in relatively short order. And that's an incredible blessing. Welcome back to another episode of The Lightcone. Today we're talking about generative AI. First there was GPT-4, then there was Midjourney for image generation, and now we're making the leap into video. Harj, we got access to Sora and we're about to take a look at some clips that they generated just for us.
SPEAKER_05: Yeah, should we take a look? Okay, so here's the first one. The prompt is: it's the year 2050, a humanoid robot acting as a household helper walks someone's golden retriever down a pretty tree-lined suburban street. What do we think?
SPEAKER_04: I like how it actually spells out "helper." It's like a flex. Like, I can spell now.
SPEAKER_08: Yeah, which was not true with the image models.
SPEAKER_04: You'd always screw up the text in the image. Yeah.
SPEAKER_08: Stable Diffusion and DALL-E were notoriously bad at spelling text. So that is a major advance that no one's really talked about yet.
SPEAKER_05: I mean, it's wild how high definition it is. That's almost realistic.
SPEAKER_02: And the other really cool thing is the physics. The way the robot walks, for the most part, is very accurate. You do notice a little kind of shuffle that's a little bit off, but for the most part it's believable.
SPEAKER_08: And the way the golden retriever moves. I have a golden retriever, so I can personally vouch that they perfectly modeled the...
SPEAKER_02: Like your dog, right?
SPEAKER_08: It's a perfect representation of how a golden retriever walks. I also like that with DALL-E and Stable Diffusion, as you made your prompts longer and longer, they would just start ignoring it and not actually doing exactly what you told them to do. And we gave this a very specific prompt and it did exactly the thing that we told it to.
SPEAKER_05: You can see it's still not exactly perfect. So I think towards the end you see it's like a floating dog, something in there.
SPEAKER_08: Okay. I was going to call out a couple of other imperfections here, which is that the street is not a street, guys.
SPEAKER_04: It's a carless society.
SPEAKER_08: Yeah. And what's up with that? It's like a weird, not quite a sidewalk, not quite a street. Yeah. But in the future, we won't need cars anymore. And then only one side of the street.
SPEAKER_02: The structure is like jumping. There's this floating object thing.
SPEAKER_08: There's a floating object on the right if you watch carefully.
SPEAKER_04: Which looks like a little dog or something.
SPEAKER_05: I'm not sure.
SPEAKER_04: This is still a real breakthrough. If you look at some of the stuff that Meta put out, I always think about, what is it, Will Smith trying to eat a plate of spaghetti. And that looks insane. And it's sort of just what you would do if you fed the previous frame into the same model to try to generate the next frame. And it just wasn't durable. Yeah.
SPEAKER_02: And that wasn't too long ago. Yeah.
SPEAKER_08: The other thing that I find really impressive about the Sora videos is that they have long-term visual consistency. So it's like a minute long and all the houses are a similar architectural style. There's no discontinuity. All the trees look similar. It clearly all takes place in the same world.
SPEAKER_05: The next one: a drone camera circles around the Golden Gate Bridge. The view showcases the magnificent cliffs and ocean waves with views of San Francisco in the background. The view is stunning, captured with beautiful photography.
SPEAKER_02: That is the Golden Gate.
SPEAKER_08: That is the Golden Gate Bridge. He knows what the Golden Gate Bridge looks like.
SPEAKER_02: And I think you can see Alcatraz there a little bit too.
SPEAKER_08: Yeah, the high definition is amazing. And you can see the city in the background, as we asked for. It's definitely not geographically accurate. Yeah, the terrain is not quite the way it is in the real world. But it looks visually kind of similar.
SPEAKER_05: Yeah. And you can see it's not quite perfect, because early on in the clip, if you look at one of the columns of the bridge from a particular angle, it looks disjointed. Can you see that one? Oh, yeah. The back. And then it sort of lines up when we get to this angle.
SPEAKER_08: Also, if you go back to the beginning of the clip and you look at the cars driving on the bridge, they're driving on the wrong side of the road. Like that one's about to cause a traffic accident.
SPEAKER_02: There's some data from the UK, maybe. Yeah. I guess the other detail is, in computer graphics, it's incredibly difficult to simulate fluid. And it's still a little bit wonky with the waves. They're a little bit static.
SPEAKER_08: Yeah. I've seen other Sora clips where it captures the motion of water just incredibly.
SPEAKER_05: One thing I'm really curious about is how Sora works under the hood and how they're generating these videos. So, Diana, could you give us a brief primer on what's actually going on? And one thing I was particularly curious about is, is this a new model? Or is this an extension of the transformer model that we all know about as powering ChatGPT?
SPEAKER_02: I think the TLDR, and the really cool thing here, is that it is really a combination of a transformer model, which typically has been mostly used for text, and a diffusion model, which is a lot of the tech behind DALL-E and Midjourney for generating images. So it's combining these two and then adding a temporal component, so you get consistency between frames over time. And I think the key thing that OpenAI did was to train this with videos and with what they call space-time patches. So it is basically this 3-by-3 matrix of pixels, so you have the spatial part, and then patches over time, because multiple frames create a video. And the way they do it, they have a variation of the sizes of these patches. They could be smaller or bigger in x, y, z, basically. And then they basically train all this in this giant architecture, which is really expensive.
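To make the space-time patch idea concrete, here is a minimal sketch in PyTorch of chopping a video tensor into fixed-size space-time patches and flattening each one into a token-like vector. The patch sizes and video dimensions are illustrative assumptions; OpenAI has not published Sora's actual values, and this is not their code.

```python
# Minimal sketch: turn a video into flat "space-time patch" vectors.
# Patch sizes (pt, ph, pw) and the clip shape are illustrative, not Sora's real values.
import torch

def spacetime_patches(video: torch.Tensor, pt: int = 4, ph: int = 16, pw: int = 16) -> torch.Tensor:
    """video: (T, C, H, W) float tensor -> (num_patches, pt * C * ph * pw)."""
    T, C, H, W = video.shape
    # Trim so the clip divides evenly into patches (a real pipeline would pad instead).
    video = video[: T - T % pt, :, : H - H % ph, : W - W % pw]
    T, C, H, W = video.shape
    patches = (
        video.reshape(T // pt, pt, C, H // ph, ph, W // pw, pw)
        .permute(0, 3, 5, 1, 2, 4, 6)   # group axes by (time-block, row-block, col-block)
        .reshape(-1, pt * C * ph * pw)  # one flat vector per space-time patch
    )
    return patches

clip = torch.randn(16, 3, 240, 320)     # 16 frames of 240x320 RGB noise as a stand-in
tokens = spacetime_patches(clip)
print(tokens.shape)                     # torch.Size([1200, 3072]): 4*15*20 patches of length 4*3*16*16
```

In a transformer-plus-diffusion setup, each flattened patch would then be linearly projected into an embedding and handled much like a text token.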
SPEAKER_05: And so are the patches, are these space-time patches the video equivalent of tokens?
SPEAKER_02: Sort of, because I think there's a lot of prior work behind Sora. The first thing is that transformers have been mostly applied to text. And one of the pieces of prior art was Google's work demonstrating that you could use transformer models not just for English text, but for images. That was a foundational paper, I think they published it in 2020, and the paper was called "An Image is Worth 16x16 Words." So they call it a vision transformer. They demonstrated that you could create and use transformer models for image recognition tasks, because the state of the art up to then was convolutional neural networks, which were very expensive to compute. So that was one piece of the puzzle. The other piece of the puzzle was the space-time concept. And I think some of that comes from stitching together some different work from the past. There's this other paper, World Models, that came out in 2018. It's for robotics, actually. It separates the detection piece, which is the perception of the visual part, from the memory model for the temporal aspect. And the temporal aspect in the World Models paper uses RNNs. And then there's a controller model that combines them. I mean, OpenAI doesn't explain too much. This is just a bit of me looking at it; I don't know. This is one of those things that OpenAI is a bit cagey about. But we can only speculate that it's a combination of robotics papers plus transformers plus text.
SPEAKER_05: And then how much more expensive is it to generate one of these videos compared to generating text? How do we even think about that?
SPEAKER_02: Oh, man. So imagine GPT-4 is like a trillion parameters. And that, imagine, is only two dimensions, right? Yeah. Text. It's just a 2D matrix. Now this is like another order of magnitude. So I can imagine it's at least one order of magnitude more. 10 trillion?
SPEAKER_05: That's amazing.
SPEAKER_02: So probably 10 times the amount of GPUs. I can only imagine. I think it was about 20,000 or 30,000, I forget exactly the number of GPUs that it took for GPT-4. OK.
SPEAKER_05: Well, what's crazy is that we have companies within YC that have also been able to achieve similar types of functionality, and they clearly have way fewer resources than OpenAI does. And so I'm curious how they managed to do that. The way I think about this is that there are the components of building one of these foundational models: data, compute, and expertise. Should we talk through some of the YC companies and how they've managed to hack each or all of those things?
SPEAKER_02: Basically, how do YC companies build foundation models during the batch with just $500,000?
SPEAKER_08: Yeah, I think it's an important topic, because people know how much money OpenAI is spending on GPUs, so there's this meme going around that in order to do this, you need to have raised billions of dollars and have a data center full of GPUs. And we've actually seen that it's not true. There are actually a bunch of companies in the current batch, Winter 24, right now that, just in the time of the batch, with just the $500K that YC gives them, have built really awesome foundational models that are producing magical results.
SPEAKER_05: Should we look at some of these demos and talk about how they managed to get this to work? Yeah. Let's start with Infinity AI.
SPEAKER_08: Infinity AI is a company in the current batch, and what they do is they make deepfake videos of a particular person. So, for example, they have an AI replica of Elon Musk, and you can just tell Infinity AI what you want Elon Musk to say, and they will produce a video of Elon Musk saying exactly that thing.
SPEAKER_07: You want to watch a demo?
SPEAKER_08: Yeah, let's see a demo. Let's watch the demo.
SPEAKER_03: Speaking of YC companies training their own models, did you guys see the Infinity AI demo last week?
SPEAKER_08: Yeah, they're a company in my group. Infinity allows people to make videos by just typing out a script.
SPEAKER_01: Wouldn't it be cool if this podcast were actually an Infinity AI video?
SPEAKER_05: That'd be super cool. You think they'd be up for that? Well, guys, I have a surprise for you.
SPEAKER_08: So special thanks to the Infinity AI team, who made a model of the Lightcone podcast. The way they did this is they literally just downloaded our YouTube videos from the first three episodes and trained their model on that. And the cool thing about these models now is that you don't need that much data, once you've trained the foundation model, to adapt it to learn a new person. So just the hour or so of YouTube video that we had was enough for them to get a really accurate representation.
SPEAKER_02: I could talk about another company, SyncLab. SyncLab is an API for creating real-time lip syncing. And the crazy thing about this team is that they trained the models on a single A100, and it's generating these kinds of results. So let's take a look at it.
SPEAKER_00: I'm guessing this guy doesn't actually speak Hindi. No. Okay. One thing I noticed is the lip syncing is extremely accurate. It really looks like he's actually speaking Hindi.
SPEAKER_02: Yeah, and if we put it in this framework that you were mentioning, Harj, with how YC companies do this, there are different vectors: computation, data set, and speed. So they kind of hacked all of those. For the data set, the clever thing they've done, which is unheard of for training a video and audio model with so few resources, is they compress a lot of the data and use low-res video. You don't need the high-res video, because if you have high res at 1080p versus, let's say, a 240p version, that's a quadratic factor less data, because it's two dimensions, right? So they've done that. The other thing that enabled them to move a lot faster is the deal that we did with Azure, where we have a dedicated GPU cluster for companies in the batch. They've been able to iterate 100 times faster than they were before the batch.
SPEAKER_04: So a lot of companies out there, they decide they need to do fine-tuning, they need access to GPUs, and they just can't get it. Or you've got to pay an arm and a leg and prepay for a year in advance, and maybe you'll get it in 2025. But if you're in the YC batch, it turns out you can get them.
SPEAKER_02: Yeah. You have a million in credits, and there's no contention for resources. You actually get instant access within 24 hours to a GPU cluster.
SPEAKER_08: And it's pretty cool, because YC invests half a million dollars, but I think all the companies in the YC batch that trained these models literally didn't have to touch the YC money to train them. That was all extra free money unrelated to the YC investment.
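To put the low-resolution data trick described above into rough numbers, here is a quick back-of-the-envelope sketch. The 30 fps frame rate and the 426x240 resolution for "240p" are assumptions for illustration; the exact figures are not stated in the episode.

```python
# Back-of-the-envelope: dropping from 1080p to ~240p cuts raw pixels per clip by
# roughly the square of the linear downscale factor, since both width and height shrink.
def raw_pixels(width: int, height: int, frames: int) -> int:
    return width * height * frames

frames_per_minute = 30 * 60                          # assumed 30 fps
hi = raw_pixels(1920, 1080, frames_per_minute)       # 1080p
lo = raw_pixels(426, 240, frames_per_minute)         # ~240p at the same aspect ratio

print(f"1080p pixels per minute: {hi:,}")
print(f" 240p pixels per minute: {lo:,}")
print(f"reduction factor: {hi / lo:.1f}x")           # ~20x fewer raw pixels to push through the model
```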
SPEAKER_08: Should we talk about Sonato? So Sonato is another company in the Winter 24 batch. They have built a text-to-song model. So you can give their model lyrics to a song and tell it who you want to perform the song. Like you can tell it, I want Taylor Swift to sing a birthday song for my dog, and it will make exactly that song. There are only two or three models in the world that have ever been trained that actually do this, and I think Sonato is actually the best one. Oh, wow. And the really cool thing is that the founders of Sonato are literally like 21 years old. So Harj, to your point about expertise, this was not built by PhD machine learning researchers who have been working in machine learning for 10 years or something. This is literally built by 21-year-old new college grads. And they built this thing in two months. Basically, they just taught themselves. They went online and figured out how to do it.
SPEAKER_05: That is very impressive. Should we take a look at it?
SPEAKER_08: Yeah. So this is a song that they made for the YC batch, and it's like a power march about Y Combinator. Yeah.
SPEAKER_07: Is this how we're going to open the batch? That's a good idea. We need big orange banners behind us. And we have to wear military garb.
SPEAKER_05: With orange armory.
SPEAKER_08: Garry, we could do our own song for demo day. Oh my god.
SPEAKER_04: AI-generated this time. I think we have to now.
SPEAKER_02: We have to.
SPEAKER_08: This is very impressive. One thing I really like about this is you can actually understand the lyrics. It really does do the lyrics, but it also really does sound like someone is singing it. This is the first time I've heard AI vocals like that. Yeah.
SPEAKER_02: And to your point, Jared, there's another company that also didn't have the expertise of a PhD in machine learning. It is called Metalware. They're building a copilot for hardware. These were founders who used to work as hardware engineers at SpaceX, and they had to build all these hardware designs, so they were familiar with building hardware. And when they came into the batch, they decided to build basically a copilot for hardware design. They didn't have much AI background, and they figured it out. One of the cool things about them is they also trained a foundation model for this, because there's no model available for it, and they were able to do it during the batch. In that same framework, the things that they hacked were data and computation. In terms of the data, they got away with using less data but higher quality. What they did is they took a bunch of figures and information from textbooks on hardware, and they scanned all of that and used it as input, which is clever, right? And because they didn't need as much data, they could choose to work with a model that's less computationally intensive. So they actually used GPT-2, which seems counterintuitive, because GPT-2 only has like 1 billion-plus parameters. I think it's 1.5 billion, right? Yeah. Versus GPT-4, which is like a trillion. And they were able to get away with using less computational resources because they used a smaller model and better data. And then they could do all these hardware design copilot tasks, which is really cool. So when you constrain a lot of your tasks, and you're very specific, and the data set is very high quality, that's another way you could hack and build a foundation model during the batch.
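As an illustration of the small-model, high-quality-data recipe described above, here is a minimal sketch of fine-tuning an off-the-shelf GPT-2 checkpoint on a small curated text corpus with the Hugging Face transformers library. The corpus file name and hyperparameters are hypothetical, and this is a generic recipe rather than Metalware's actual pipeline.

```python
# Minimal sketch (not Metalware's actual pipeline): fine-tune a small GPT-2-scale
# model on a small, curated, domain-specific text corpus.
from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          Trainer, TrainingArguments,
                          DataCollatorForLanguageModeling)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # "gpt2" is ~124M params; "gpt2-xl" is ~1.5B
tokenizer.pad_token = tokenizer.eos_token           # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical corpus of scanned textbook passages, one example per line.
raw = load_dataset("text", data_files={"train": "hardware_textbook_passages.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective

args = TrainingArguments(
    output_dir="hw-copilot-sketch",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=5e-5,
    logging_steps=50,
)

Trainer(model=model, args=args, train_dataset=tokenized["train"],
        data_collator=collator).train()
```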
SPEAKER_02: And there are all different kinds of applications, not just generating video or text. There's one that I'm really excited about in the current batch called Guide Labs. They're building an explainable foundation model. Because one of the things with all these foundation models and deep learning is this kind of black-box magic. Nobody knows what's going on. You put in the data and it predicts the label, and you have no idea how that happened. Prior to deep learning, you could, because you had the weights and could understand which features contributed to the label. So this team is building a foundation model that can explain its outputs. And they trained a model during the batch.
SPEAKER_05: Nice. As a founder, when is it the right call to invest in building your own model versus just using one of the existing open source models and fine-tuning and tweaking it to fit what you need?
SPEAKER_02: Well, I guess it depends, right? It depends on what you're really looking to build. If you're in a very specific, even niche, space, you can get away with training your own foundation model, like the Metalware guys. But if you're, let's say, doing something more with language, GPT-4 gets you quite far along. So it depends on the task too, right?
SPEAKER_05: So if we're thinking about it as data, compute, and expertise, we're basically saying expertise is maybe overrated. We're proving that if you're just smart and willing to read the papers, you can figure it out. For compute, being in YC is one way to get around that: you can get credits and take some of that cost off. And so then is the data piece sort of where all the edge is? Like, if you can find high-quality, but not giant, data sets, that's the hack?
SPEAKER_02: Oh, yes. Let's talk about Phind. So Phind is this company that's building a copilot for software. The answers that they're generating are even better than Stack Overflow.
SPEAKER_05: Oh, interesting.
SPEAKER_02: And these were also kids out of college without a lot of background. They'd done a very clever hack to build their own model, on the data side. They created a bunch of synthetic data for programming competitions, so they could have a bunch of those data sets generated. And that was a lot higher quality. Imagine that. It's basically infinite if it's synthetic.
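As one concrete way synthetic programming data can be made trustworthy, here is a minimal sketch of a common recipe: sample candidate solutions to a problem and keep only the ones that pass test cases. The generate_candidates function is a hypothetical stand-in for a model call and returns canned strings; this illustrates the general idea, not Phind's published pipeline.

```python
# Sketch of execution-filtered synthetic data for code. `generate_candidates` is a
# hypothetical stand-in for sampling solutions from a model; here it returns canned strings.
def generate_candidates(problem_prompt: str) -> list[str]:
    return [
        "def solve(xs):\n    return sorted(xs)",   # a correct candidate
        "def solve(xs):\n    return xs",           # a wrong candidate, to show filtering
    ]

def passes_tests(code: str, tests) -> bool:
    """Run a candidate solution and keep it only if every test case passes."""
    namespace = {}
    try:
        exec(code, namespace)                      # defines solve() inside `namespace`
        return all(namespace["solve"](x) == expected for x, expected in tests)
    except Exception:
        return False

problem = "Write solve(xs) that returns the list sorted in ascending order."
tests = [([3, 1, 2], [1, 2, 3]), ([], [])]

synthetic_examples = [
    {"prompt": problem, "solution": code}
    for code in generate_candidates(problem)
    if passes_tests(code, tests)                   # execution-based filtering
]
print(f"{len(synthetic_examples)} verified example(s) kept")   # 1
```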
SPEAKER_05: It's interesting, because I feel like synthetic data has been looked down on.
SPEAKER_08: It was controversial initially.
SPEAKER_05: Why was it originally controversial, and why does it actually seem to be working?
SPEAKER_08: It seemed circular. It seemed like it would be impossible for a model to generate its own data. How can you learn from data that you generated yourself? It wasn't obvious that such a thing could be possible. It seemed to violate some conservation of energy. I remember the meme that was going around on Twitter was the mosquito drinking its own blood, like, this is how synthetic data works. But then it turns out it actually works. Interesting.
SPEAKER_04: I think maybe this is related to the idea that some of these LLMs are actually capable of reasoning. And once you can reason, maybe that's the part that spins up the flywheel and makes it possible. And there are other interesting analogs; I think there's a healthy debate out there about whether or not this will come together. But you could look at how self-driving car models are often trained on massive amounts of simulation data instead of actual real drive time, sometimes by a factor of 10 to 1 or more. And that might end up being true for some of the generative AI models too.
SPEAKER_05: Is it possible that Sora will do that as well? Like, could Sora generate its own video to continue training and improving its own model?
SPEAKER_02: Probably. I know OpenAI doesn't share much about their data sources, because that's part of the secret sauce, but 100% they're using video footage that's generated from Unreal Engine, probably, or Unity, one of these game engines, because they have a full physics simulator. So you could create multiple scenes of the same thing. Say you have the example of the car driving on the cliff: they could generate it from multiple camera angles, because in a game engine you can position the camera anywhere and basically generate footage from all possible camera views.
SPEAKER_05: The physics part of this is really interesting. I feel like most people, when they see these Sora demos or just generally get this concept, their mind goes to, oh, this will be cool for generating films or video games, like entertainment. But if what you're saying is it can actually simulate the real world, there are probably going to be lots of further-reaching implications. What happens when you have a model that's capable of simulating real-world physics, and where does that apply?
SPEAKER_08: Well, I have a good example. This company, Atmo, which we funded in 2020, built their own foundational model for weather prediction. The way they did it is they trained a model on, I think, 90 terabytes of weather data. The traditional approach is a programmed-in physics model of the world, starting with the actual equations of physics.
SPEAKER_02: A giant polynomial.
SPEAKER_08: Yeah. It's effectively a giant polynomial. And it's so expensive to run, it has to run on a cluster of supercomputers. The only place in the world that actually runs this model is NOAA, the US government agency. They're the only ones with a supercomputer cluster that runs the physics model. And so every weather app that you go to, every weather channel, they're actually not predicting their own weather. They're just downloading the government prediction data and wrapping a nice UI around it.
There's only one actual physics simulation for weather in America, and no commercial company has been able to create their own version, because it's too expensive to do it the old-school, physics-based way. And so what's really cool about Atmo is that instead of using the old-school physics way, they've trained a foundational model. Using machine learning, it's like a million times more efficient to run the same calculation, or something like that. And because of that, this startup, which has only raised a seed round, is actually able to make a weather prediction model that is more accurate than the NOAA-funded one that cost over a billion dollars.
SPEAKER_05: Interesting. What's really surprising about text-to-video is just how far-reaching the implications are. So you can go way beyond just generating video games. We can do weather. What are other examples of cool things we could do if we have a physics simulator of the real world?
SPEAKER_08: Well, there are a bunch of companies that are applying it to biology. Diana, do you want to talk about a couple of those?
SPEAKER_02: Yeah. So it turns out all these foundation models are great function approximators for anything.
SPEAKER_08: Any function. They're general-purpose learning algorithms.
SPEAKER_02: And the human body can be simulated with functions too. So one of the companies that we funded is called Diffuse Bio. They're building generative AI for proteins. What they're doing is building these big models to be able to create new molecules for new types of drugs and new kinds of gene therapies. And in order to hack this aspect of making progress without as many resources, they had a lot of expertise. This is different from the set of founders we talked about who don't come from a background in AI. Namrata, the founder, published some very legit papers in Nature before this. She had a lot of expertise in terms of how to short-circuit the computation loop. What she did is build custom kernels for the model, so that the whole process of building the foundation model is a lot faster with fewer resources. So that's one. The other company in the current batch is Pyramidal.
SPEAKER_08: Do you want to talk about them?
SPEAKER_02: They're building a foundation model for the human brain. They're predicting EEG signals, which could be used for all sorts of applications, from predicting stroke to reading; at some point, your brain could be read, perhaps. And EEG signals are also temporal. So it's sort of like Sora: Sora has images, plus images over a timestamp, so there's video. EEG is the same thing. It's just an electrical impulse, but over a time period. So they do something similar with space-time chunking, but for EEG, and they're able to train this model. The way they were able to train and iterate during the batch is that they were experts in this space, and they also did a lot of hacks around the computation, where they found a way to divide a lot of the sequential data into chunks, sort of like what Sora has done. That actually reduced the runtime complexity by a quadratic factor, which is impressive. And they could get a single run of an iteration of an initial model with just 800 hours of GPU compute, which is really cool.
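To make the chunking point concrete, here is a rough cost comparison, assuming self-attention whose cost grows with the square of the sequence length versus attention restricted to fixed-size chunks. The sequence length and chunk size below are made-up numbers, not Pyramidal's.

```python
# Rough illustration: full self-attention over a length-n sequence does ~n^2 pairwise
# comparisons, while attending only within fixed-size chunks does ~(n / chunk) * chunk^2.
def full_attention_cost(n: int) -> int:
    return n * n

def chunked_attention_cost(n: int, chunk: int) -> int:
    return (n // chunk) * chunk * chunk

n, chunk = 100_000, 1_000          # e.g. a long EEG recording split into fixed windows (made-up sizes)
full = full_attention_cost(n)
chunked = chunked_attention_cost(n, chunk)
print(f"full:    {full:,}")        # 10,000,000,000
print(f"chunked: {chunked:,}")     # 100,000,000
print(f"speedup: {full // chunked}x")   # 100x fewer comparisons
```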
SPEAKER_08: One thing that's really cool about that is, if people sat down and tried to think of different applications for foundational models, EEG data would not be the one that would immediately come to mind. And to me, that suggests that there are probably a lot of other application areas like EEG data that people just haven't thought of yet.
SPEAKER_02: Yeah. It's like, who would have thought that EEG is sort of like video? It's just this whole concept of space-time. You can space-time lots of things.
SPEAKER_05: It's also possible that applications of AI that people thought would exist will now exist. Like robotics, I think, is a good one. Yeah, that's a huge one. You remember, I think we talked about this in a previous episode, about how when Sam was starting OpenAI, they originally thought that AI in robots and AI in the real world would be the first application.
SPEAKER_08: And I remember I went over to the OpenAI office in the first year or two, and they had all these robots trying to learn how to solve the Rubik's Cube by reinforcement learning. Which is also kind of an interesting side note, because OpenAI is so wildly successful right now that it's easy to think that they knew they had this straight-line path to get there, but it was definitely not that. It was a meandering path. They pursued a bunch of dead-end ideas, like the reinforcement learning robots, that didn't work.
SPEAKER_05: Even the researcher working on the transformer architecture at OpenAI was off in a corner, I think, at the start. It wasn't clear even within OpenAI that that was going to be the right thread to pull on. But so Sora and text-to-video in general are especially interesting, because again, if we have a real physics simulator for the world, that potentially getting plugged into robots is like a breakthrough to make the AI robot a reality. And we actually have a company in the current YC batch, Kscale Labs, that's working on consumer humanoid robots. That's cool. Yeah, and they have a pretty cool demo. It's very early, but a lot of the sci-fi stuff is actually now becoming possible.
SPEAKER_02: The cool thing about Ben, who's the founder of Kscale, is that he was the guy who built the foundation robotics model for Tesla.
SPEAKER_05: Yeah. Oh, cool. He put it into the Optimus Prime robot as well. Oh, awesome.
SPEAKER_02: The thing about the real world is that it's governed by the laws of physics, and it turns out we have a bunch of equations that can describe it for different things, like weather. There's also the space of, for example, this company that we funded called DraftAid, which is building AI models for CAD design. CAD follows a lot of the laws of physics, with Newton, right? With force, shear, etc. And a lot of the software behind SolidWorks and AutoCAD runs on these really old kernels that basically, again, solve these giant polynomials of lots of equations, so that when you design a structure and you want to calculate the forces and the tolerances, it's accurate, because you don't want a building to just flop, right?
And it's very expensive. Whenever you build all these models in CAD, these kernels are super old, and at the end of the day they run on equations that compile, I don't know, to some wild thing like Fortran, because they haven't been updated. What DraftAid is doing is short-circuiting some of these with AI models that can do some of the predictions, so it's a lot faster and cheaper in terms of computation. There's a lot of computational geometry behind the scenes.
SPEAKER_05: That's really cool. That's a perfect example of a valuable problem to solve that the general-purpose models just aren't going to get around to specializing in.
SPEAKER_08: That's a great point. And there are a lot of startups that are very worried that if they go into AI, they're going to get run over by OpenAI or other foundational model companies. And so one solution to that is to train your own model that's doing something else. Yeah, great point.
SPEAKER_04: There's actually a YC company called Playground, run by our friend Suhail Doshi, that is a good example of how you probably can go up against people who are really well funded and come up with something that is far better. What we're looking at here is the newest version, Playground 2.5. They're hot on the heels of Midjourney, and at the same time, the models that they've released to open source go toe-to-toe against the latest versions of Stable Diffusion, and in a lot of cases way outperform them. And they've done it on far less money than Stability AI and other teams in the space. So I think Suhail and Playground are really ones to watch, to go toe-to-toe with Midjourney and in the long run potentially beat it, because I would never bet against Suhail Doshi. That guy is a beast.
SPEAKER_08: The image quality is super impressive.
SPEAKER_02: That looks so cool. And maybe some of the audience would have thought that Suhail comes from an AI background, but he doesn't.
SPEAKER_04: Yeah, he started Mixpanel before, when he was 19.
SPEAKER_08: And Playground is also an interesting example of something that Harj was talking about last night, which is the phenomenon of companies pivoting into AI. Because Playground actually did not start with this idea. When it started, it was a completely different idea. And a couple of years in, after raising a bunch of money, Suhail hard-pivoted the thing into AI. And he literally just taught himself AI. I think he locked himself in his apartment for a month and just read AI papers. And then he built Playground.
SPEAKER_04: So don't be afraid. I mean, I think that's one of the most interesting things we've seen across many of these different examples: if you're looking for a reason why you can't succeed, guess what? You're right. But on the other hand, the field itself is so new, so brand new, that if you spend six or nine months literally reading every paper and then meeting all the people who are in the space, and they'll meet you, you can actually be on the cutting edge in relatively short order. And that's an incredible blessing. Totally.
SPEAKER_05: It's a really important message, actually, because we're all grateful to Sam and OpenAI for bringing this field forward and making all of this stuff possible. But at the same time, all of the news headlines tend to be around the companies that are raising huge amounts of money, or about Sam himself as a world celebrity at this point.
But you can actually compete with OpenAI for very valuable verticals and use cases by training your own model, without having to be Sam Altman or having $100 million.
SPEAKER_04: So we're out of time for today, but we could talk for hours about the crazy things that we're seeing in AI being built by people who are probably not that different from you, who's watching right now. A lot of the world right now is looking at people like Sam Altman and Dario Amodei and some of the luminary figures who have really pushed the whole space forward. But remember, all of these people started someplace. And we hope that Y Combinator might actually be the place for you to start, just like it was for Sam Altman back in the day. That's it. Catch you next time.