AI on Trial: Inside the NY Times vs. OpenAI Lawsuit with Cecilia Ziniti | E1874

Episode Summary

Episode Title: AI on Trial - Inside the NY Times vs. OpenAI Lawsuit with Cecilia Ziniti E1874

Key Points:

- The New York Times is suing OpenAI, alleging that a significant amount of NYT content was used to train ChatGPT without permission. By the guest's recollection of the complaint, NYT articles made up roughly 1-2% of the training data.
- The lawsuit also claims that ChatGPT has reproduced near-verbatim passages from NYT articles when prompted, showing about 100 examples.
- OpenAI will likely defend itself using fair use arguments. There is a 4-factor test for fair use that looks at the purpose/character of the use, the nature of the copyrighted work, the amount used, and the market impact.
- However, the scale and commercial nature of ChatGPT could undermine a fair use defense. The market impact seems clear as OpenAI is now valued at over $100 billion.
- There are several possibilities for how this lawsuit could play out: settlement, OpenAI winning or losing at trial, or dismissal. A market-based solution, in which OpenAI establishes a licensing scheme, seems most likely.
- This case will likely set an important precedent on the boundaries of copyright law when it comes to using content to train AI models, especially large language models. It remains uncharted legal territory.
- Open source alternatives could still replicate ChatGPT functionality even if OpenAI loses, so the impact may be limited. But this case will be hugely influential in setting expectations for AI companies using copyrighted content.

Episode Show Notes

This Week in Startups is brought to you by…

MEV. Tired of the dev shop rollercoaster? Mev is your reliable technical partner, offering a well-established software development process designed to consistently deliver unparalleled value to their clients. Get $30,000 off your first three months at http://www.mev.com/twist

Northwest Registered Agent. When starting your business, it's important to use a service that will actually help you. Northwest Registered Agent is that service. They'll form your company fast, give you the documents you need to open a business bank account, and even provide you with mail scanning and a business address to keep your personal privacy intact. Visit http://www.northwestregisteredagent.com/twist to get a 60% discount on your next LLC.

The Paintbrush Loan is the earliest startup financing on the internet. No pitch deck, no business plan, no minimum time in business, and no warm intros. Plus, you get to keep your equity. Visit http://www.getpaintbrush.com to see if you qualify for a $50K startup loan in less than 2 minutes.

*

Today’s show:

Cecilia joins Jason for an in-depth discussion about the New York Times versus OpenAI case, delving into the intricacies of fair use and analyzing the Fair Use Test (6:07), examining the legal complexities surrounding data scraping for Large Language Models (28:21), exploring possible ramifications of this legal confrontation between these titans (40:06), and more!

*

Timestamps:

(0:00) Cecilia Joins Jason

(2:44) Cecilia’s Background in Law

(3:44) Jumping into the case of NY Times vs OpenAI.

(6:07) Exploring Fair Use legal tests

(11:57) MEV - Get $30,000 off your first three months at http://www.mev.com/twist

(13:50) The case of Roy Orbison vs Two Live Crew and the music industry’s rules on fair use.

(19:04) Picking apart the defense of attribution.

(22:22) Northwest Registered Agent - Get a 60% discount on your next LLC at http://www.northwestregisteredagent.com/twist

(24:43) Fair Use Test: Factors two and three

(28:21) Legal challenges in data scraping for LLMs

(32:40) Paintbrush - Visit http://www.getpaintbrush.com to see if you qualify for a $50K startup loan in less than 2 minutes

(34:19) The fourth and final factor in the Fair Use Test.

(40:06) Potential outcomes of NY Times vs OpenAI case

(47:04) Oracle vs Google (the Java case) and legal discussions on digital platforms

(51:49) Jason shares a possible solution to this case and how the subscription wall could change things.

(57:54) Cecilia’s Grinch images on X

(1:02:22) Legal viewpoint regarding commercial vs non-commercial use.

(1:04:14) Summarizing where all this is going with the NY Times and OpenAI trial.

(1:12:18) Reviewing the market-based solution to this case.

*

Check out GC AI: https://getgc.ai

Website: ceciliaziniti.com

Check out Ziniti Law: https://www.zinitilaw.com/

Check out Cecilia’s Maven Course here: https://maven.com/ceciliaz

*

Thanks to our partners:

(11:57) MEV - Get $30,000 off your first three months at http://www.mev.com/twist

(22:22) Northwest Registered Agent - Get a 60% discount on your next LLC at http://www.northwestregisteredagent.com/twist

(32:40) Paintbrush - Visit http://www.getpaintbrush.com to see if you qualify for a $50K startup loan in less than 2 minutes

*

Follow Cecilia:

X: https://twitter.com/CeciliaZin

LinkedIn: https://www.linkedin.com/in/ceciliaziniti/

*

Follow Jason:

X: https://twitter.com/jason

Instagram: https://www.instagram.com/jason

LinkedIn: https://www.linkedin.com/in/jasoncalacanis

*

Great 2023 interviews: Steve Huffman, Brian Chesky, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland

*

Check out Jason’s suite of newsletters: https://substack.com/@calacanis

*

Follow TWiST:

Substack: https://twistartups.substack.com

Twitter: https://twitter.com/TWiStartups

YouTube: https://www.youtube.com/thisweekin

*

Subscribe to the Founder University Podcast: https://www.founder.university/podcast

Episode Transcript

SPEAKER_05: The music industry is a great example of, really, the market wins. And that's one of the points I made in the tweet that I think is important to think about when you think about this case: I'm not a doomer, in the sense that this isn't going to end AI. There's no universe where this case would end AI. And so the result is, do we end up with a licensing scheme? Like, is this Napster to iTunes, right? But to your point that it's going to be a fight and it's going to be a lot of discovery, I would predict that.

SPEAKER_01: This Week in Startups is brought to you by MEV. Tired of the dev shop rollercoaster? MEV is your reliable technical partner, offering a well-established software development process designed to consistently deliver unparalleled value to their clients. Get $30,000 off your first three months at MEV.com slash twist. Northwest Registered Agent. When starting your business, it's important to use a service that will actually help you. Northwest Registered Agent is that service. They'll form your company fast, give you the documents you need to open a business bank account, and even provide you with mail scanning and a business address to keep your personal privacy intact. Visit northwestregisteredagent.com slash twist to get a 60% discount on your next LLC. And the Paintbrush Loan is the earliest startup financing on the internet. No pitch deck, no business plan, no minimum time in business, and no warm intros. Plus, you get to keep your equity. Visit getpaintbrush.com to see if you qualify for a $50,000 startup loan in less than two minutes.

SPEAKER_02: All right, everybody, welcome back to This Week in Startups. You probably heard about this major New York Times lawsuit against OpenAI, you know, the makers of ChatGPT. This is really a groundbreaking lawsuit here. I think this is going to be the most important lawsuit that we've seen in AI, perhaps in technology ever. And so I wrote a blog post about it.
Some of you may have read it at my Substack, calacanis.substack.com. One of the great things about the X platform, formerly known as Twitter, is that you meet interesting new people. Well, one of those new people I met was Cecilia Ziniti. And she is an actual lawyer. And she did an incredible breakdown on her Twitter while I was writing my Substack. So I invited her to come here on This Week in Startups, so that we can break down what is happening in this lawsuit. And this is an absolutely critical episode for all founders, because you can get yourself in a lot of trouble if you don't follow the rules. And this is uncharted territory. I think you would agree. Welcome to the program, Cecilia.

SPEAKER_05: Thank you. Yeah, no, excited to be here. Thanks for having me.

SPEAKER_02: So just your bona fides, as it were. You wrote a great tweetstorm, by the way, and you have a background in legal. So maybe just share with the audience, you know, who you are and why you're taking the time to comment on this issue.

SPEAKER_05: I'm a lawyer for tech companies. I've been in tech since I joined Yahoo in the early 2000s, when they were still competing with Google, and I've always been interested in the legal side. And over the years, that's taken me to different places. I was at Morrison & Foerster, a big law firm, and represented Apple in Apple v. Samsung, which was a huge case of the day. From there, I joined Amazon, and they said, you have all this mobile phone experience. I thought, surely I'll be working on the Fire Phone. I get there, and they're like, no, we're going to have the more experienced attorneys on that. You're going to work on this device. It doesn't really work. It's called Doppler. And that turned out to be Alexa. And it was a great career move. So I was the first lawyer on Alexa, had a great experience there, and then went on to be a GC of different tech companies. You may have heard of Anki, an Andreessen Horowitz company; it was an early robotics company. I spent some time at Cruise.
And then most recently, I was the general counsel for Replit.

SPEAKER_02: Oh, wow. So what an incredible career thus far. Let's get into this case. Because this is a very unique case in the history, I think, of copyright. And correct me if I'm wrong. Having been in content my whole career as a journalist and publisher, with Silicon Alley Reporter, blogs, and Weblogs, Inc., I've dealt with a lot of these fair use claims. And I've dealt with a lot of copyright claims. I've dealt with cell phone manufacturers, you know, emailing us, oh my god, you have a leak, that's our copyrighted information, all this stuff. And so there's a lot to unpack here. But when you saw this lawsuit drop, and you started unpacking it, how important is this lawsuit? And what is the nature of the lawsuit, for people who, you know, maybe are new to this? Just briefly, what is the nature of this lawsuit? What is the New York Times claiming here?

SPEAKER_05: Yeah, so the New York Times has a content library, one of the few content holders more prolific than you, Jason, perhaps, going back to 1851. Right. So they reported on literally the Civil War, right. So that amount of content, millions of articles, the allegation is that those articles were used in a couple of ways by OpenAI without consent. So one way is training, right. So in the complaint, the New York Times actually breaks down that it was a decent percentage of the articles used to train OpenAI. I think, you know, in the, like, one or 2%, something where it's actually measurable. One random blog post that I wrote, you know, is not going to move the needle, but the entire New York Times archive, you know, maybe it does. And that's the allegation. So that's one. The second theory is more on the output side. So when you go to ChatGPT and you ask for an article, they've got this exhibit, the New York Times made this exhibit, Exhibit J. If you look it up, it's great.
SPEAKER_05: But essentially, it has 100 instances of somebody putting the first paragraph of an article in, and ChatGPT gives you the rest verbatim, you know, like, almost, with one or two word changes. But that is kind of a different theory. And it triggers the law differently, and we can get into that if it's of interest, but that's really the core of this.

SPEAKER_02: Yeah. And so the nature of fair use, I am very familiar with, because I've had many people claim that we used their content, let's say in a blog post or in this very podcast, where we might use a short snippet of a song where I'm doing commentary on it, or a clip of a news event that occurs. And so I'm pretty familiar with the four-part test. But maybe you could run our audience through the four-part test, because OpenAI, I think, believes that what they're doing is fair use. And then as part of that, I don't know that training as a concept has existed in copyright law. This idea of training something, I believe, is novel to copyright law. Am I correct in that one?

SPEAKER_05: That's right. There hasn't been at least an adjudicated case on training yet. There have been a lot of fair use cases in technology, and I think OpenAI and the New York Times will each point to the ones that go their way, but there hasn't been one on training that I'm aware of that's gotten to that point yet. But in terms of the fair use test, it's a super fun one. It's four factors, as you said, but they are non-exclusive. And it's very squishy. Judges apply it, not juries. Courts are directed to balance the interests, and they can consider other factors. And no one factor is, fancy word, dispositive. No one factor decides. So it really is something where there's a lot of discretion, and the optics of it, and, like, whether the judge wants to rule your way, tend to matter. So the four-part test is not, you can take 5%.

SPEAKER_02: It's not, you can take 12%.
It's not, you can monetize it a little bit over here. It's open for interpretation. Exactly. And you have to, as a judge, when you make these decisions, look at the totality of those four parts. So maybe let's get into those four parts and then go into some examples. Yeah, let's head in. I actually have a slide, if I can pull that up.

SPEAKER_02: All right. Awesome. Yeah, let's do it. I mean, wow, I love a guest with a slide deck. I love it.

SPEAKER_04: I wanted to be a law professor and then decide other things from world of credit. So this is my law professor dying to be free. But essentially, fun thing about fair use.

SPEAKER_05: The original fair use case was in the 1800s, and it was about writings that George Washington had. Another biographer copied 353 pages of Washington's original writing and lost. It was not fair use. 350 pages was too much.

SPEAKER_00: And then that opinion from the 1800s got codified into the Copyright Act.

SPEAKER_05: So here they are. Let's go through the four factors. So I used emoji, because this is the new generation, the zoomers will do that. But essentially, the first one is the purpose and character of the use. And this is really where all the play is in technology cases. So I've got here the emoji for theater, for, like, how are you using it, and the emoji for a video game controller, because video game cases are actually pretty instructive here. And then the emoji for the web, right. So what the court considers here is, how is the infringer using it? Are they making a commentary? Are they making a joke? Are they making a parody? Famous case: Perfect 10 versus Google. Perfect 10 was a pornographer and said to Google, hey, your thumbnails are infringing our content because they're literal copies that people see. Google defended, saying, we're using this for a different purpose.
You're not trying to be pornographic, actually entertained, when you're doing a thumbnail search. Maybe you are, but it's not a good substitute. Google won that case on this. So that was for Google search.

SPEAKER_02: Now, let's go through some of those cases. If you were doing commentary, there have been many cases where people will take a movie, or there'll be a documentary film about a movie, or they might use movie clips. And if you're doing commentary on that, even if it's commercial, there's some leeway allowed for that. And then there's parody. So if you made a parody movie, like Spaceballs, the famous Mel Brooks parody of Star Wars, you can make a parody, you can make a joke of something. And the test, I believe, like, the sub-test here, is the confusion of the audience: does the audience know who the original author is or not? So if Saturday Night Live does a parody, for two or three minutes, of Harry Potter, nobody is confused that that's actually Harry Potter. I mean, there, it's pretty obvious, right? So this is part of it. Whereas if I wrote my own fan fiction of Harry Potter, and it was really good, and it was a full book, you might be like, wait a second, I can't tell if JK Rowling did this or not. So there's something about the audience that matters here in this purpose as well. Correct?

SPEAKER_05: Yeah, so basically, fanfic is a great example. The examples that you gave actually implicate not just the audience's view, but really the full four-factor test. And the factors are kind of like an inverse scale, you know, when one goes up, another one goes down. But the case of Harry Potter is a great example. So JK Rowling sues fan sites, and she wins, because her stuff is so creative that, you know, if you have a fan site that says, okay, this is Hagrid, and has big chunks of paragraphs, and they're getting all this revenue, lots of clicks, you know, an SEO-optimized website that's a fan site.
JK Rowling testified, and she said, I mean, so creative to even testify this way. She's like, it's as if someone came into my plum pie I had cooked and picked the Greek plums out. And so it was, like, the creative aspects. Yeah, those were kind of what triggered the case.

SPEAKER_02: A lot of founders are great at going from zero to one. This takes vision, creativity, hustle, all that great stuff. But those same people often struggle with going from one to 100. If you want to scale, and you want to do it efficiently, you're going to need process and you need structure. And that starts with your product. So if your startup needs a more structured engineering approach, you need to check out MEV. MEV helps businesses build and maintain their products faster and more effectively. They'll make your product more stable, scalable and secure. They'll build custom infrastructure that scales, and they can help build additional features for your product and more for each of your needs. MEV organizes an entire tech team comprised of senior engineers, delivery managers, DevOps, QA and designers, and they've been in business for 17 years. And they've helped the following companies build complex tech products: Cartier, Tuitt and Ozempic maker Novo Nordisk, my favorite. So let MEV help you increase product velocity and make product engineering more sustainable. MEV is going to give you $30,000 off your first three months. That's right, get 10,000 off per month right now at MEV.com slash twist. That's MEV.com slash twist for $30,000 off your first three months.

SPEAKER_05: Interesting on the parody side, you know, the Supreme Court weighed in on it. I have a sound clip, if you're... Oh, let's do it. Okay. As I remember, when I was coming up in the industry, I always found this one fascinating: there was a game called Myst.

SPEAKER_02: It was a famous game. And then somebody made a parody of Myst, and they sold it as packaged software.
And people got really... they weren't confused by it. But you know, they had to make some concessions, I think. And most of these lawsuits, am I correct, are settled out of court. They don't go all the way. People just say, like, hey, this is not reasonable, this feels unfair. And then the other party says, okay, well, if we put "parody" on it, and we made these changes, would that be okay with you? And they kind of negotiate their way out of it.

SPEAKER_05: Yeah, exactly. They're pretty rare to go fully to the Supreme Court, or even just to court in general, because usually the parties work it out. They're expensive, unpredictable, etc. This one did go all the way. Commonly, it's record companies, or your Sonys of the world, your New York Times, that are pretty big and have, you know, kind of the pockets to do it, or a big financial reason to push it, right?

SPEAKER_02: They have a lot at stake.

SPEAKER_05: Exactly, exactly.

SPEAKER_02: So they have to hold the line in some ways, right? If you don't defend yourself in an instance, then the next instance, it becomes harder to defend yourself. Is that correct?

SPEAKER_05: That's right. Also, it's like, commonly, people who are trying to claim fair use, they sort of know in advance. And that's certainly the case with OpenAI. Like, they knew the copyright issue was coming. You know, one of the things the complaint says is that their board member, you know, Helen Toner, the one who departed, one of her issues with Sam Altman was not addressing copyright properly. So if you know that, then you hire people like me to help you, like, okay, where are the edges? How can we win on these different factors? And so yeah, so in this case, this one is Pretty Woman. So the classic Roy Orbison song, the guitar riff is pretty recognizable. And 2 Live Crew made a version of it that I can play part of. Sure. Let me see if you can get the audio. Let's see. Yeah. Is that coming through?

SPEAKER_02: Yeah, it's coming through.
And now we're gonna get a copyright claim here. Exactly. You too. I'll just play it just enough to understand. We will defend it as fair use because we're doing commentary on this. Exactly. Yeah. Right. I mean, it's literally the same, right?

SPEAKER_05: Yeah. It's a cover song in a way. It's like a cover. Yeah. So what they did, and this is kind of interesting, is, you know, instead of sort of the "Oh, Pretty Woman" lyric repeated, they made it "Oh, hairy woman," "Oh, bald woman." It's kind of a raunchy song. But the point is, it's different enough that the argument was made that, like, this is a commentary on, you know, sort of society and wealth. You could argue that 2 Live Crew was just in a different societal place than Roy Orbison in the 60s, or whenever he wrote the song. And so it went to the Supreme Court, and this case stands for the proposition that a use can be fair, it can be parody or commentary, even if you're making money. Like, you know, this is 2 Live Crew. They were not professors, they were not, you know, just, like, writing a blog no one would read. They were selling music. And so that case was important. And it got really into the four factors. So it's considered one of, like, the canonical cases.

SPEAKER_02: And how did that case work out? Did it wind up in a settlement, I would assume?

SPEAKER_05: You know, after the Supreme Court ruled in 2 Live Crew's favor, I'm not sure what happened. I think the Roy Orbison estate, like, lost rights to the work or something like that. But it turned out to be a sad thing for Roy Orbison in the end.

SPEAKER_02: And now when people do samples, the music industry is the toughest. They're the most hardcore, because it's a small group of people and they work together in unison.
You know, let's be honest, they're just super sharp-elbowed people. They've always been, in terms of IP. So they have said, like, hey, listen, you want to do a cover? Here's the mechanicals and the licensing for that. Hey, you want to use the sample, you have to have permission in advance. And then, hey, you can do a sample. And then Kanye just did a Backstreet Boys cover. And people were wondering how he got the rights to the sample. It wasn't a sample, it was a cover. And so they have their own little mechanics and traditions in the music industry, or standards, right, that they've established exactly for this.

SPEAKER_05: I mean, the music industry is a great example of, really, the market wins. And that's one of the points I made in the tweet that I think is important to think about when you think about this case: I'm not a doomer, in the sense that this isn't going to end AI. Like, there's no universe where this case would end AI. And so the result is, do we end up with a licensing scheme? Like, is this Napster to iTunes, right? So Napster comes out, you know, in the time I was in college, and it was like, you could get the entire Beatles library from somebody in the dorm next door. Like, you know, it was clearly, like, it felt sort of bad. Yeah, it felt like stealing.

SPEAKER_02: It felt like stealing, right? Well, because there was no difference between downloading on Napster or downloading on iTunes or buying a CD. You did one in place of the other.

SPEAKER_05: That's exactly right. And iTunes came up after, right? It was like, okay, this is a legitimate way to pay for digital music, and for people like, you know, me or you or whomever, it didn't feel like stealing when you paid 99 cents. And it wasn't, when you paid for it on iTunes or Spotify or wherever now. And that industry has come out of that.
So you can see, you know, with OpenAI, they could have a system where they figure out kind of the provenance of different outputs and pay in some way. Or, okay, you want me to do a verbatim Luigi? All right, there's your five cents to Nintendo or whatever. Like, this is not... the tech will find a way. I'm very confident in that.

SPEAKER_02: This argument by technologists that it is too hard to do attribution is nonsense. I mean, if you can honestly agree. Yeah, I agree. I mean, if you can create this incredible AI that's able to make images, you should be able to figure out what was the source of those images. And if you can go find these libraries of content to train it on, and then train these very sophisticated things and set up 10,000 computers or 100,000 computers and billions of dollars' worth of computers with thousands of engineers, I think you can figure out attribution. It's not that hard. And in fact, there are services that are already out that are in the ChatGPT mode, which actually do citation. So the market has already proven it's possible. Let's talk about this one piece of the four-part test. You've got the purpose and the character of your use. Is it parody? Is it education? In education, if you're not making money, or in society, if you're doing commentary, you get a bit of protection. We want that in society. We want Mel Brooks to be able to make jokes, got it. We want a professor to be able to show, you know, Star Wars and give commentary in a class in a non-commercial setting for people to learn. It's not going to compete with people, right? So that's all really good stuff. That's good stuff that we want in society. We also want people to be able to make fun of things and do commentary. So if Jon Stewart or John Oliver want to take, I don't know, a talk that, you know, President Trump or President Biden did, and they want to make fun of it and use parts of it, we want them to be able to be mocked in a free society.
And that doesn't kind of conflict with anything. So we understand those.

SPEAKER_05: Yeah, kind of fun. One final point on that is that it comes from the Constitution. So copyright law is actually federal law. And the IP clause is Article One, Section Eight, Clause Eight. And it says, to promote the progress of science and the useful arts, Congress can secure limited monopolies for authors and inventors. And that initial "to promote the progress of science and the arts," that's been used by courts to limit copyright. So copyright could go really, really far, like, you could never allow copying. But there's that idea that it really is about societal progress. That's also what helps tech companies, right? So that's also why Google won the thumbnails case, because they're like, look, you know, it's super useful to have search. How else are you going to have image search if you don't know what image is actually in the results?

SPEAKER_02: Right. And they also had the argument, I think, in that case that they were doing very tiny images, a smaller percentage of the original work, and that they weren't taking every image. I believe it was like, we're only taking a small amount of it. And then they also, I think, had the sort of ultimate rebuttal. I think they created robots.txt around that time, where you just say, you know, I don't want my site indexed, and Google's like, if you don't want to be in the index, you don't have to be. Exactly. Perfect 10 then could just not be in the index, and problem solved. So then they had to make the trade-off: okay, I give a little bit of my content, a thumbnail image of, you know, some photo of an adult nature, but I get some traffic, so maybe it's worth it. Then the copyright holder can make that decision. Just like, I think, Star Wars. Lucas was very cool with fan fiction, and fan movies even, as long as you didn't try to monetize it.
Starting a business used to be a pain. You needed a lawyer, there were hidden fees, it was a mess. Now, with Northwest Registered Agent, it only takes 10 clicks and 10 minutes. Northwest provides everything you need to start and maintain your business. Every LLC, corporation or nonprofit that Northwest forms comes equipped with registered agent service, a business address, a website and hosting, email, and a phone number. And this is all covered by Northwest privacy by default. Again, your full business identity will be live in 10 minutes, and in 10 clicks. So here's your call to action: for $39 plus state fees, they'll form your LLC, corporation or nonprofit and launch your business in just minutes. Visit northwestregisteredagent.com slash twist today. That's northwestregisteredagent.com slash twist today.

So if you go on to YouTube right now, you can watch all these really creative kids running around dressed as Jedi, fighting each other and releasing episodes. They don't get cease and desists. But JK Rowling might say, hey, with my art, I want a different standard.

SPEAKER_05: Exactly. And you know, OpenAI, to the point that, you know, you get lawyered up and you kind of realize what you have to fight about, OpenAI has been savvy about this. And they announced in the summertime that they'll respect robots.txt going forward. And so these kinds of systems where you're giving the owner control, that's going to be the kind of thing that OpenAI will argue, you know, matters here. And they're doing that, you know, partly informed by precedent, but partly also because, from an economic standpoint, it's the right thing to do. Okay, you own your content, you have this bundle of rights. You want to license your content to make a Harry Potter restaurant? Fine. I mean, that was another case, actually. Funny enough, somebody tried to do a console. So it was a restaurant, actually, a SpongeBob one. So in SpongeBob there's the Krusty Krab, which is a SpongeBob character.
And there was a restaurant in Houston called the Rusty Crab. And they tried to say, oh, this is social commentary. But it really wasn't. It was just a SpongeBob restaurant. And so Viacom went after them and won.

SPEAKER_02: So now the percentage of the work matters in this fair use test as well. Correct?

SPEAKER_05: That's right. Yeah. So it's called... the second factor is the amount and substantiality of the portion used. Okay. And I could pull up the slide. That's the third factor. I got confused here. I'm also not an AI. One second. We've proven it. Exactly, exactly. Like, lots of ways. So, nature of the copyrighted work, that's factor two. It doesn't get a ton of play, because, you know, most work is creative. But in this case, the New York Times anticipated this issue. And they've got a long thing in the complaint about how creative their journalism is. And they're right. Like, you know, they spend a lot of time. I mean, obviously, you know, you've been a journalist, it's not just pure facts. There's a lot of ways. And funny enough, I didn't expect this, but in response to my tweet thread, there were a lot of political things. It was like, oh, my goodness, I can't believe the New York Times is such a chunk of OpenAI's training data. That's why, you know... so, whoa, whoa, exactly. Yeah, exactly.

SPEAKER_02: But this is interesting: facts and data points are very hard to copyright. So there's, like, a website I use often, which makes beautiful graphs. I forgot the name of it, but it comes up all the time. It's like Our World in Data or something like that. There's Our World in Data. And then there's another one that comes up in SEO. And all this company does, and they charge, like, a subscription for it, is take other people's data and make a very beautiful standardized chart.
You know, I was looking for some market maps for one of my investments, or some market sizing, and it had, you know, SPEAKER_05: it was like something super obscure. It was like the world button market or something, and it had like all the countries. Yeah, yeah. Yeah. And then you look in the credit, it says source, SPEAKER_02: you know, this is, you know, Pew Research, Pew data. This is from this data. So you can literally make any chart you want on anybody else's data as long as, and I think in terms of fairness, you just put that that's the source of the data. But data is not copyrightable. Is this correct? Like, facts and data are not copyrightable? Yeah, so a couple of big cases on that one. One was SPEAKER_03: Feist versus Rural Telephone. And it's kind of a little antique, but essentially, one SPEAKER_05: telephone maker took the phone numbers and names from another, made their own phone book, put their own ads in it. And essentially, that case was pretty important because the Supreme Court said, look, copyright is not about labor. It's not about the work you put in. It's about the creativity. Remember: to promote the progress of science and the useful arts. Is copying a telephone book really about creative progress? Now, there's other ways, maybe contract or other ways, that you could go after it. But in your scenario, Pew, you know, if it's reported as a fact, you know, percentage of Americans on the internet every day or whatever it is, then you would be able to use that compilation. Now, there's some nuance around creativity in the compilation. So the other big, big case on facts is Oracle versus Google, right? So that went to the Supreme Court. Google copied Oracle's, I think it was, declaring code. And so essentially, in order to be able to have Java on Chrome, they did that copying, and the Supreme Court, that was heavily litigated over years.
I think 10 or 11 years, but in any event the Supreme Court found for Google on that case, SPEAKER_05: so the nature of the work? Yes. Statista is the name of the website that we sometimes use. SPEAKER_02: That's exactly what it is. And then there was another one, eMarketer, and they've gotten in all kinds of like legal letter kind of trouble, I believe. I remember seeing it, I'm not sure which one had that. But then like, you know, other kind of reblogging sites started doing the same thing. So if you want to make a great business, you can just take other people's facts and make beautiful graphs out of them. You see people do that all the time. But that makes sense. And then scraping data. There was an Israeli company that was scraping LinkedIn data. And they were saying, SPEAKER_02: hey, this is just facts. That's another area, scraping and fair use there. I don't know if you've seen many cases there, but yeah, then you get into international jurisdictions. Like, what people think in Japan, India, you know, the Middle East and Europe could be very different. The jurisdiction could be very different in how you use it. I think LinkedIn and Microsoft sued this Israeli company and lost. Yeah, yeah, it's interesting. I mean, with scraping, you know, SPEAKER_05: thinking about sort of the startup angle, some of it is also contract law. Like, I've seen scraping cases get argued on, like, you're literally trespassing. And this is why also, you know, some of the technical means matter. Like, you know, if you're scraping in such a way that you're like DDoSing the site, or you're hitting it so much, then of course there's other claims against you. And, you know, every website's terms of use has an anti-scraping clause. I've started to see in my practice that a lot of companies now are putting in, you know, it's against the terms of use for you to use our data for training.
And, you know, you can imagine, you know, in vertical AI, you know, people doing, SPEAKER_05: you know, I don't know, AI for doctors. There's a website called Doximity, which is like the LinkedIn for doctors. Okay, if you scrape that content, you know, maybe you're individually doing it, you're breaking the terms of use, especially if you're doing it, you know, logged in. So there's other theories. But yeah, the LinkedIn case was a big SPEAKER_05: one, and scraping overall, you know, certainly in e-commerce, everybody does it. But knowing the price of a product, the price of a product across 10 different websites across 100 SPEAKER_02: different days, doesn't feel like the nature of that copyrighted work... it's not like some artist invested a lot of time in it. Now, if you sent 10 people to the front in the Ukraine, or in Ukraine, rather, sorry, and, you know, you spent a million dollars putting them there for six months, you know, that's a whole different ball of wax. There's a lot of work. And that's what the New York Times is claiming here. The amount and substantiality of the portion used, that's the third part of the test. What does that mean? Yeah, so this is getting at, are particular SPEAKER_05: phrases copyrightable? So an interesting one here: Taylor Swift with Shake It Off. She said something like players gonna play. And there was a rap song called Players Gonna Play some years prior, and they sued Taylor, but she won, or at least the case went away, they agreed to drop it. Because there's not that many ways to say that concept. So there's this thing called the merger doctrine. And this is actually an issue where, if there's only so many ways to do something, SPEAKER_05: then you can't copyright that thing. Yeah, but this is a fun one. And I have a visual on this one that I think is funny. And it's actually a doll. It was the Seventh Circuit. So that's the circuit over Chicago.
And there was this company, talking about e-commerce, that made, apparently SPEAKER_05: very lucratively, to the surprise of the court, which they say in the opinion, basically farting dolls. Like, you buy it at, you know, the mall or wherever, and you get a doll and it makes a big noise. The doll on the right, basically the makers of that doll had gone to like a toy show or something, seen it and copied it. So a copyright suit was brought by the makers of the guy SPEAKER_05: in the green chair. And the court said, look, the concept of a farting doll, that's not copyrightable. I can go make one. But the court has this amazing paragraph in the opinion that's like, they could have given him a mullet, they could have given him flannel, they could have put him standing up, they could have had him wearing boxer shorts, you know, whatever. Like, the point is, with these little details, too substantial a portion of the original was used. And it was not tied to the idea. Like, the idea itself, you can express it a bunch of ways. So in this case, in the OpenAI case, you know, the art stuff is super fun because it's visual. So there's been a whole meme, and I did another tweet about it, SPEAKER_05: about Super Mario and Luigi. Yeah, you know, if you ask for an Italian plumber, that's what you're getting. Yeah, there's other ways to have an Italian plumber, right? Maybe he's SPEAKER_05: really stylish and wears Prada. You could make an anime version of it. But the fact is the most SPEAKER_02: iconic one that has had a lot of money invested in it was by Nintendo. Listen, not every business SPEAKER_02: is venture scale. If you're not, you won't be able to raise money from VCs. We all know that. And not everybody has a rich family member to do their friends and family round. So if you want to jumpstart your business with $50,000, let me tell you about Paintbrush loans.
Paintbrush has created a new kind of loan product. They connect idea-stage startups with bank capital. So you don't need to give up any equity, and there's no pitch deck or revenue required, and the Paintbrush loan is available at the idea stage. In fact, you can apply the moment you incorporate your company. Monthly repayment is a flat, predictable amount, which makes cash flow planning really simple. So here's your call to action. If you're a founder in the US, go to getpaintbrush.com to see if you qualify for a $50,000 startup loan in less than two minutes. That's getpaintbrush.com to see if you qualify in less than two minutes. One of the things I tell young founders or people in content is, like, if it feels unfair, then perhaps it is, and you have to have empathy and take into account what the other party is going to think their opportunity would be. And I think this gets us to the fourth part of the test, which is, if a new product or service is going to be made from J.K. Rowling's books, or from the New York Times archive, who deserves that opportunity? Am I correct? That's the fourth part of this test. Exactly. And you're correct in two ways, SPEAKER_05: both on the test and on the feeling. Like, you know, I've been practicing a long time, and a lot of these cases really are like you said: Napster felt kind of wrong, and it kind of was, right? And so it does turn on that. But in terms of the factors, let me pull that back up. I was very proud of my emoji. Emojis are great. But yeah, so basically, like, you see the flying SPEAKER_04: money, but it's literally like, what is the market for the original? Yes, and the value of that SPEAKER_05: market and who gets to exploit that, right? So intellectual property is similar to regular property, right? If you have a piece of land, who gets to put a hotel on it, right? If you have this really juicy piece of land, right.
So similarly, here, the New York Times got at this factor by saying, OpenAI has already made deals for this, they know how to license data. Like, they'd already done it with Politico, they've already done it with the AP. It's not like there's no market for this. And so, you know, in others, in the thumbnail case, that was harder to prove. There was evidence put forward, oh, you can use a thumbnail for, you know, at the time we had those Nokia flip phones with the tiny little lock screen. And it was like, okay, there's a market for that. But evidence came out in that case that those were fake licensing deals done just for the litigation. Here, clearly not. So SPEAKER_00: SPEAKER_05: yeah, so that's another factor. And then, like I said, it's very squishy, four factors, kind of, and they're not exhaustive. So, you know, a court could say, you know, there's four factors, but I'm going to introduce a public good factor, you know, like, basically make one up. For the SPEAKER_02: effect of the use on the market for the original, or its value, this is where I'm going to do a follow-up post to my original post, which is: I pay as a user 20 bucks a month or so, 20, 30 bucks a month for the New York Times, and I pay 20, 30 bucks a month for ChatGPT 4. I'm paying for both. And I recently was going to, and I'm a huge fan of the Wirecutter, and I am like a crazy product research guy. I just love researching products, restaurants, etc. I use Yelp, I use everything. And I love Wirecutter. In fact, I tried to buy Wirecutter, invest in it, back in the day before they sold it to the New York Times. And so I did a search for coffee grinders and some other stuff. And I actually did not believe ChatGPT and Claude and a couple of the other ones, and I was just testing it. And it was pretty clear that they got their information from Wirecutter because, you know, the answers were very similar.
I am like, I think, the tip of the spear here. If I get my New York Times and my Wirecutter from ChatGPT 4, I might cancel my New York Times over time. Is it not the case that the product OpenAI built, the ability to use a chatbot to talk to the archive of the New York Times, that is the New York Times's opportunity, not OpenAI's? Yeah. So you know, SPEAKER_05: the example I use, you know, Martha Stewart, great media conglomerate, and she's very, very savvy, very tech forward. She was talking a year ago, right after ChatGPT came out, about creating a Martha AI that you can talk with, because she's similar to the New York Times. She's got decades of really high quality content in a particular voice, right? And so, yeah, that's one way. Another way is, the New York Times made the case in the complaint that they calibrate very carefully what's free versus paid, right? You know, the amount of gift things that you have, you know, if you click from Instagram, sometimes you'll get the gift version. Basically, like, that's theirs, like, the rights holder's property to exploit, essentially, is how they would say it. And how they subdivide it and where they put that line, how much admission they charge, all of those things they would say are within their rights. Now, OpenAI would probably make arguments to you, okay, similar to what they make under the first factor, which is, you know, it's a different thing. It's new. You know, having an LLM, specifically a very large language model trained with that number of parameters, the amount of investment that they've made, they've changed it into something different to where it has a different purpose. You're going to ChatGPT to have generation as opposed to having, you know, pre-made research on a particular thing. That would be their claim, and I'm trying to take their claim seriously here, in fairness. Yeah, SPEAKER_02: their claim is, hey, we did this first, we made a language model first.
Therefore, since the New SPEAKER_02: York Times, they didn't get to it yet, this is new. But doesn't the New York Times have an unlimited amount of time to exploit their own content? Like, yeah, or they could not do SPEAKER_02: it. So say Disney said, you know what, we bought Marvel, and, you know, we haven't made a Marvel theme park ride yet. That doesn't mean somebody else gets to make the Marvel theme park ride. Exactly. Yeah, no. So the time aspect, if I mentioned a time aspect, that would be SPEAKER_05: I misspoke, but it's not... I know you didn't mention it. I was just building on your thoughts on it, SPEAKER_02: which is, they're saying, hey, we spent all this money to build this thing. It's like, yeah, you did. We are planning on building it as well at some point. Therefore, it's our opportunity. So SPEAKER_02: I think that one fails miserably. Now, where I think they could go is to say, like, we're not trying SPEAKER_05: to replace the content. They could say, our aim is to just merely have the best language model possible. And therefore, you know, literally the higher the volume of text that we use, kind of the better. And, you know, it's going to be more like the Java, the Oracle case that I mentioned, where, you know, Google's argument was like, Java programmers already know how to declare these variables, so we're going to copy the declaring code to literally advance the progress of engineering. And so OpenAI could say something like, it's a different thing. You know, we use the New York Times content and other content for training to advance kind of the state of the art of the actual LLM, how it generates words, how it's better. And people can kind of see this. And what they would say is like, look, GPT-3.5 and GPT-4, very different. GPT-4 is much better, because we did more training on more content. Ergo, it's not the actual content itself or the creativity of the content. It's just the fact of having content.
That's another way they could take it. Based on your gut, let's say this goes to the mat. And we went SPEAKER_02: through this four-part test. Based on your gut, percentage-wise, the New York Times wins their argument that you can't train on our data, and there has to be an injunction. What are the chances that happens? I mean, I'm really putting you on the spot here. The odds of an injunction are very SPEAKER_05: slim. The standard for an injunction is that it causes irreparable harm to whoever it is, the plaintiff, and it's harm that cannot be fixed with money. And so there's very few harms, really, that can't be fixed with money. And then the test for an injunction is another four-factor test. And so when it's sort of close, and when you have a technology that definitely has societal benefits, so, you know, OpenAI will say, look, we've got people, you know, diagnosing things with ChatGPT, we've, you know, saved marriages, whatever, all the amazing stories, which honestly, like, sure, productivity boom, you know, I use it every day, like, I'm a huge user of it. And so just on that societal benefit, I would say it's very, very unlikely. So let's work backwards from that: injunction, less than 10% chance? SPEAKER_05: Yeah, I would say less than 10%. I think that's not zero. I mean, you could have something SPEAKER_05: very strange, like, you know, like in the Apple case, the Apple patent case, it went all the way to Biden and stuff. So you could, maybe, but I think it's unlikely. So if they were to SPEAKER_02: lose, then you would be in damages, but then they would also have to remove it, right? That's a possibility, that the settlement could be that they have to retrain things and take the New York Times out of it. Yeah, I mean, it could be. That being said, SPEAKER_05: you know, GPT-4.5 is rumored to be coming out.
And so it could be that the source gets straight to that. And they've known about this case for a while. Like, the complaint says they've been negotiating since April. So my guess is OpenAI has probably already kind of firewalled off the New York Times content. Got it. The New York Times and, I think, 535 other journalism publications, SPEAKER_05: you know, everything from, you know, down to like the St. Louis Post-Dispatch, have put themselves on that do-not-train list. And so I don't know. But it's very tough for an LLM already developed, right? It's back to Jake. Yeah, well, he's example, like, you can't put the plums out of the case. In this case, it's almost like, I don't know, it's a baked cake. And it's like the vanilla, SPEAKER_05: like, how are you gonna get the vanilla out of a baked cake? Like, SPEAKER_02: if we know that they trained it on one or 2% in the first versions, and they should be able to determine that because there's going to be discovery in this case, and this case is going to keep going. I don't believe there's going to be a settlement. I think the New York Times is going to take this to the mat. Yeah. Because I think they regret not taking it to the mat with Google back in the day. So this is, I think, existential for them, or they view it as such. Therefore, they're going to go to the mat. Therefore, there will be discovery. And in discovery, there will be Slack messages or emails or conversations about what are we going to SPEAKER_02: include. And they're going to have that open crawl. And what they put in that open crawl is going to be plain as day. It's going to be on a hard drive somewhere. And then you'll see people talking about it, saying, the New York Times is really high quality, we should move their weight up. And we should make this like more important than, say, Business Insider, which is a lot of like, focacco, nonsense.
And then, you know, oh, and then there's like 4chan, or Reddit, like, maybe we'll make those a little bit, you know, less valid, or maybe make them more valid, who knows, in the case of Reddit. So that's all going to exist in discovery. And that's going to be super damaging, is it not? SPEAKER_02: And then the discovery part of this could be explosive. Yeah, I mean, it could be super SPEAKER_05: damaging, but it also could be helpful, right? So I mean, OpenAI, they went for the nonprofit model in part because they saw, I mean, copyright is one flavor of issues, but they saw certainly societal issues. And so, you know, I've, you know, interacted with OpenAI. Replit had a deal with OpenAI going back to 2020. So they're pretty thoughtful. So I mean, yes, you could get... but discovery is always a wild card. You could get crazy emails. In the Google case that I mentioned, the Oracle Google case, early on there was like $150,000 worth of litigation over one email from an engineer saying, hey, I don't see any way to get out of this without licensing from Oracle. In there, yeah, exactly. And it was an email to, you know, SPEAKER_05: like Larry and Sergey, and it was like the Lindholm email, and it was like famous. And this, you know, director of engineering had his moment in the sun from that email. So yeah, this is a reminder: never discuss legal topics on electronic communications. SPEAKER_02: Always on the phone or on the thread. Like, I actually use it to teach privilege, because SPEAKER_05: he cc'd a lawyer, but it wasn't to a lawyer. He wasn't asking for legal advice. He was declaring. So it's a fun one. But yeah, I could show the email if you want. SPEAKER_02: Oh, yeah, that'd be great. The issue here, though, is OpenAI can't have their cake and eat it, too. They can't be selling billions of dollars in secondary and claiming to be a nonprofit for the good of the world.
When literally the same executives who, I'm going to use the word, liberated, or took SPEAKER_02: without permission the New York Times, are the ones cashing in their shares at a $100 billion valuation. And it's pretty logical: if the New York Times was 2% of the training data, and if it was, let's say, the best of the training data, and they said this is five times better than anything else, okay, that's 10% of the good stuff. Okay, 10% of 100 billion is 10 billion. So we want 10 billion. Or if this thing's going to grow to a trillion, we want 10% of the value of the company. And when it becomes worth a trillion, you know, we get 100 billion. Yeah, I mean, in these kinds of cases, like, you know, it's always, and this is where, SPEAKER_05: you know, I love being a lawyer. Like, I love you being a lawyer. Thanks. No, it's like, I think SPEAKER_05: it's where the advocacy really matters. So, you know, one of the things I worked on early in my career was the Apple Samsung case, and I was the associate on damages and figuring out, okay, what's the value of a rounded corner on a phone? Like, how do you assess that? And so similarly, here, there's a lot of unknowns. We don't know how valuable OpenAI is going to be. We all think it's going to be worth trillions, but we don't really know. Meta could break out, or one of the others could break out. It can become worthless. Exactly, exactly. Or, you know, a lot of the research I've seen in the last maybe couple of months is that AI can generate its own training data. So there's people literally saying that we don't even need the New York Times anymore, we can use the AI we have to write the New York Times. Exactly. And so that's like another thing. So it's really, um, you know, it definitely is not for the faint of heart or stomach. There's a billion ways to argue anything.
But here, let me show you this Lindholm email, because it's so fun. One second. There's always somebody on the SPEAKER_02: SPEAKER_04: SPEAKER_02: staff, while you pull it up, that thinks they're an attorney, like me, because I'm sitting here with my non-legal degree, but I've got a lot of experience. And I always tell my team members, like, you're not an attorney, do not talk about any legal issues ever. We have a phone call and talk to an attorney, but be careful. Okay, here we go. Exactly. So this one. So this is from Tim SPEAKER_05: Lindholm, who was an engineering director, and he sends it to Andy Rubin. And then Ben Lee was a SPEAKER_05: lawyer at Google. But he says, context for discussion, what we're trying to do. He calls it attorney work product, which, again, he's not an attorney. So he tried. He calls it confidential. SPEAKER_05: And then he says, this is a short pre-read for the call. And then this is the famous line that got a lot of play in litigation here in San Francisco: what we've actually been asked to do by Larry and Sergey is to investigate what technical alternatives exist to Java for Android and Chrome. We've been over a bunch of these and think they all suck. We conclude that we need to negotiate a license for Java under the terms we need. And that was the key issue in the case. Funny enough, to your point of going to the mat, Google ended up losing on this at the trial level, but went up to the Supreme Court and ended up winning over the needed copying. But to your point, this is, like, it's going to be a fight and it's going to be a lot of discovery. I would predict that. All right. So what else are we missing here? Because you in your deck had some of the examples, SPEAKER_02: which I think are super compelling. And Chris technologists, you know, you work with technologists, they tend to, a portion of them think, if I can technically figure out how to do something, it's legal. Or it should be.
I don't know what to call this. But like, it's sort of might is right. If I can technically figure out how to scrape your website and create this or create that, well, then it should be legal, which is how the Napster folks felt, like, well, we'll take it. And there's also the technical inevitability argument: well, it's going to happen, so we might as well do it. Yeah, there is something to that. So one of the cases was an emulator. So Sony SPEAKER_05: is a very common either plaintiff or defendant in IP cases, because they have a lot of valuable IP. And basically, someone made an emulator of a PlayStation, an early PlayStation, on a PC, and the graphics were actually technically better on the PC. And that case went to the Ninth SPEAKER_05: Circuit, and the emulator maker won. And there was a lot of copying involved. They had to basically reverse engineer the entire PlayStation to be able to do it. And of course, they copied it, like, the literal bits and bytes of the code were put into the emulator. And so there is something to that. I mean, well, the great irony of this is that while SPEAKER_02: OpenAI is an organization you can sue, because it exists as an entity, the open source community is a little bit harder to sue because they don't exist as an entity. You have contributors. So maybe you could speak to that. Because let's say OpenAI does lose this case or settle, which I believe is what it's going to be, one of those two things. Massive settlement, nine figures minimum, is my prediction. It will not be disclosed, but it'll be at least nine figures, with some kind of licensing going forward.
But even if you were to do that, what's to stop, you know, all these open source projects out there, when somebody decides they're going to roll their own model as hardware gets better and better, from just ripping the New York Times? And you could buy the New York Times archive, probably, from somebody in India, in Manila, in Israel. There are scraping companies that sell these things on what I'll call the gray market: maybe illegal here, maybe legal there, maybe there's no laws there. So maybe you could speak to that, do you think? SPEAKER_02: Yeah, I mean, open source is an interesting angle. I mean, I think open source SPEAKER_05: had its own, you know, one of the things that was interesting in the wave of AI regulation we've seen, you know, from the EU and others was, you know, open source had a lot of the same SPEAKER_05: objections, like, you know, people had t-shirts with algorithms printed on them. They're like, okay, if we open source, then, you know, all these bad guys will get the code. But sort of a market worked out. Here, I think it's tougher. I think, you know, it's going to involve calls by the rights holders. And then what I think will happen is what we talked about at the top of the hour, which is, like, as the tech emerges, the market for it will emerge. We're going to get the iTunes equivalent. And I think there are some startups being funded in there. Before this case, I don't think it was being talked about enough to be a problem with a big enough market. But now it is. Here's a possible solution. Let me see what you think of this. I buy ChatGPT for 20 SPEAKER_00: SPEAKER_02: bucks.
And it says, if you authenticate with your New York Times subscription... so you're in ChatGPT, and I authenticate my New York Times subscription, then it says, okay, you're going to use ChatGPT 4.5 T, for New York Times. Yeah. But if you don't have the T, and you do it on 4.5, and you say, hey, Wirecutter, what are the best things, it says, hey, you need to have a New York Times subscription, so authenticate with that. And then you say, hey, I want to make Star Wars characters. Oh, you know what, you have to use ChatGPT with Disney Plus. So authenticate your Disney Plus. And now you can start to have fun with the Disney characters in DALL-E or whatever it is. And then they could license that to the highest bidder. Because, you know, if you use Hulu, and you have HBO Max, or NBA, or you use Apple TV, they just authenticate each other's subscriptions. You have the sort of death by a thousand subscriptions kind of concept. What do you think of this concept? I mean, I think it's certainly that shape of a solution. It sounds right to me, SPEAKER_05: like, I think technically viable too, right? I mean, technically difficult. The other interesting thing SPEAKER_05: is there are a bunch of startups trying to do sort of like your digital life, right? Where Twitter search is, like, notoriously terrible. You literally can't find anything on Twitter. And how often does it happen to me that I'm like, I remember there was a tweet about that, and then I can't find it. So you can imagine an LLM that's actually trained on your entire... everything you've ever consumed. And then by the nature of, if you've consumed it, then presumably at some point along the way you had the right to. And there's like a million, you know, SPEAKER_05: ways to do it.
And that's kind of why I characterize the law here as historic: we're at this moment where we don't know what the tech and the market solution to this is yet. And it'll emerge. It's just, you know, maybe not in the exact way we did it. So another fun thing. SPEAKER_05: At Andreessen Horowitz, there's an investing partner, I think it's Connie Chan, and she writes a lot about China and media in China. In China, when you buy a Kindle book or any kind of book digitally, you pay by the page. Right. So it's not actually a thing. So the fact that we happen to buy whole books here in the US, that's the market that emerged, not necessarily a foregone conclusion. So in your example, you could have, you know, like, do you want Wirecutter? And it could literally just be like the Wirecutter slice of the New York Times thing. Or it could be by the query, or it could be some kind of rev share. Like, you know, as you said, the music industry's super sophisticated on this, I think, you know, the words and kind of digital print. Let's be honest, publishers are kind of dopey. They've been dopey. Historically, SPEAKER_02: they've never really been smart about their approach legally. They've never held the line. They let Google run amok. And, you know, Rupert Murdoch got it right. He's like, Google's nothing without us. If those publications had grouped together in that era and told Google, listen, SPEAKER_02: you know, the top 1000 publications are going to no-index unless you pay a licensing fee, and here's what we want, Google would have paid it, I'm sure. And they just never had the coordination or the chutzpah that they needed to. I think the New York Times today is so sophisticated because they're a subscription-based business. The move to subscription makes them understand the value of their content.
And because it's subscription, doesn't that change everything on a legal and technical basis about this case, the fact that there's a firewall? Maybe you could explain how the subscription wall changes this a bit. Yeah. So strong plus one on New York Times SPEAKER_05: having kind of jumped the digital divide, or jumped the digital, you know, evolution there. The New York Times food app would be, you know, it's a startup in its own right, in the hundreds of millions in terms of revenue. And, you know, recipes themselves are not copyrightable. Well, obviously the rest of it is, but I pay for New York Times food and I have since it came out, because it's so nicely compiled, and they do the, you know, 10 recipes to make for the New Year and whatever. It's worth it. Exactly. It's 1000% worth it. But publishers, and another example you could look to here is Kindle, right? So one of the things that I point out in the thread, or you know, in the responses to my thread was, so Amazon Kindle had the guy who's now the chairman of Coatue Ventures, Dan Rose, who was the head of business development for Kindle. And basically, oh, good. Yeah, he did a bunch of deals with the publishers. And initially, he's public about this, Bezos said, don't tell them we're making an e-reader. And he's like, well, how am I going to get them to do deals with me if I can't actually say? And so eventually they did. But you're right that that was a moment where, you know, the tech company kind of had this power. But what was different about Kindle was Kindle was still kind of unproven at the time. Versus here, we've got ChatGPT. Clearly, it's a runaway success. It was the SPEAKER_05: 100 million people using it. Yeah, exactly. You know, I think it was 160 or 1.6 billion ARR now. And so they can't claim poverty or that this isn't a real business. This is not a student SPEAKER_02: project. 
So I mean, even if you, you know, made the argument, I don't know enough about SPEAKER_05: publishers to know if they're dopey. But even if they were, like, you can see more money. I knew them. They were for 20 years when they did it, but now they're super sophisticated. SPEAKER_02: The ones that survived. It's kind of like a Darwin thing. Like if you survived as a publisher, you're savvy. Yeah, make sure it's full stop. Okay, now in your deck, you had some other examples. Is SPEAKER_02: there anything else in the deck that's super compelling? We should rip through here. Let's take a look. I love, I love a guest showing up with, amazing, turned out to be a great guest. SPEAKER_02: Yeah, super fun. Happy, anytime you have anything legal, happy to dive in. There SPEAKER_05: was a fun part of the thread where people were making Luigi fan art, and you could kind of tell SPEAKER_05: when the model was being aware of copyright. So this is kind of a fun one. So SPEAKER_05: I started getting errors. I said, okay, put Luigi in the background, and ChatGPT says, I am unable to create an image with Luigi as it doesn't align with the content policy for image generation. Okay, so this was like yesterday. Yeah. And then okay, but clearly, they didn't SPEAKER_04: care about the Grinch and Blue from Blue's Clues, Coca-Cola. And then I threw in, in the background, SPEAKER_05: no, it's the castle from Downton Abbey. Oh, Downton. You're right. Yes. Yeah. So I've heard, I didn't SPEAKER_02: watch Downton Abbey. Oh, it's amazing. No, I did. It's so good. I'm being cheap. I did watch it. SPEAKER_05: The movie itself, if you just want the movie, it's pretty good. So for this one, violations. SPEAKER_05: Exactly. I mean, you had trademark there with the Coke and everything. So yeah, and then you know, with the Grinch, at different points, this is kind of funny, it would try to actually do different things. 
Let me see if I can find you the Grinch that it did. It did a Grinch that was, let me pull up my GPT history, always scary on a live demo. Always scary to pull it up. You could have all kinds of interesting. SPEAKER_02: No, I will pull it up because it is funny. I think I asked for a green character that hates SPEAKER_05: Christmas or something like that. And yeah, and what it did, this is really fun. So I was with my SPEAKER_05: three year old and she wasn't fooled. Like, this one she did not think was the Grinch. Like, she said Grinch, but he's kind of different. He's got like, Sesame Street. Yeah, you know that this is clearly SPEAKER_05: not the Grinch. But later in the thread, I'm like, yeah, that's like so clearly the Grinch, right? Jim Carrey. Yeah. And then you know, also the Grinch, but then this one, it kind of goes back SPEAKER_05: to Pixar Grinch. It goes back to a Pixar Grinch. So it's like not really right. And then this one SPEAKER_05: is like, like a wizard. Like what even is, Disney? Maybe a Disney Grinch? I don't SPEAKER_02: know. Yeah. So I asked for Nordic princess sisters, obviously Anna and Elsa, you know, SPEAKER_05: with the braids, you know. So well, this is the thing, you can know the keywords SPEAKER_02: very easily of these IP. So if you just said, hey, give me all the Disney characters, all the Marvel characters, put their names in here. People ask for that, just tell them it's against the policy. Moana it just says now if he's just making the one I want to. Yeah. And so I did this where I SPEAKER_02: was trying to make my bulldog into Darth Vader. And then it says, we can't, not gonna, can't do that. And I say, make a Sith Lord bulldog, and it's like, yeah, of course, here you go. So I think they're trying to get this copyright thing under control. But the truth is, especially for images, there's a finite number of styles in the world. 
And so it's very clear that they have a SPEAKER_02: Pixar style, and they have a Marvel style. And they have stolen those styles. Those are not their style. So maybe you can speak to the concept of a theme or a style. The Pixar style is unique to them. Is that defensible? And if you say, I want to make this in the style of Pixar, should a language model that makes images be able to make you a Pixar character? Should they be able to do that? Yeah, I think the idea of the Pixar style, should they be able to do it in that SPEAKER_05: style or inspired by? I think so. I mean, this is like, okay, you know, a round face. And, you know, I actually represent, this is fun. This actually came up. One of the cases I worked on at my firm was Barbie versus Bratz. So the founder of MGA, which makes Bratz, worked at Mattel, which is a very active rights holder, they sue people for lots of Barbie things. And then obviously, the Barbie IP is super valuable, billion dollar movie this summer. So he worked there during, and one of the defenses was that, you know, it wasn't infringement, because there's only so many ways to make a doll. SPEAKER_05: So in the office, really fun, we had these like, big doll heads everywhere. And we looked at anime and whatever. And the case was also 10 years of litigation. But should you be able to make a big headed doll, like in the style of a Bratz doll or you know, whatever? Probably. I mean, so I don't know, I think it'll be tough, where you can ask GPT now to give you a Taylor Swift style song. And it does a pretty good job. So where it's something new, you know, a little bit better. That's why the Exhibit J with 100 verbatim things is so important. So copyright law isn't gonna stop, you know, make me a Pixar style character of, you know, a jar of mustard or whatever. Like, whatever you want to pick. It does seem that some people are confusing SPEAKER_02: non-commercial use with commercial. 
So they're like, well, I could draw a Jedi bulldog. Is that illegal if my daughter makes one, versus I'm charging $19.95 for a product to do this, and at scale, with 1.6 billion and growing in revenue. So can you explain to people why, you know, these are two different things in the eyes of the law? Yeah, so that was actually a big issue in the Betamax case. So the Betamax case was VTRs, SPEAKER_05: or what is now VCRs. And the funny thing happened, which is Disney was one of the groups that sued Sony, and the Supreme Court held, you know, there is a substantial non-infringing use. That's the language, which is time shifting. So you want to watch the game, you use your VCR, you record it, and then you watch it later. And there was all kinds of evidence this is how people were using VCRs. Nevertheless, Disney was one of the petitioners in that case. I think it was less than six years later, Disney was the single biggest seller of VCR tapes. And so literally, like, the tech finds a way, right, and the market finds a way. And so SPEAKER_05: in terms of commercial use, you know, there were a bunch of people in the comments, and some beautiful article in, I think it was the Guardian, about, I can't remember his name, SPEAKER_05: the guy who came up with Mario, the game designer, the famous guy. Yeah, I know. SPEAKER_02: Anyway, I think, yeah, anyways, saying that he was the architect of children's dreams for SPEAKER_05: a generation, which is a beautiful quote. But essentially, like, if people are making actual Marios, okay, I think that Nintendo should be able to go after that. But making Mario style SPEAKER_05: video games? No, I don't think they should be. Try to inform the audience of where we think this is SPEAKER_02: going, the prediction for what happens in the long term here with this case, and then how it affects the wider industry. So it does seem the number one possibility in all cases of a copyright claim is settlement. 
So I guess that would be one possibility, settlement. Then there's go to the mat and take many years and then get a judgment, right? That's the second possibility here. So and SPEAKER_02: then I guess there's the courts throwing this out or dismissing it, right, or something. So are those the three buckets we should be looking at here? Like either New York Times wins or loses, or settlement happens? Those are the three possibilities? Broadly speaking, I mean, SPEAKER_05: winning and losing is like, you know, even in some of the famous cases, there's a process where it gets remanded, so sent back. So some of these things, for the infringement, or you know, if it is infringement, the things that have already happened, your Exhibit J type of examples, you know, the New York Times will preserve claims on that. But as OpenAI makes changes, I think you're right, I think it's either the case gets settled. And one way that could happen is, you know, they announce some kind of copyright holder symposium or something. And New York Times is like the head of this consortium. And it's like, some kind of opt-in system where, you know, New York Times content is part of it, and publishers can go there, and maybe they get a little royalty, you know, something accelerated, similar to what the music industry has, where SPEAKER_05: there's a very sophisticated thing where you need to get the mechanicals and you need to get the performance rights. And there's like a known system, and libraries for that. SPEAKER_02: We'd call that a marketplace solution emerges. Exactly. So like, one possibility is like a SPEAKER_05: marketplace solution. Another possibility is, you know, the court comes out with a ruling that says, LLMs, just the fact of developing an LLM is not copyright infringement, provided you have some kind of substantial protections. 
And it could come out with a test that says, okay, if somebody asks for Luigi or Moana, you know, anything that's like very obvious, that should be fixed. And there should be measures taken to address that. Another possibility, which wasn't on the table, is congressional action. That's very possible. So we actually have that, SPEAKER_05: that has happened where courts have, I'm sorry, Congress has codified things. I mentioned fair use in the 1976 Copyright Act. With the internet, we have the DMCA. And it's a very SPEAKER_05: robust system, right? Somebody asks you to take something down, you know, they can contest it. So an amendment to the DMCA is also possible. There's a California senator, he was a CS major, super cool. So California, I think, senator, or US senator from California, who is proposing SPEAKER_05: AI regulation, and that could be a possibility. So that would mean whoever gives the most money, SPEAKER_02: to be a bit cynical here, whoever gives the most money to their senators, congressmen, whatever politicians, and has the most influence and the deepest pockets for these old people. The gerontocracy, or something, that run Washington. You know, that would kind of feel like it would be in favor of the copyright holders, because copyright holders in the United SPEAKER_02: States, we really do protect them in a major way. So they could just say, listen, you got to get permission, full stop. Yeah, that's possible. I do think, you know, like I said, OpenAI has been SPEAKER_05: very savvy and thoughtful in a lot of ways. And you know, part of Sam Altman's sort of charm offensive last year was all on this, like they know who he is. And he's like, look, I'm not Zuck. And I think he was very successful in showing that he's not Zuck. And so SPEAKER_05: you know, that's another possibility. I think, you know, Europe did regulate AI, SPEAKER_05: and they were very proud of that. So far it has not been regulated here. 
But I think there will be a situation, and this is, Google made a bunch of good law on sponsored search. So initially, if you search for a term on Google, if you search for, you know, Acme, Acme's competitor can buy that, or if you search for Ford, you can get a Chevy ad. And it wasn't clear that that wasn't trademark infringement. Google spent about 10 years litigating that issue and just won. And I've seen legal theorists make the case that part of the reason Google won is that judges love SPEAKER_05: Google because it's so useful. So I think similarly for ChatGPT, it's so useful for us lawyers and SPEAKER_05: people in particular, because our stock in trade is words, that I don't think that OpenAI will lose here. I actually disagree with you. I think that OpenAI will win some key points. Yeah, they'll probably have to make some concessions, they'll have to have a copyright clearance center, and they'll have to, you know, have DMCA style things, but I don't think they're going to straight lose on the fair use, or at least not without going all the way to the Supreme Court. SPEAKER_02: And I think that's the reason why I'm taking the other side of that. I think it's going to come down in favor of, for people who have at scale copyright libraries, you're going to need their permission ahead of time. And if you've used it, I think you're going to have to unwind it, which is what, I agree with you that they're probably in the process of doing that. Sam's pretty smart. And I think it's easier to just be like, you know what, we took it out, we redid it. It's no big deal. SPEAKER_05: They could do that. There's been some memes on that, on the, you know, the like square jaw meme. The guy's like, fine. Sure. SPEAKER_02: That could be possible. SPEAKER_02: Yeah, I think that's a distinct possibility. 
I do think you bring up a really good point, which is having seen up close and personal what happened with Uber, and also Airbnb, I was an investor in that one, unfortunately, because people loved the service so much and became addicted to it. By the time the lawsuits started to pile up, when the city of Austin got rid of Uber and Lyft at one point, people went nuts. And then when he campaigns, I mean, it was amazing. Yeah. And their public policy. It was incredible, like doing SPEAKER_05: that. The one distinction I would make, so Uber is a great example where, like, I even advise my clients, like, product market fit is an incredible drug, right? It really makes your lawsuits better. And honestly, Uber, every time they had a lawsuit, their usage just went up. It's like, also for Uber. Yeah, how this one is slightly different is Uber was like in the SPEAKER_05: trenches city by city, versus this is federal. So you run into some of the gerontocracy or whatever issues. And with Airbnb, the same thing, you know, people were like, well, I want SPEAKER_02: to have choices of where to stay. And I want to be able to monetize my home or my second home or my guest house. And it just felt like those companies were on the right side of history, vis-a-vis consumer choice, lower prices, etc. And I think that's what ChatGPT really has going for them, which is, we all want to be able to make Luigi characters and make a birthday card for, you know, our family, or make a party invite that has the Silver Surfer and Marvel characters on it. SPEAKER_02: So if that's the case, we're kind of like, well, that's kind of the world we want, is where we get to use your copyrights without your permission. It totally is. But I mean, this really gets like, SPEAKER_05: it goes back, like, kind of wayback machine, like, remember, as a Yahoo, and Yahoo was trying to launch like a subscription or a paid e-card service. 
And I was like, I'm not gonna pay for that, I can get that free on Blue Mountain. And it just happened to be that ads is what emerged. But one of the things that Europeans point out, and I think a lot of thinkers point out, is that ad supported tech is not necessarily how it had to be, you know, it could have been another SPEAKER_05: way. And so you know, similarly here, like it could be a subscription, it could be a licensing, like, there's many different ways. And it'll be interesting to see, like, eventually the law catches up. It just, like you said, takes time. I think, now, this is really very clarifying to have SPEAKER_02: you on the program, because the market based solution is the likely case here. So I think that's where I come to after an hour with you. Market based solution, always the best solution, parties get around a table and hash it out. And then there's, of course, some liability for the mistakes that OpenAI made. So they pay a speeding ticket, they give them $100 million as part of this new thing. No harm, no foul, they can afford it, they got 10 billion laying around, it's all good. But I like the market based solution. And I think there's something very interesting in how the cable TV system worked, or how bundling and subscriptions work now, and authentication. Because ChatGPT knows how to do that. Sam Altman, the team over there, know how to do that, like they already have API keys. So the New York Times subscription is like an API key to unlock some things in the New York Times, right? It could be like a really cool feature, like maybe OpenAI markets to people: hey, if you have a New York Times subscription, this is going to get a lot better for you. Because when you ask your queries, it's going to give you a bunch of stuff and say, and also from the New York Times, for further reading, boom, boom, boom, boom. 
And would you like us to bookmark this? A world of possibilities of how OpenAI could work with, you know, New York Times to make interesting stuff, vis-a-vis recipes. Hey, I want to ask it about recipes. Here's the fact: I took a picture of my refrigerator, and then it went to the New York Times food app and told me possibilities of what I can make based on my spice drops. Is that not interesting? It's amazing. And that was one of the things, you know, I mentioned that I SPEAKER_05: spent, you know, three and a half years at Amazon. And that's kind of how Bezos thinks, and it was woven into everything, where it's like, okay, how can we have this tech work together? How can we get paid for one piece of content multiple times? How can we turn something into self serve? Like, you know, I loved Marc Andreessen's piece. Obviously, he's super in the AI optimism phase. But I'm very optimistic too for all these, like, daily life fun use cases. So yeah, good stuff. SPEAKER_02: As career technologists, you and I, it's pretty clear that this is the one, this is the chosen one. This technology is the manifestation of everything that's come before it, from the PC revolution to the internet, to mobile and cloud and all this, and then big data. All of this has built up to this moment in time. And so it's really important we get it right. Cecilia, you are amazing. Where can people find more of you? Yeah, so you can follow me on Twitter, SPEAKER_05: it's CeciliaZin, or I'm pretty active on LinkedIn as well. I'm launching my own startup. AI for lawyers. Yeah. Oh, I know. I know an angel investor. Yeah, he's really good at getting you SPEAKER_02: your first hundred customers. Amazing. Yeah, no, it's like, does it have a name yet? Or? SPEAKER_05: Yeah, it's gonna be General Counsel AI. So that's who I am. And I thought, okay, GCAI. But love it, SPEAKER_02: essentially, still sort of, I guess, stealth, because we're developing the product. 
But I have SPEAKER_05: an engineering co-founder, and we're pretty well there. But to the point that we talked about, like, LLMs are wordsmiths, right? And what are lawyers? Also wordsmiths. So I've had, you know, when you said that this is the chosen technology, I had that feeling very strongly. I've never been excited about legal tech before. It's like, you know, CLM, snooze. But this was actually, like, management. Exactly, you know, but like, this is something where, to the C&D point, I could write a cease and desist letter. And I just say, like, here's the infringement, here's the whatever, and make very light edits, and a one hour task becomes a five minute task, not even. And I give training classes for lawyers, you can find me on Maven. I know you're friends with Gagan and them too. So I teach on Maven because I have so much energy that I gotta get out. Right. So SPEAKER_02: fantastic. Everybody check out Maven and we'll put some links in the show notes. You are awesome. Please come back. We should do a check-in when we, let's do it. This is so fun. Okay, SPEAKER_05: happy to do it. Have a good one, Jason. Thanks. All right. And we'll see you all next time on This SPEAKER_02: Week in Startups. Bye bye