SPEAKERS Professor Chris Hoofnagle, Eric Ahern, Isabel Jones
Podcast Transcript:
[Eric] 00:12
Welcome to the BTLJ podcast. I’m your host, Eric Ahern.
[Isabel] 00:17
And I’m your host, Isabel Jones. In today’s episode, we’ll be diving into the fascinating world of one of the most advanced machine learning tools out there: ChatGPT. ChatGPT is a natural language processing tool developed by the company OpenAI and released in November 2022. Within about two months of its release to the public, the tool reached over 100 million active users.
[Eric] 00:42
And the world’s fascination with ChatGPT isn’t surprising. ChatGPT responds to users’ prompts in a conversational way and has the ability to engage in dialogue, admit to errors in previous responses, and deny or dismiss users’ inappropriate demands.
[Isabel] 00:59
You can use it to write essays or songs, draft an email to your client, or even respond to someone on a dating app. It even wrote me a funny limerick about the Supreme Court:
“There once were nine judges supreme
whose robes were a legal dream.
But when they ruled five to four,
they’d leave you wanting more,
and you’d wonder if they forgot to scheme.”
[Eric] 01:19
Despite all of ChatGPT’s exciting uses, it also has some serious limitations. For example, ChatGPT is not always right. It can provide users with “plausible sounding but incorrect or nonsensical answers.” It also raises ethical issues about exacerbating systemic inequities and biases and spreading misinformation.
[Isabel] 01:40
Join us as we sit down with today’s expert guest, Berkeley Law Professor Chris Hoofnagle, to discuss ChatGPT. We’ll explore the potential impacts of ChatGPT not only on everyday life, but also on the legal industry, education, intellectual property law, geopolitics, and more.
[Eric] 02:00
This is an episode you won’t want to miss.
[Eric] 02:34
Welcome, Chris. Thank you so much for sitting down with Isabel and me today to discuss ChatGPT. We are really excited to have you on the show and to engage in this conversation with you.
[Professor Hoofnagle] 02:45
Thank you for having me.
[Eric] 02:48
Absolutely. Thanks for being here. So just to start things off, we’re hoping we can go over the foundational background of what ChatGPT is, why it’s important and why it’s causing such a stir in the news these days.
[Professor Hoofnagle] 03:03
ChatGPT is the newest iteration of a machine learning technology that can generate text. So this is one of a class of technologies that can create things. There are several related technologies. For instance, there are machine learning technologies that can create images that are original. ChatGPT creates text that has aspects of originality, and it can be quite compelling.
[Eric] 03:35
Great, thank you for providing that quick summary. My next question is: what would you say the purpose of ChatGPT and other similar machine learning language models is?
[Professor Hoofnagle] 03:45
There’s going to be a wealth of applications of this type of technology. You can imagine any business process that requires a lot of human interaction, such as chat, like customer service, and so on. ChatGPT will be perfect for answering probably most questions that a consumer has about your product. So there’ll be a fantastic number of interfaces where a computer can answer your question, rather than bothering a person. And it can be everything from customer service, to how do I get from here to there, to I need advice on such and such. There’ll be just millions and millions of different ways that the technology could interact with people and provide information that is useful.
[Eric]
We’ve seen that ChatGPT is the fastest growing consumer application in history. Just in the past few months since it launched, it’s already gained over 100 million monthly active users, which is quite an incredible accomplishment, of course. In your opinion, what would you say is drawing people to the application in droves like this?
[Professor Hoofnagle]
I think people perceive a Google moment with ChatGPT. Now, if you’re as old as I am, you’ll remember what it was like to search the internet in the 1990s. And it was just awful. Most of the results were just spammy advertising and so on. And then Google came along, and its clean interface and the algorithm it used provided this wonderful clarity, and you really felt like you were getting the best results. Since then, search has obviously declined. And many of the things we are looking for are botched up with spam and wordy explanations. It’s all about delivering more and more ads, and so on. So ChatGPT, I think, is a Google moment. I think people are looking at it and saying, “This could be the next Google. This could change everything. I’ll no longer be getting a bunch of junk responses. Instead, I’m gonna get an answer to the things I’m interested in.” And it sounds trustworthy. So it’s not just the vision of, hey, this is an answer instead of a search. It’s an answer that sounds authoritative.
[Eric] 06:13
Okay, well, speaking of search engines, we’ve seen that Microsoft has begun integrating ChatGPT into their new version of Bing, and Google will be rolling out their version, Bard, sometime soon. Do you think that search engines as we know them are an endangered species?
[Professor Hoofnagle] 06:33
Absolutely. I think, you know, Eric Schmidt said years ago that people don’t actually want search engines; they want something else, they want an answer engine. And the way we get the answers is by doing these searches, and then we look at the results. And sometimes the results are good, and sometimes they’re not, and so on. But most of us just want to know: is my plane on time? What temperature should my oven be? And so on. And these forms of chat responses look much more like the answers people are going to want to the various things they’re curious about. So if done well, if well curated, and if the quality is there, it’ll be so much easier to ask, well, who won the Super Bowl in 2002, and instead of going to some weird website with lots of stuff popping up and whatnot, maybe this just responds and gives you the answer. But there’ll be a lot of nuance and trickiness in that as well. This technology is obviously not going to always do that well.
[Isabel] 07:38
Going off what you said about people looking for answers instead of a search engine, how do you think AI models like ChatGPT will change the way we interact with technology and with each other?
[Professor Hoofnagle] 07:50
Well, just like with search, we speak to our computer in a certain way so that it will understand us when we make a search. And the same is true when we speak to the various devices out there like the Amazon Echo. Essentially, the device trains us to speak in a certain way. And I think that will happen, too, with ChatGPT. It will train us to ask questions in a certain way, so that we get the answers we want out of the system. But I think that the changes could be quite profound. Every business is going to want something like ChatGPT. And it’s not just for interacting with customers in a one-on-one way when problems come up. If you look under the hood at OpenAI’s other algorithms, you’ll see that they provide all sorts of utility for businesses and decisionmakers involving summarization. So, for instance, I just read all my teaching reviews; it’s like 100 pages long. Well, OpenAI offers a programming interface that lets you upload a text, and it will just write a summary of it. So you can imagine every business is going to think, “Wouldn’t it be nice if I could just summarize my customer reviews? Wouldn’t it be nice if I could use a system like this to filter product reviews, find the good ones, trash the bad ones, consolidate?” So that you’re not reading 1,000 reviews, you’re only reading the ones that contribute something new. So there’s the meaning making. On one hand, computing has been informed greatly by sensing. And now we’re at this moment where computers are making more sense of things. So it’s not just sensing, it’s sense-making of the world that’s going to be really compelling with ChatGPT derivatives.
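To make the summarization interface described above concrete, here is a minimal sketch in Python. It assumes the pre-1.0 "openai" package and an API key in the OPENAI_API_KEY environment variable; the model name, prompt wording, and sample reviews are illustrative assumptions, not a description of what OpenAI's production systems actually do.

```python
# A minimal sketch of the summarization workflow described above, assuming the
# pre-1.0 "openai" Python package and an API key set in OPENAI_API_KEY.
# The model name and prompt wording are illustrative choices.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def summarize(text: str, max_tokens: int = 200) -> str:
    """Ask a completion model for a short summary of the supplied text."""
    response = openai.Completion.create(
        model="text-davinci-003",  # illustrative model choice
        prompt=f"Summarize the following reviews in a few sentences:\n\n{text}",
        max_tokens=max_tokens,
        temperature=0.2,  # keep the summary relatively focused
    )
    return response["choices"][0]["text"].strip()

# Example: condense a pile of customer (or teaching) reviews into one paragraph.
reviews = "\n".join([
    "The seminar was engaging but the reading load was heavy.",
    "Great discussions; I wish there were more problem sets.",
    "Office hours were helpful, lectures sometimes ran long.",
])
print(summarize(reviews))
```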
[Isabel] 09:56
So with this sense-making capacity that these technologies have, what are some of the dangers or limitations of sensemaking? Especially considering, I don’t know if you saw the news about Google’s Bard, but there was already an inaccuracy in its launch ad. So what are some of the dangers when AI is wrong? And how often does this happen?
[Professor Hoofnagle] 10:21
ChatGPT is a machine learning technology. You’ll notice that I tend not to talk about artificial intelligence. I talk about machine learning, because what’s happening under the hood here is that machine learning technologies are statistical technologies, and those statistics have to come from somewhere. So where are the statistics in ChatGPT? Well, for large language model systems, what’s happened is that companies have applied statistical analysis to massive compendia of text. And that text might be internet text, like literally comments on Reddit, or Wikipedia articles. Or it could be books. Google, for instance, has a real ability to look at most of the books in the world, through the HathiTrust, and so on. So many of the risks come down to the underlying training data set and the color of that underlying data. And anything you train on internet data is going to skew in a certain way. And as a result, the type of information you get out of ChatGPT or another ML model is going to have some of the pathologies of the internet. So what are those pathologies? The internet has a political valence: it’s mostly used by more affluent people, the original users of the internet are pretty literate, and so on. All these things are going to bias what ChatGPT produces. So there will definitely be a politics underlying ChatGPT responses, and that’s going to be the politics of computer users, mostly men, mostly kind of libertarian-ish types, and so on. And I think eventually people will demonstrate that skew.
[Isabel] 12:30
And I’m curious if you can maybe explain the difference between machine learning and AI for our listeners a little bit more, just so they can get a gist of how ChatGPT actually functions?
[Professor Hoofnagle] 12:42
Well, I find AI, artificial intelligence, a confusing term because it means vastly different things to different people. And I see people using AI to describe technologies like calculators. AI that is general would look like the types of technologies we see in Star Trek, something like Commander Data: a device, in the case of Commander Data a robot, that can essentially do everything better than people. He’s faster than people. He can read faster than you. He can do math better than you. Everything, he can do better than you. That is a very elusive goal; the Commander Datas of the world are a very elusive goal. So instead, what has been invented is a series of what people refer to as narrow AIs. And this is where the calculator becomes an AI. The calculator is an AI because the calculator can do math better than any human. So in its narrow sense of doing math, a calculator is an artificial intelligence. I refer to machine learning technologies because it helps us see that what’s happening under the hood is statistical analysis. And that statistical analysis always fits a certain type of problem. There’s no kind of universal tool when we talk about machine learning. There’s the hammer, there’s the screwdriver, and so on: all these different statistical approaches to solving problems. And some of them are better than others. And some of them do things differently than others. So I don’t use the term AI in part because we don’t have anything that’s generalist; we have all these narrow tools, some of which outperform humans in some circumstances.
[Isabel] 14:36
Yeah, I think that’s a really helpful way of thinking about it too, especially bringing up statistics. And I want to dive a little deeper into what you said earlier about how machine learning can end up reflecting a skewed portion of the population, or whoever ends up using the internet more often. Two examples that a technologist discussed on The Daily, which is a podcast from The New York Times, kind of stand out to me.[1] The first one was that he asked a question about what caused the Civil War. And I think most historians, and myself included, would say slavery. However, if ChatGPT were to look at more conservative media, and that data were more heavily represented, that answer might change. And a second type of bias I think was reflected in the machine learning was when somebody asked ChatGPT to write a love story, and ChatGPT came up with a story about a heterosexual couple named Jack and Jill, which reflects heteronormative narratives about love.[2] So I guess my question to you is, how do we ensure that these tools that use machine learning do not end up exacerbating systemic inequities and biases in our society? And is there anything we can do about that?
[Professor Hoofnagle] 16:09
Well, to continue on your example surrounding the Civil War, part of what’s happening there is that people who write with certainty and with declaration are going to color the underlying statistics more. So those who believe, for instance, that the Civil War was about states’ rights are likely to say that in a very direct way. Whereas historians who are looking at all the different factors, some of which might be state autonomy, but most of which surrounded the economics of slavery, might make claims that are more qualified and more nuanced. So I think just from a baseline, the people out there who are making strong claims and certain claims are going to speak more loudly in the statistical models of ChatGPT and these other environments. I’m sorry, this is a really long answer. But when I learned computing in middle school, there was a sign on the wall, and it said, “garbage in equals garbage out.” And we still haven’t learned this. Many companies out there think all we have to do is scrape up all the data, and the bad data doesn’t matter because truth will shine through or something like that. That’s just not right. There are situations where incorrect data help you recognize correct data. I can show you examples of that. For instance, in doing investigations, knowing that data are false can help you understand what data are true. But these are situations where there is a human analyst who is saying, because I know so-and-so is false, we can suppose so-and-so is true. Or because we know so-and-so changed, we know this other fact. That is very highly tailored analysis. And it’s the type of thing that happens in investigation, not in large statistical models that are trying to guess literally what word is likely to follow another word, and so on. I’m sorry that that’s a little confusing. But my point here is that people who speak with certainty, including those people who are speaking strategically to convince others of a truth, are going to speak more loudly in these systems. And we’re not going to be able to refine all the trash out of these very large models. I suspect that the best models are going to emerge from sets that are trained on higher quality data. And therein lies a problem with high quality data: where are those high quality data? They tend to be locked up in books, which are protected by copyright, and so on. But the data that are available to us are all those data on, you know, the web, with all its warts.
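To make the "guessing what word is likely to follow another word" point concrete, here is a toy sketch of a bigram model in Python. It is not how ChatGPT works internally (ChatGPT uses a large neural network trained on vastly more data), but it illustrates the same dependence on training statistics: whatever is said most often, and most confidently, in the corpus comes straight back out. The tiny corpus below is made up for illustration.

```python
# Toy illustration of the "statistics have to come from somewhere" point:
# a bigram model that predicts the next word purely from counts in its
# training text. Whatever skew is in the corpus comes straight back out.
import random
from collections import defaultdict, Counter

def train_bigrams(corpus: str) -> dict:
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def next_word(counts: dict, word: str) -> str:
    """Sample a likely next word according to the training counts."""
    options = counts.get(word.lower())
    if not options:
        return "<unknown>"
    words, weights = zip(*options.items())
    return random.choices(words, weights=weights)[0]

# The model's "beliefs" are just the loudest voices in its training data.
corpus = ("the war was about states rights " * 3 +
          "the war was about slavery " * 1)
model = train_bigrams(corpus)
print(next_word(model, "about"))  # usually "states": certainty repeated loudly wins
```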
[Isabel] 19:35
Kind of going off your premise of trash in, trash out, which I’ve also heard about in a lot of my data classes, I’m curious to know: with fake news and a lot of manipulation of information that is readily available, as you discussed, does that make machine learning vulnerable to being manipulated for nefarious reasons?
[Professor Hoofnagle] 20:04
Yeah, absolutely. And a lot of what’s happening in the Department of Defense right now is a type of study known as adversarial machine learning, where one deals with an enemy who has machine learning by poisoning something. It could be poisoning their underlying dataset, or poisoning their model, such that their systems cannot function when they’re called upon to operate. So it might be feeding it images so the adversary’s system cannot recognize a gun, or so that it mistakes a gun for something that is not threatening. So there’s a huge amount of research looking exactly at this. And the focus is: can I poison your data? Can I poison your model and make you believe things that are wrong? So absolutely. You know, if the inputs include crazy talk from disinformation people, that’s going to ultimately get reflected in ChatGPT. I’ll just give an example. I don’t mean to pick on Microsoft, because I think they did the right thing in the end. But Microsoft years ago released a chatbot called Tay. And it didn’t take long for people on the internet to start telling Tay racist things. Tay started to believe and learn these things. And then it started denying things like the Holocaust. And so that garbage in eventually becomes garbage out. And we’re going to need people to think about what we’re putting into these models, if we don’t want to have a ChatGPT that ultimately concludes the Holocaust didn’t happen, and the like.
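As a rough illustration of the data-poisoning idea, here is a toy sketch using scikit-learn; the library choice, the tiny sentiment dataset, and the label-flipping attack are assumptions made for illustration, not a description of the actual research programs described above.

```python
# Toy sketch of training-data poisoning: an "attacker" injects mislabeled
# copies of negative phrases into a sentiment dataset, and a simple classifier
# retrained on the poisoned data starts getting obvious cases wrong.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["great product", "works perfectly", "love it",
         "terrible product", "broke immediately", "hate it"]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

def train(texts, labels):
    """Fit a bag-of-words Naive Bayes classifier on the given examples."""
    vec = CountVectorizer().fit(texts)
    clf = MultinomialNB().fit(vec.transform(texts), labels)
    return vec, clf

# Clean model: classifies an obviously negative review correctly.
vec, clf = train(texts, labels)
print(clf.predict(vec.transform(["terrible, broke immediately"])))  # -> [0]

# Poisoned model: mislabeled copies of negative phrases flip the prediction.
poisoned_texts = texts + ["terrible product", "broke immediately"] * 3
poisoned_labels = labels + [1, 1] * 3
vec_p, clf_p = train(poisoned_texts, poisoned_labels)
print(clf_p.predict(vec_p.transform(["terrible, broke immediately"])))  # likely [1]
```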
[Eric] 21:52
So what’s preventing the current iteration of ChatGPT from going down the same path that Tay did? Does the training data just have a higher quality now than what was given to Tay?
[Professor Hoofnagle] 22:04
Well, for one, Tay learned from interactions with people. So people told Tay things that were anti-Semitic, and then Tay began to learn these things. I don’t know what ultimately trained ChatGPT, and we’re at ChatGPT 3.5; there’ll be newer versions, there’ll be more modern versions coming out. But it’s clear that the people at OpenAI did some front-end filtering on the questions you ask it. You know, it’s not perfect. You can’t govern every edge case. But if you give ChatGPT an offensive question, it tends to respond by saying, “Listen, I’m just not going to help you with that.” So I think OpenAI realized that with any tool like this that you put on the internet, there are going to be trolls who get it to say things that are awful. And so they definitely have a front-end approach to that, as in filtering out requests that are obnoxious. They presumably have some back-end work on it, too. But I don’t know what that is.
[Isabel] 23:13
What are some steps you think companies or regulators could take to ensure responsible use and deployment of machine learning models like ChatGPT?
[Professor Hoofnagle] 23:24
How companies respond is going to depend a great deal on the use cases and where they sit in the ecosystem. You can bet that Microsoft and Google are going to do a lot to prevent nefarious uses of the technology, because their brand is going to depend on it. I’d worry more about situations where your brand doesn’t depend on it. An example might be technologies that are used for filtering people who are applying to your company, where the feedback loop might be entirely opaque. So-and-so gets an interview, but so-and-so doesn’t. That fact is opaque. And even if it comes to light, you know, how is it going to stick to your brand? So a lot of it, I think, deals with incentives. A lot of it also deals with risk. What we’ve done in the United States is we’ve generally not allowed people to use automated decisionmaking for the most important questions in people’s lives. So for instance, in America we’re not allowed to use automated computer matching and analysis to deny people’s benefits. If, for instance, the government thinks someone is cheating on their benefits, that person gets a hearing. So ultimately, there are human eyes on the situation. And there’s a process. Now the question is, well, what’s the benefit of automation if I can’t actually automate, and if I can’t actually act in the world? If I have to do human review of all these decisions, why even have this? And that’s where we’re going to have to make decisions about the threshold issues where people should get human review because the issue is just too important. And here, I mean, I could go on forever, but people’s ideas about what is important enough for human review really are changing. So you don’t get any human review when, for instance, you’re denied a credit card. That happens entirely automatically. Now, on the back end, you can have an appeal and ask someone to reevaluate your credit score, and so on. But front to back, I mean, the whole process, we grant people credit cards or deny them with an entirely automated process. We do the same to evaluate credit scores and to issue credit reports. And I think as a society, we have to decide: when do we want to interrupt this process and give the individual some ability to interrogate or challenge it? And when is it just not important? I think there are huge cultural issues here. What I’m sensing from Europeans is that everything is important, and there has to be a kind of rights regime for even the most minor question, even advertising. Like, if I see an ad that’s different than yours, the European human rights regime wants there to be some type of mechanism for explanation. But most Americans, I think, would say, you’ve got a different ad than I did; do we really want to subject that to a rights review?
[Isabel]
Well, in addition to the credit card application process, I was just thinking of other application processes in our society where it would be really unsettling to know that an algorithm was sort of calling the shots behind the scenes, you know, for example, the college application process, or applying for a job. It is pretty wild to think that an algorithm could be determining, you know, these really monumental decisions in people’s lives.
[Professor Hoofnagle]
Yeah, absolutely. And so the area where the US has focused is background checks.
So the places where you have rights to challenge even the private sector, not the government, using systems to make decisions about you tend to be around the use of a credit report. So if you’re denied tenancy, like, you know, you can’t get a rental apartment, denied a job, denied a bank account, and so on. That’s where we have said people should have some rights: they should be able to correct errors, they should be able to question what’s happening. But as computer decisionmaking begins to dominate more and more parts of our lives, we’re going to have to make decisions about where it matters, and where we’re willing for there to be errors and unfairness.
[Isabel] 28:02
So do you think ChatGPT could ever replace lawyers, when we’re talking about technology replacing human interactions? I think a poignant example of this is that on February 22 of this year, Joshua Browder, who is the CEO of DoNotPay, was planning to use ChatGPT to defend a client in traffic court here in California. It would have been the first ever AI-powered legal defense, where the defendant would wear special glasses with recording and dictation capabilities that connected to AI text generators. And these machine learning text generators would dictate legal responses to court questions in the client’s ear. While Browder backed off the idea after a state bar threatened to refer him to the district attorney’s office for criminal prosecution, do you think this is a future possibility? And if so, what implications might this have for the legal industry, or even for making legal services more accessible and affordable?
[Professor Hoofnagle] 29:07
So that’s a great question. And one of the reasons why I teach Python programming to law students is that I think there’s going to be a revolution in your career timeline, maybe not mine, where computing is used much more to perform legal tasks. Does it mean an end to lawyers? I don’t think so. But what it could mean is efficiencies, new ways of asking questions, new ways of searching for information that we currently do very poorly. I mean, if you think about the different law firms out there that have basically warehouses full of lawyers doing document review, cases that involve millions of documents, millions of emails that we’re supposed to put eyes on, it’s just obvious that all that data is going to move into a system like OpenAI’s, which could at least do the first review: looking for attorney-client information, or looking for information that’s probative or exculpatory, or whatever is necessary. We could also see systems like ChatGPT being used for the first draft of this, that, or the other. A lot of what we do as lawyers is not, you know, actually litigating. It’s sending people letters and communicating with people. And ChatGPT does a nice job with a short, to-the-point letter. And so maybe it becomes a first draft. I’m still optimistic about lawyering as a profession, because it’s still the case that it is our job to say: my case doesn’t fit the precedent, my case is special, and therefore you’re not going to apply this rule I don’t like, you’re going to apply a different rule, or you’re going to apply a new rule that I like. And that can only be done by a person. That involves human persuasion. And it fundamentally involves the liberal sense of justice, that different situations demand different law. Deterministic systems will not be able to accommodate that. So the guy who’s handling the parking ticket or the speeding ticket might be operating in a deterministic system, where there’s law, there are rules that are pretty much well-founded, and that makes sense. But we all live in a world where there’s much more chaos. And there are new technologies and new contexts all the time that give some lawyers opportunities to say, “No, my client’s situation is different.”
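A toy sketch of the "first pass" document-review idea mentioned above, using a simple keyword screen rather than any particular vendor's tool or an LLM; the document IDs and privilege keywords are made up for illustration.

```python
# Toy sketch of a first-pass document-review screen: flag documents that look
# like they might contain attorney-client material so a human reviewer sees
# them first. Real e-discovery tools (or an LLM-based pass) would be far more
# sophisticated; the keywords and documents here are invented for illustration.
PRIVILEGE_HINTS = ("attorney-client", "privileged", "legal advice", "outside counsel")

def flag_for_review(documents: dict[str, str]) -> list[str]:
    """Return the IDs of documents whose text contains a privilege hint."""
    flagged = []
    for doc_id, text in documents.items():
        lowered = text.lower()
        if any(hint in lowered for hint in PRIVILEGE_HINTS):
            flagged.append(doc_id)
    return flagged

docs = {
    "email-001": "Per our call with outside counsel, please hold all drafts.",
    "email-002": "Lunch at noon?",
}
print(flag_for_review(docs))  # -> ['email-001']
```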
[Isabel] 31:54
I totally agree. And I think that’s another reason why we have judicial discretion: because we realize that sometimes people are deserving of different outcomes for things that are beyond what the black letter law holds. Which is also why, when you’re thinking of criminal law, there are sentencing guidelines; the sentence falls within a range just so you can adjust for people’s different circumstances. Kind of going off of the legal industry and the changes there, I also want to turn to legal education and how ChatGPT can be exciting, but also a little frightening, in its potential to be used in the classroom. So I want to start with the potential benefits. And I’m curious, as a professor yourself, how might professors use ChatGPT to improve classroom instruction?
[Professor Hoofnagle] 32:48
Well, one of the prime benefits of ChatGPT is that it’s really good at developing questions. I used it recently when I had to interview a government official. Just for fun, I typed in the government official’s title, and I said, develop, you know, interview questions for this official. It developed eight questions. I used like three of them. Okay, they were milquetoast questions, not very insightful questions. But you know, in most interviews you start out with some softballs, and ChatGPT saved me the minutes of my life it would have taken to write those softballs. So I think one use is question generation. The other might actually be de-biasing. The way I think about a case might be heavily biased based on my training, and so on. We could imagine using a system like GPT to develop questions about material from different frameworks or from different assumptions. So I think there are tremendous uses in the classroom. I think there are pedagogical uses. I mean, for instance, I haven’t done this yet, but I anticipate there’ll be a day, I teach torts, where I put a torts question into ChatGPT and show the students the answer that comes out of ChatGPT. And then we can get together and talk about it. What’s good about this answer? What’s bad about this answer? How do you add value as a lawyer to this answer, and so on. And so I think it has tremendous pedagogical purposes. Sure, some people are going to cheat. But those people will probably never learn the skills they need to be a good lawyer.
[Eric] 34:34
I wonder if there are different legal subjects or areas of the law where you think ChatGPT would be more useful or helpful or skillful, maybe that’s the right word. You know, for example, at the beginning of the semester in my intellectual property class, on one of the first days of the course, we spent the entire hour-and-ten-minute period going through a problem from the textbook that involved classic copyright issues and some patent issues. You know, we were supposed to advise this theoretical small business owner on how she should get a business off the ground and avoid any sort of infringement. And we came up with a pretty good answer between the 35 to 40 law students in the class. But at the beginning of the next class, our professor typed the same question into ChatGPT that we all answered, and within, I don’t know, 45 seconds, however long it took for the chatbot to produce an answer, it gave a response that was pretty similar to the one that we came up with between all of our collective minds working together. So that was, admittedly, a little disheartening. But, you know, on the other end of the spectrum, when I was meeting a few weeks ago with the civil procedure professor that I had last semester, we were talking about ChatGPT. And he told me that he was playing around with it and wanted to see how it would do answering a part of our final exam. So he gave it the policy portion of the final exam. And he said ChatGPT did terribly on the test. So that was, I guess, a win for humans there. You know, the robot can’t do everything. But my question is, would you say that torts is a topic where ChatGPT, you know, could potentially excel? Or are there any subjects of the law where you think ChatGPT would be especially proficient?
[Professor Hoofnagle] 36:34
So this relates back to earlier questions about what ChatGPT and related technologies are likely to do and what they’re not likely to do. They’re likely to recognize existing patterns. They’re likely to understand deterministic processes. So if you have something like the Copyright Act, where you can say the work was created before this date or after this date, and was registered or not registered, there is a series of deterministic decisions that can be made. Similarly, in torts, the fact that it has evolved through the common law over hundreds of years gives a model more ability to learn it than maybe other topics. And there’s an interesting professor at Chicago-Kent who did a study where he took the questions from the Multistate Bar Exam and put them into ChatGPT. And he concludes that it gets about 50%. Interestingly, it does better in evidence and torts, which I would argue might be more deterministic. So yeah, those basic questions are relatively easy to answer. And, frankly, you’re not going to make a lot of money as a lawyer answering the easy questions. The way you’re going to add value is by dealing with things that are new and that there is no record of: the new legal risks, the opponents that might show up who didn’t exist before, and so on. They’re not going to be in the model, so there’s no way to predict them.
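As a deliberately oversimplified sketch of the "deterministic decisions" point about the Copyright Act, here is a rule chain keyed to a U.S. work's publication year, evaluated as of 2023. Real copyright analysis has many more branches (notice, renewal, restoration, unpublished works, works made for hire), so treat this as illustrative only.

```python
# A deliberately oversimplified rule chain for U.S. published works, as of 2023.
# Real analysis has many more branches; this only shows the deterministic shape
# of the inquiry the professor describes.
def us_copyright_status_2023(year_published: int) -> str:
    if year_published < 1928:
        return "public domain (95-year term has expired)"
    if year_published <= 1963:
        return "depends on whether the copyright was renewed"
    if year_published <= 1977:
        return "protected (renewal was made automatic for these works)"
    return "protected (term generally runs for the author's life plus 70 years)"

for year in (1910, 1950, 1970, 2001):
    print(year, "->", us_copyright_status_2023(year))
```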
[Isabel] 38:17
I think that’s a really interesting point, too, because I saw a similar experiment performed by a professor at the University of Minnesota Law School, who ran one of his tests through ChatGPT.[3] And he said that it did horribly on the issue-spotter questions, which, for people who are not in law school, are fact patterns where you have to find the different legal issues and respond to them. And that’s kind of the bread and butter, I think, of a law school test, especially in a doctrinal class. So I’m curious if you have other examples of things that law students or lawyers are better suited to address than a ChatGPT type of technology?
[Professor Hoofnagle] 39:08
Well, especially around risk lawyering, that is, advising clients in situations where someone might have a claim against you, or you might have a client who is engaging in something that could attract a suit, whether they’re actually going to attract a lawsuit or not. So for instance, in the area where I practice, which is startups and venture capital, there’s a very strong bias against litigating. And in my little field, when people litigate, it’s because they’ve got nothing left to lose. Basically, it’s a signal they’re a loser. Okay? Like, you couldn’t invent the technology, you couldn’t bring it to market, and so on, so all you’ve got is a lawsuit. And so there’s a strong bias toward the idea that what you really should do is just start a new company and get a new idea. But what this means is that for any client in my universe, all conversations relate to risk. And these are deep game theory explorations of who your competitors are, what regulators are concerned about, how people might misunderstand your product and injure themselves with it or not. These factors are quite complex, and they’re not going to be in any statistical model. There’s also just a lot of learning going on in law firms, where there are events, as in, people are injured, news articles come out, legal decisions come out, regulators make speeches, and so on. And you have to use your brain to say, hey, wait a minute, there’s something new here that my client hasn’t ever experienced before. We need to think about it, write the client alert, and so on. ChatGPT could help you write that client alert. But that kind of signaling is not something that computers are going to tease out.
[Eric] 41:18
We’d like to transition now to a quick conversation about the intersection of AI, or machine learning, and copyright law. So if you wouldn’t mind, I’m hoping you can give us a brief overview of where we are now with this, where we’re headed, and what the main concerns are with generative ML (machine learning) models that produce output, and whether or not that output could be copyrighted. Would you be able to just give your general thoughts and, you know, concerns about ChatGPT and how it relates to copyright?
[Professor Hoofnagle] 41:48
I think your question raises this basic problem of human creativity. And the purpose of copyright law is to give people incentives to create, to do great things, to invent products, and to write books and texts that we all love and enjoy. And so we have to make a kind of fundamental determination of whether these technologies, or how these technologies, complement or compete against those interests. So you can imagine generative technologies being used by an artist to help develop concepts that then the artist executes. So you know, you make a painting, and you take a photo of it and put it in your computer and create 1,000 variations of the painting, which you ultimately paint. You could imagine people using ChatGPT to improve their writing, to help them articulate things that maybe they couldn’t articulate. And these all would seem to be uses of technology that would be compatible with copyright’s goals of increasing creativity. But we could just as well see a use of the technology to get rid of all the photographers in the world, to get rid of all the artists in the world, and even all the writers, whether they’re novel writers or comedy writers, essentially to substitute for them. And the worry I see here, the concern I have here, is that substitutes don’t have to be great if they’re cheap. If I could have an original piece of art on my wall from, let’s say, Richard Serra, that would cost me $250,000. If I could get something that was kind of Serra-esque that came out of my computer, that wasn’t great but looked okay, I’d be willing to pay $250 for it, or $3. And so I see a kind of world of mediocre substitution, and that could be great. I mean, all you have to do is look at the music world to see this. Look at the mass culture of the music world: all the music kind of sounds the same. This is partly a problem of computer generation and computer optimization of artistic creativity.
[Eric]
Yeah, it definitely seems like the main questions that have been circulating in the discussion are whether a machine learning model or an algorithm can obtain a copyright, and whether a person who inputs data or prompts into a model can copyright the output that is produced from their input. I’m wondering what your thoughts are on this.
[Professor Hoofnagle]
Well, my colleague Professor Samuelson has been writing about this issue since the 1980s. She wrote an article, I believe it was in 1985, exploring whether a computer could hold a copyright. Now you could imagine a situation where one uses a generative technology to create an idea and then the human executes that idea. That clearly would be within protection. But then the other end of the spectrum is, I just let my machine learning model loose, generate a million images, and put them online. We can see how that involves no human authorship, and thus would fall out of copyright. My understanding of the law now is that we would need a statutory change to give things the ability to create copyright; it is a power reserved to people at the moment. But we could change that. You can imagine entire lobbies arising to say that the creative industries of the future are actually these computer models, and for us to make these models, we need to be able to capture the reward, the rent, that I can charge with a copyright. Currently, there has to be a human author involved in the final creation.
[Eric] 46:11
Got it. So I guess that also relates to the famous, or maybe infamous, monkey selfie copyright case. For those of you who aren’t familiar, that was Naruto v. Slater from the Ninth Circuit in 2018, where a monkey used a photographer’s camera to take a picture of itself. And then the question was, did the monkey have copyright in that photo? Could a nonhuman have intellectual property rights? And the court said no, that is a right exclusively reserved for humans. So it’s interesting to see the effects of a decision like that trickling down and carrying weight in conversations about AI and copyright.
[Professor Hoofnagle] 46:56
Yeah, I think the crisis that humans could experience is a world where computers become better than the average person at expression. ChatGPT 3.5 might already be there. If you think about it, what we do in the United States is spend years in high school teaching people how to write a five-paragraph essay. And many people cannot rise to that challenge. They get through high school and they go on to their lives. And that’s fine. They’re not called upon to write as lawyers do, to write as journalists do, and so on. But as lawyers and law students, I think we do not realize how weak the writing and literacy skills of ordinary people are. There’s a survey that comes out from the OECD (Organization for Economic Cooperation and Development) that measures literacy worldwide. And it finds that about half of people in the world have weak literacy. So when a ChatGPT comes out, you realize that it could actually outperform most people. And that in turn raises an interesting issue for lawyers. One of the reasons why people hire us is that they are not good at speaking up for themselves. They need someone else to help them. It’s not even a matter of law. It’s a matter of, I need an advocate. Well, what if ChatGPT can write my letter to my landlord complaining about the leaky faucet or some other problem, and so on? So maybe that’s good or bad for lawyers, and maybe it’s good for people. Maybe people will be able to use this tool to advocate better for themselves and not have to pay anyone to do it.
[Eric] 48:48
I’m definitely also drawn to thinking of ChatGPT as a tool. You know, I’ve heard a lot of people draw analogies between ChatGPT and calculators. And I think that’s great. I mean, I am of the position, personally, that in educational settings, where a lot of teachers definitely seem to be concerned that students are going to use ChatGPT to write essays for them or to do other homework, it would be great if teachers could learn to encourage students to use ChatGPT as they do calculators, you know, to help them with certain assignments. And obviously, that may mean that we need to assign fewer take-home essays, or just assign essays to be written in class. But as you said, the quality of the output is going to be impacted by the quality of the input. So maybe it’s imperative that we teach students to properly use machine learning models as a tool to produce quality results.
[Professor Hoofnagle] 49:50
Your question reveals an aspect of ChatGPT that’s important to surface, and that’s that ChatGPT has been programmed to write in a way that sounds authoritative, even when it doesn’t know what it’s talking about. Well, it actually never knows anything about what it’s talking about. But the model of writing in the active voice is something that ChatGPT teaches people. And that is not an easy model; it’s actually very difficult to write in the active voice. As you read literature, you’ll see that often what makes great literature and great reading is active-voice writing. And so I think we could see ChatGPT as a way of helping people express themselves more forcefully. The underbelly of that is that it doesn’t do a good job expressing doubt, uncertainty, and so on. And so that’s an area where ChatGPT might really lead you astray, by telling you something, and telling it to you in such an authoritative way, that you believe it. And you believe that there’s no nuance.
[Eric] 51:13
So I believe that the ChatGPT website does inform users that the model’s answers aren’t always going to be 100% correct; there’s no guarantee of that whatsoever. But it would seem to me that it’s pretty likely that most people who are using it assume, you know, that whatever they’re seeing is the correct answer, especially if they don’t have a ton of experience with ChatGPT, or experience with computers in general. Is that a legitimate concern for you?
[Professor Hoofnagle] 51:46
Absolutely. And it goes back to literacy. It’s about 14% of the public, and this is tested worldwide, who can ingest information from diverse sources, that is, conflicting information, and make judgments based on it. So there is a deep problem in the world with making sense of information. And there’s nothing that relieves us from the responsibility of actually thinking. No technology takes thinking off the table; you still have to do that. You still have to be critical, whether you’re reading The New York Times or whatever; you still have to use your brain. And these tools might give the illusion that you don’t need to. And many people won’t. So just imagine what the game theory looks like around that. You know, every brand, every product maker is going to want to influence the underlying decisionmaking so that recommendations come out the right way. Every dictator is going to want to alter the system so that the patina of information about them is kinder. This is one of those things where it’s going to be like libraries: ChatGPT is going to have to be like a public trust in order to be right. It’s going to take a lot of investment, and a lot of hard work.
[Isabel] 53:30
Going off of that public trust concept, is there anything that the international community is doing to think about global governance of machine learning technologies like this?
[Professor Hoofnagle] 53:43
Oh, absolutely. The European Union has an AI Act that creates substantive and procedural safeguards, some of which are designed to improve the performance and quality of machine learning tools. Again, this is through the lens of the European human rights regime, which essentially embodies this idea that we have to double down in the age of computing and the age of automated decisionmaking. In an age where, increasingly, there are going to be technologies around us that we can’t understand and that can interfere with our lives, we have to double down on human agency and human rights. That’s the kind of logic of it, and so some people are chafing under this idea that we should regulate this technology. But you know, I teach cybersecurity, and a lot of the focus there is essentially national security and computing. And if you look at the landscape, the last time I checked, there were more than three dozen fully automated, that is, computer-controlled, weapon systems that can kill people. We’ve had them for a long time; we’ve actually had them going back to the 1940s. But more and more nations want such devices and want such capabilities. So we’re going to be entrusting these devices with not just decisions about what ads we see, or what credit card offers we get, or even who we get paired with on a dating website, but also decisions about coercion and the use of force. So that has to be gotten right. And I actually don’t see a way of avoiding it. Because if adversaries adopt the technology, and they are, the Russians, for instance, it means that defenders have to as well. There’s kind of no way out of that dilemma. You have to adopt the technology, if only to understand the adversary. Even if you say, you know, “I pledge not to use this technology,” you want to develop it in order to understand the weaknesses of the adversary. And we’re down that road: there are already systems out there that autonomously find targets, and autonomously decide to destroy them.
[Isabel] 56:28
It sounds like there are some parallels between the machine learning and AI race and what happened with nuclear weapons, and how we have mutually decided, as a society, not to use them. Are there any international agreements in the works, besides the EU’s AI Act, that you’ve seen being discussed or planned?
[Professor Hoofnagle] 56:54
Oh, absolutely. So the U.S. military is very wedded, and I think rightly so, to the involvement of humans in weapons decisionmaking. And it’s a cultural issue. It’s a values issue. It’s not going away. And while the details differ, the formal rule is that humans have to be in the loop of decisions to use force. The facts on the ground, however, are that humans are on the loop. And that means that there are systems where the operator can hand over operation to the machine, and the machine selects targets, and the machine decides to shoot at them, but the human operator oversees it in real time and can turn it off. Those systems already exist. Every ship, every battleship, in the U.S. Navy has such a system. But then the question is, with the rules the DoD has created, will they be tempted further down the road of handing more control over to the device, even to the point where there’s no ability to stop it? There are a lot of really interesting philosophical debates here, because anytime a weapon is used, a kinetic weapon, like you shoot a missile, things can happen, things can change, and you can’t call it back. You’ve already shot it. It’s going off toward its target. Civilians could enter the battlefield, for instance, after launch but before it lands. So there is this idea that anytime one uses a device, there is a moment where there’s a handoff, and there’s luck and chance involved in that time. But we could also see a world where we’re doing that handoff much earlier and in a much more programmatic and larger way. So I think it’s really important now to watch the conflict in Ukraine, to think about how we might start using computing, and how adversaries could push us into more and more aggressive uses of machine learning based systems to make decisions surrounding coercion.
[Eric] 59:23
So in addition to fears that we may have about machine learning models like ChatGPT being used for warfare, what other potential downsides of ChatGPT specifically do you foresee?
[Professor Hoofnagle] 59:43
One clear thing that’s going to happen with ChatGPT is that marketers are going to use it to create every possible narrative frame surrounding products, and even individualized pitches based on those narrative frames. So when we think about things like class action remedies for false advertising or illegal marketing, defendants are going to have very powerful arguments that there’s no commonality. That the ad that you saw was not only different than the ad that I saw, it had a different narrative frame. You know, maybe the car got advertised to you based on how fast it was, but it was advertised to me based on it having tinted windows or something. So we could imagine a world where personalized advertising actually happens. It doesn’t happen now. But we can imagine one where it happens on a literal one-to-one basis, and then there’s no commonality. And then, you know, panning out, you could think about what that means for a society when there’s no commonality of experience. It’s either a utopia or a dystopia.
[Isabel] 1:00:55
That’s a very scary concept.
[Eric] 1:00:57
I think it’s a really scary concept. I mean, think about when politicians start using the technology for more personalized ads. We already saw a little taste of that back in 2016, when the Cambridge Analytica scandal broke, but this new technology, I mean, it could be on a whole new level.
[Professor Hoofnagle] 1:01:18
Yeah. And at least in advertising, we have false advertising law, so we know that there are some things they’re not allowed to say. But when you turn to the political world and the religious world, the First Amendment greatly restrains how the government can regulate inducement and speech. So here’s an example. Imagine what cult leaders will do with this. Right? Unbound by the political machine, because they’re not elected. Unbound by commerce, because they’re creating a personality. You know, just imagine what people who want to start a cult will do with these types of technologies.
[Eric] 1:02:08
Yeah, wow. Hopefully they’re not listening to this podcast, for starters. So, okay, we’ve talked about ChatGPT’s potential downsides. We’ve talked about machine learning and warfare, education, the copyright issues that are brought forth by this new technology, and ChatGPT’s use in everyday life. And just to wrap up this wonderful conversation that we’ve had today, I’m wondering, Chris, what about ChatGPT and similar machine learning models are you personally most excited about?
[Professor Hoofnagle] 1:02:44
I think ChatGPT is going to open up a world of possibilities for people and businesses to communicate with each other better, to understand the world better, and to learn about things that they’re interested in. It’s going to be like the search revolution. And just as the search revolution brought us tons of junk, spam results, scam results, bias, and so on, similar things are going to happen with generative models, and we’re going to have to be on the lookout. I mean, I think the good news is that companies like OpenAI understand it. In the 1990s, there was such a perception of technology optimism that people just systematically ignored downsides. They didn’t even think that there could be downsides. It seems like we’ve grown up a great deal. And that’s going to guide how we use and implement these systems.
[Eric] 1:03:48
Thank you so much for joining us, Chris. It’s been a really insightful conversation about ChatGPT: about its past, its present, and its future.
[Professor Hoofnagle] 1:03:57
It’s my pleasure. Thank you for having me.
[Isabel] 1:04:07
Thank you for listening! The BTLJ Podcast is brought to you by us, Podcast Editors, Isabel Jones and Eric Ahern. Our Executive Producers are BTLJ Senior Online Content Editors, Catherine Wang and Al Malecha. BTLJ’s Editors in Chief are Dylan Houle and Jessica Li.
[Eric] 1:04:28
If you enjoyed our podcast, please support us by subscribing and rating us on Apple podcasts, Spotify, or wherever you listen to your podcasts. If you have any questions, comments, or suggestions, write us at btljpodcast@gmail.com.
[Isabel] 1:04:43
This interview was recorded on February 9, 2023. The information presented here does not constitute legal advice. This podcast is intended for academic entertainment purposes only.
Further reading and references:
Sabrina Ortiz, What is ChatGPT and why does it matter? Here’s everything you need to know, ZDNET (Feb. 16, 2023), https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/.
ChatGPT: Optimizing Language Models for Dialogue, OpenAI (2023), https://openai.com/blog/chatgpt/.
Christianna Silva, Everything you need to know about ChatGPT, Mashable (Feb. 1, 2023), https://mashable.com/article/what-is-chatgpt.
Michael Barbaro, Did Artificial Intelligence Just Get Too Smart?, The Daily (Dec. 16, 2022, 7:00 AM), https://www.nytimes.com/2022/12/16/podcasts/the-daily/chatgpt-openai-artificial-intelligence.html.
Samantha Murphy Kelly, ChatGPT passes exams from law and business schools, CNN (Jan. 26, 2023, 1:35 PM), https://www.cnn.com/2023/01/26/tech/chatgpt-passes-exams/index.html.
Naruto v. Slater, 888 F.3d 418, 426 (9th Cir. 2018).