ChatGPT, Seriously: Real, (Mostly) Non-Cringey Uses That Make Journalism Better
6:00PM Aug 26, 2023
Speakers:
Keywords:
gpt
questions
chat
ai
write
policies
code
journalism
data
give
put
prompt
thinking
started
essays
tool
bit
work
wanted
published
Welcome to this hour's featured session. The show will begin in just a few moments. Please find your seat.
Please take your seats. Our show is about to begin.
Welcome to "ChatGPT, Seriously: Real, (Mostly) Non-Cringey Uses That Make Journalism Better." Please welcome to the stage Ithar Caetani.
Afternoon! How's everybody doing? It's our last featured session. I know everybody has a lot of energy. Hopefully you've all had lunch, have been awake since 6 a.m., are excited, and are excited for our last featured session of the day. I'm Ithar. Oh, I have a title: I work at Bloomberg as news product strategy lead. This is my sixth ONA and I'm running for the board this year, so you should all make sure to vote, and in general read everybody's profiles and vote. And I'm excited to introduce our panel today. First up, we have our moderator, Gideon Lichfield, who is the editor in chief for all editions of WIRED. He left the UK 25 years ago. Good choice. Went to a whole bunch of places before he moved to the US. He was editor in chief at MIT Technology Review, one of the founding editors at Quartz, and did a whole lot of roles at The Economist. Give him a round of applause.
I just want to say I am, as of 5 p.m. yesterday, the ex editor in chief of WIRED.
Do we say congratulations? Okay, congratulations, another round of applause. I'm sure we're excited to hear what happened or how you ended up here. Next up, we have Andrew Calderon, computational journalist at the Marshall Project. He is also an adjunct professor at The New School in New York City, where he teaches data, design, and community engagement. Next up, no introduction needed, but we'll still give you one: Sisi Wei, editor in chief at The Markup. I just thought I'd do, you know, not all the guys and then Sisi at the end. The Markup, obviously, is a nonprofit news publication focused on the impact of technology on society. She was also co-executive director of OpenNews, founded the DEI Coalition for Anti-Racist, Equitable, and Just Newsrooms, was assistant managing editor at ProPublica, and worked at The Wall Street Journal and the AP. All the places I've mentioned, I want to work at all of them. And then our last panelist today, Mark Hansen, who only wanted me to say he teaches at Columbia Journalism School. He's the inaugural director of the Brown Institute for Media Innovation and was a professor at UCLA. To add a little bit: he teaches advanced data analysis and computational journalism, he has a dog called Tig, and he loves landscape architecture. Please give them all a round of applause. I think they're going to put up a link up there where you can add your questions, or you also have the link. Please add your questions throughout the session. I'll be moderating them, and then at the end we'll have a 15-minute Q&A. Thank you.
Thank you so much, Ithar, and thank you to my panelists. Thank you all for being here and for joining us on this last day of ONA. As I said, I am now officially the former editor in chief of WIRED; as I announced a few months ago, I'm leaving to focus on questions around the future of democracy and governance. But first I'm taking a break, so you won't hear anything from me for a while. While I was still at WIRED, early in the year, I became very briefly journalism-famous in a very, very small way, because WIRED was the first publication to put out a policy on using generative AI in the newsroom, and the policy mainly said: don't use it. And to be fair, I was reacting a little bit to the explosion of excitement about it, but also to the very cringeworthy and rather embarrassing thing that happened to one or two news outlets I won't name that started writing stories with generative AI, only to discover they were full of mistakes. And so I wanted to set clear guardrails around what we were doing with AI, mostly so that our readers would know what to expect and would know that our content was still trustworthy, and also so that our journalists had a sense of what the ethics around using generative AI in the newsroom should be. But at the same time, we knew that we wanted to experiment with it and see what it could be good for. How could we apply it in ways that would not expose everything we did to confabulation and hallucination that would introduce errors, or that would just produce mediocre copy, which is what ChatGPT does a lot of the time? So, over the last few months, I've been aware of people in many, many newsrooms doing experiments with GPT in one way or another, to try to see where it can be useful and where it can fit into our workflows. And so I'm really excited to have a panel of people who've been doing that, and been doing it in really interesting ways for serious investigative journalism and analysis. I've asked each of them to talk a little bit about some examples of how they have been applying it in the work that they have been doing and what they've been learning, and after that we'll turn it into more of a discussion about other possible uses, where the guardrails should be, what to watch out for, and some tips for anyone else who wants to try to apply it, and then we'll have some time for questions at the end. So I am going to ask Andrew Calderon to start by talking about what they've been doing at the Marshall Project.
Yes. Okay. So hi. It's very nice to be here with you all. I'm actually going to hop over to the podium. I prepared a couple of slides that I'm going to breeze through. I was asked to put the microphone down. Okay, so I'm gonna put this up already. Okay, cool. So the Marshall Project, for those of you that don't know, is a nonprofit newsroom focused on the criminal justice system in the US. And I want to tell you a little bit about the particular use case that I've been working on, but first I have to tell you a little bit more about the project that it's really been steeped in. We've been covering the issue of banned books in prisons since last year. At this point, we've written a few stories about the issues that we're seeing in the carceral system related to books being disallowed and the ways in which people who are incarcerated have difficulty either retaining or accessing books. And this all started when we got interested in compiling a list of banned books from every single state across the country. We requested that data; we managed to get 36 lists and published 18 of them, which right now are available on the website. The tool looks more or less like this at this point: there's a state selector, there's a little summary at the top, and then you can scroll through or click through all of the different lists in your specific state's department of corrections banned book list. As a part of that request, we also asked for the publication policies for every single state. And as you can imagine, there were many of them: there are 50 states plus the federal system, and we managed to get all of them. And as we were talking to people about what to do with the data and what to do with these policies, what we were hearing was that the data was useful, but that what people really wanted was a way to be able to parse through all of these policies and also compare them. And it's important to note that a fundamental part of the way in which we've approached this work is through a co-design process, where we've been talking to people who are close to the system, like prison educators, books-to-prisons programs, and people who are formerly incarcerated, to get a sense of what their needs are. And so when we got this feedback that people wanted to be able to understand the policies, compare them, and parse through them, it was around the same time that ChatGPT really started getting a lot of attention, specifically version 3.5. And so we were like, okay, maybe this is the tool that we need to be able to address this problem. And I threw Beyoncé into the presentation because I love Beyoncé, also because I have two friends who are at her concert right now in Las Vegas, but also because I actually think that she embodies some of the ways in which we want to approach this work, which is always trying to innovate and try something new, but never forgetting where you come from, and your values. And so as we were thinking about how we could use this tool, with all of the questions surrounding it, all of the risks around it, we decided that we would try a small version, like a 1.0 version, where we would manually write notes on the relevant sections of the policies that people told us they were interested in, and use ChatGPT to produce a summary of those notes, to save us the time of actually writing them up. Eventually, we went back to the people who we spoke to, and they gave us feedback.
And they told us that this was good but not really great, because what they really wanted was to know each of the features, like how to actually think about all of the sections. So we went back to ChatGPT, we thought about that process, we iterated on prompts, and eventually we were able to come up with a more robust description that had set subheads with definitions that we could point to. Then we passed the summaries of the policies back into ChatGPT and asked it to group all of the relevant parts of the policies underneath the subheads. We then had two people go in and fact-check these against our notes and the policies, to make sure that the information was accurate. And along the way, and we'll probably talk more about the process in detail, basically the four things that I took away were these. First, it's really important to document everything that you're doing with ChatGPT when you start a process, to document the prompts. I made the mistake at one point of deleting some of the prompts that I was using, and I realized in retrospect that I shouldn't have done that. Second, it's really important to consider risk analysis. So when we got the lists, when we talked to people, we asked: should we actually release these lists, because maybe they might end up harming people? And while we were thinking about using ChatGPT, that was something that we considered as well, and we asked people's opinions. Third, having a co-design process in place, actually talking to people who are close to the issue that you're interested in, is fundamental, so that you're not taking a solution and applying it to a problem; you know what the problem is, and then you can think about what the best solution to it is. And then lastly, continuing to iterate on the tools and your approach and prompts. Those are four of the takeaways, and I can say more about them as we work through the panel. And if you want to dig a little bit deeper into what our process was, on Generative AI in the Newsroom there's a full write-up of what our process was, the prompts that we used, how we requested information, etc. Hopefully that wasn't too long.
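To make the workflow Andrew describes concrete, here is a minimal sketch of the notes-to-summary step, assuming the 2023-era OpenAI Python SDK (openai<1.0). The model choice, the subhead list, and the function name are illustrative assumptions, not the Marshall Project's actual pipeline.

    import openai  # pre-1.0 SDK; assumes OPENAI_API_KEY is set in the environment

    # Hypothetical subheads; the real ones came out of the co-design process.
    SUBHEADS = ["Content restrictions", "Appeals process", "How to send books"]

    def summarize_policy_notes(notes: str) -> str:
        """Group manually extracted policy notes under fixed subheads."""
        prompt = (
            "You are summarizing a state prison publication policy for a general "
            "audience. Using ONLY the notes below, write a short summary organized "
            "under these subheads: " + ", ".join(SUBHEADS) + ". If the notes say "
            "nothing about a subhead, write 'Not addressed.' Do not add any "
            "information that is not in the notes.\n\nNotes:\n" + notes
        )
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # keep the output close to the source notes
        )
        # Log the prompt and the output so both can be fact-checked later,
        # per Andrew's first takeaway.
        return response["choices"][0]["message"]["content"]

The summary would still go through the two-person fact-check against the original policy before anything is published.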
That's perfect. Thank you very much. I'll come in with one question, and then we'll have more discussion afterwards. The process that you developed: it sounds like it saves you time in the long run for generating these summaries. How tailored did it have to be to this exact project? Or, put another way, how much of this could be reproduced or simply replicated if you were using it for a different set of documents? Would you have to start from scratch?
Yeah, those are interesting questions. So my editor, David Eads, and I kind of have different opinions on this. I think that it saved me time, full stop. Because previously, when I wasn't thinking about using ChatGPT, I was saying, okay, I'm going to have to read through all of these policies, pull out the sections that are relevant to publications, books, etc., because they're often mixed in with mailroom policies, then figure out how to synthesize it and write through it. ChatGPT saved me a lot of time because I was able to just do that manual extraction, pass it in, and then get some summaries that I could then edit and fact-check. My editor thinks that it probably worked out to around the same amount of time, but the distribution of labor was different, right? So instead of putting all of my time into being like, oh my god, how am I going to work through all of these policies, I was able to just focus on iterating, experimenting, and then fact-checking. And in our opinion, wherever you fall on that analysis, it ultimately led to a better product, because we were able to leverage what we're really good at, which is verification and fact-checking, instead of sitting there getting burnt out because I have to write sentences about really boring policy.
Right? Okay. So it's a great example of where it is possible to leverage what humans are really good at and outsource the boring stuff to a machine. Okay, Mark.
All right. Hi. So, I teach at Columbia Journalism School, and a lot of my examples are going to come from the teaching process. Journalists are in kind of a strange position when it comes to things like ChatGPT, or large language models in general, because we're tasked both to report on these technologies and to report with them. They offer us, oftentimes, new capabilities, but we don't want to end up... well, anyway. There's a kind of trade-off that has to be made. I tend to think of these models as prototyping platforms for AI systems. They allow us to try things out in ways that we wouldn't ordinarily have been able to without a lot of hand-labeled data, or machine learning applications that have opaque parameters to set and things like that. These large language models function as sort of multitask learners that allow for a remarkable set of things to happen. My experience with ChatGPT, well, GPT-3, I guess, came in the summer of 2022 with New York State's monkeypox reports. Let's think back to last summer. This is how New York State decided that they were going to publish their monkeypox numbers. It was a paragraph which said that as of July 14, 2022, there are 414 confirmed cases, and so on, and every day someone somewhere in the New York State system had to write this out. And it's of course not particularly helpful if you want to track what's happening day over day, because you have to somehow take it out of these paragraphs and put it into a CSV. There's a lot of fiddly NLP code that you could write to do it. Or you could just say: here's a paragraph summarizing the latest monkeypox statistics for New York State, give the paragraph, and say, please create for me a CSV, and here's what the header looks like. And it just gives you the data. And as soon as it did that, I went: oh, this is certainly worth the price of admission, right? Translating unstructured data into structured data is a thing that we end up doing in data journalism so often. You can give it multiple days, and it will pull things out, with nice date formatting this time as well. You can also do something a little bit dicey here, which is to say: hey, here are a few days, where are the biggest changes in monkeypox occurring? And then it starts to give you a paragraph, and sometimes that paragraph is all right. You can make it a little bit better if in the prompt you add "is monkeypox on the rise" and then you add "let's think step by step." What it will then produce for you is a series of plausible explanations as to how it's coming to its conclusion, even though none of those have anything to do with how the conclusion is actually derived. But it has been shown that if you put something like that in the prompt, "let's think step by step," your answer tends to be more accurate. The next thing you can do is to say, well, let's see: I then asked it to take that CSV that it had created, and the paragraphs, and create a template. We have been, and we felt, comfortable with old-school automation, which was templated stories. So maybe have ChatGPT, if we're not comfortable with it writing directly, maybe we're comfortable with having it write a template for us.
So here in this case, we're asking for a Jinja2 template and some code to run it. So here's the code running, and it'll give you a little paragraph that now you can deploy on your system, because you trust one that's going from data through a template out to the world, as opposed to from data to ChatGPT to the world, where there can be confabulations and that sort of thing. So the kinds of tasks that we've been playing with: translating tasks, structuring data, tagging data, clustering data, writing code, interviewing datasets. The number of newsroom applications is enormous. In terms of translation, even, we can think of translating our stories from French to English or whatever it might be. But you can also view one of the durable problems of data journalism as a translation problem: web scraping. I want to translate from HTML into a collection of JSON objects. And scrapeghost basically does this, and it works really well in a remarkable number of cases, just translating out of HTML into something that's more usable, like a table or a JSON object. So that's the relationship to language tasks. I teach the computational journalism classes, so there's a whole hoo-ha around code and teaching students how to code. We use Colab, and Colab now is going to have a little Generate button at the top, so the students are being invited to have an AI generate a first stab at code for them. ChatGPT already, earlier last month, released its Code Interpreter. So here, for example, I'm basically uploading New York's monkeypox numbers and then asking for a chart of daily diagnosis counts. And it starts by saying: okay, I'll do that, but let me first load in the data and have a look. The nice thing about this particular application is that you end up getting code; it shows you the code that it's running. So the template that we talked about, and now having the system create code for you: these are ways that you can check if something's going to work or not before you deploy it. You can have a look at the code and say, yeah, that looks all right. Of course, for teaching students how to code, it means that they have to know how to code already, but anyway, you get the point: workflows are what's important. And I think that's going to be happening increasingly as you have systems like LangChain, or as OpenAI has done since they introduced callable functions, which basically let you reach out to any API on the web that you want to wrap up, so now the AI has access to a lot more datasets, not just what it was trained on. What you'll end up doing is using the AI as a reasoning tool: I need to do this, and then I need to get this, and then I need to go grab this. And you'll have a record of the different APIs that it called, so you can verify those. Even though there's still a little bit of magic about why it's calling each of those, you'll at least have something that you can check. And to echo, and then I'll finish, Andrew's point about socializing and working together: the work that you're doing in GPT is sort of programming without a programming language, right?
And that means that there's a lot of implicit knowledge that gets built up. Implicit knowledge like: do I give GPT some examples in my prompt? Do I ask it to think step by step? Do I have it preface its answer with "I think," because if you do that, it has a more sincere expression of uncertainty? Are there ways for me to make it a little bit less biased? Or maybe I can just buy 3,000 prompts from some guy on Facebook. But my point is that there's a lot of experience getting built up, and there should be social moments around that experience. Yeah, so I'm going to end there and just also comment that it's National Dog Day, and that's my dog. All right. Thank you.
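To make the data-to-template workflow Mark walks through concrete, here is a minimal sketch, again assuming the 2023-era OpenAI Python SDK; the exact prompt wording, the column names, and the paraphrased report paragraph are assumptions for illustration, not the code Mark showed.

    import csv
    import io

    import openai  # assumes OPENAI_API_KEY is set in the environment
    from jinja2 import Template

    # A paraphrase of one day's report; the real reports were full paragraphs.
    PARAGRAPH = ("As of July 14, 2022, there are 414 confirmed cases of "
                 "monkeypox in New York State.")

    def paragraph_to_csv(paragraph: str) -> str:
        """Ask the model to translate an unstructured report into a CSV."""
        prompt = (
            "Here is a paragraph summarizing the latest monkeypox statistics "
            "for New York State. Create a CSV with the header "
            "date,confirmed_cases and one row per reported day. "
            "Output only the CSV.\n\n" + paragraph
        )
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return response["choices"][0]["message"]["content"]

    # Old-school templated automation: data goes through a template out to the
    # world, so the model never writes the published sentence directly.
    SENTENCE = Template(
        "As of {{ row['date'] }}, New York State has reported "
        "{{ row['confirmed_cases'] }} confirmed cases of monkeypox."
    )

    rows = csv.DictReader(io.StringIO(paragraph_to_csv(PARAGRAPH)))
    for row in rows:
        # Verify each extracted value against the source paragraph first.
        print(SENTENCE.render(row=row))

The design point is the one Mark makes: the extracted CSV and the template can each be inspected by a human before anything is deployed.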
Thank you, Mark. Before we started, he told me he had 10 to 15 minutes of presentation, and I'm very grateful to him for compressing it. A couple of things that I want to call out before we bring Sisi up to the stage. I think what both Mark and Andrew have talked about highlights what I think of as the granularity of where LLMs are useful, and where they're not. Because when they first burst onto the scene, or when ChatGPT first burst onto the scene, everyone was really impressed: wow, you can get it to rewrite a National Science Foundation report in the style of a Shakespeare sonnet, things like that. But if you actually need it to be useful and reliable, it starts to break down. I think what both Andrew and Mark have shown are examples of how you have to get quite specific and granular to figure out which tasks it is actually useful for and saves you time on. So with the Marshall Project, you can't trust it to read reports and summarize them, but you can trust it to turn the notes that a human reader took into comprehensible text. And with the data from monkeypox, you can trust it to extract numbers from a paragraph, but you have to be a little bit more careful if you want it to infer trends in those numbers. There are ways, as Mark was talking about, to tweak the prompt so that it is more reliable at things like inferring trends. And so it seems to me, Mark, that what you're saying is that there is effectively a whole programming language and a programming culture emerging around ChatGPT, but unlike traditional coding languages, it's imprecise. You can't know what makes a prompt more likely to lead to reliable results; it's a bit of an art. Saying "let's go step by step": it's not logical and obvious that that would work. All right, so with that, I'll ask Sisi to come up and speak.
Thanks, Mark, for starting the clapping-for-me trend. Okay, so I will go a slightly different direction. I will share one way that The Markup is using ChatGPT, but I wanted to see really quick, by a show of hands, if you yourself have used ChatGPT for any reason. Just raise your hand if that's the case. Okay, and then I want you to keep your hands up. I won't ask you to do this for too long. I want you to keep your hand up if you have used it for work specifically. Okay, that's pretty good. Okay, thanks, everybody. So the way that we like to think about this is: if you're coming from a world in which journalists have already been using programming, software, and data to do their work, what is ChatGPT here to offer that's different? And I think you've seen some pretty good examples of things it can do faster, though you have to audit the results a little, than if you had to write the code yourself. So, like Mark mentioned before, if you really wanted to get that monkeypox sentence out, you could spend a really long time coding it yourself and eventually get the results, but it can do it a lot faster. One thing that I've noticed is that using ChatGPT for the really serious investigative journalism stuff comes from a place where you also practice using it for the not-serious, fun, personal stuff. That's one of the reasons why I asked this question. So I'll start out with the serious stuff that we are using it for now, but I also want to give an example of not-serious stuff, because like all tools, it requires practice. And if you start practicing with the most intensive, high-stakes thing you could do with it, it doesn't leave you a lot of room to experiment and fail and learn the little quirks. So one of the recent stories The Markup has done is about how, yet again, there's something going on with AI that's not fair and is a little biased. The particular story that we've been following is about this idea of cheating, right? When ChatGPT first came out, everybody was like, oh my god, students are gonna cheat. That has now moved on to this moment in which all of the classic software that's made available to universities is incorporating AI detection tools, right? Like, did the student turn in something that the tool thinks was written by AI? And there was a study published recently, and we talked to some folks and did some testing ourselves as well, about the fact that people who are non-native English speakers are much more likely to be flagged, as if their writing is secretly AI, even though they wrote it themselves. And the implication of this in schools: we even talked to professors who didn't necessarily believe it, but because it was flagged, they had to have a conversation with their student. And students were able to prove again and again that they did the work. We even talked to a professor who had literally worked with the student on the outline of the essay, everything, and then ChatGPT flagged it, or sorry, not ChatGPT, but the specific tools had flagged it as AI. And so the paper itself did an experiment that was basically about: why is this the case?
It compared TOEFL essays, which are the essays from the test that students need to take to prove their English-language proficiency if they want to come to the US and study, to just regular eighth-grade kids' essays in America, and tried to see if there was a difference. Because if you're just thinking about proficiency with the language, is that it? Well, it turns out that essays from eighth-grade kids who are in America right now are flagged way less often than those TOEFL entrance exam essays. And at the end of the day, it was about sentence structure. If you think about how ChatGPT was created, right, it's about the most common language uses that we see over and over again everywhere. And so as you are learning English yourself as a human being and then writing your essays in English, it mirrors more closely how ChatGPT writes English. And so the students were having their tests and essays automatically scored as written by AI when they weren't. So here's where I give you the example of how one could use ChatGPT. Something that we're looking into right now, because we have access to some of these tools, is: what else can we figure out with this? And instead of using it as a data technique per se, what we're looking into is having GPT generate a bunch of essays in different styles for us to test, as opposed to using real student essays, and seeing if, by tweaking certain things, the actual tool itself will give different results. And so this is just a different angle on using ChatGPT: to give you high volumes of content that you can shape, to actually audit a totally separate piece of technology. And that's a lot of the interest that The Markup has had so far in terms of how we can use it. And so with that, I'll also share with you a fun example, because our CEO, Nabiha Syed, is brilliant at using ChatGPT for personal things. For folks out there who have perhaps young children in your life who love storytime and would really love for you to tell them a story on a whim: well, we actually ended up doing a piece on this, called "The Very Hungry Algorithm: Bedtime with ChatGPT." Nabiha has two very young kids, and now they are very into custom stories written for them based on what they're interested in, as bedtime stories. You can tell ChatGPT something like, write me a story about, and then you can Mad Lib it: astronauts, but in 1850, in the style of the Berenstain Bears. And it will write you a pretty delightful story that you can then read. And now they play this game of Mad Libs, of like, what are the two nouns, or what's the noun and the place, right, and then ask ChatGPT to generate that story for them. I share this because I think we need the whimsy in order to also innovate and come up with the sorts of techniques that we want to use on the things in our journalism, and I just thought this was a very, very fun example. And then I'll share one last thing, for those of you out there, especially because we're at ONA. If you're on TikTok, and I feel like the ONA crowd is more likely to be on TikTok, and you want more examples of how anyone can use ChatGPT, or AI overall, to brainstorm and get ideas, I highly recommend going to TikTok for this reason. This particular account: I love Rachel Woods from The AI Exchange. She'll teach you pretty much anything you want,
all in the length of a TikTok video. So I just want to throw that out there as an example of the abundant amount of accessible resources on this. It can be overwhelming, so, you know, don't go down a black hole. But there's a lot out there. And so if you are finding it difficult to translate playing around with it into your projects, I highly recommend just casually browsing this kind of thing. That's it.
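To make the auditing idea Sisi describes concrete, here is a minimal sketch of generating essay variants to test a detector, assuming the 2023-era OpenAI Python SDK. The style list, the topic, and the detector call are hypothetical stand-ins, not The Markup's actual methodology.

    import openai  # assumes OPENAI_API_KEY is set in the environment

    # Hypothetical style axes to vary; a real test design would be driven
    # by the study being replicated.
    STYLES = [
        "an eighth grader in a US school",
        "a non-native English speaker preparing for the TOEFL",
        "a professional journalist",
    ]

    def generate_essay(topic: str, style: str) -> str:
        """Generate one essay variant in a given voice."""
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{
                "role": "user",
                "content": (
                    f"Write a 300-word essay about {topic} in the voice of "
                    f"{style}. Vary the sentence structure naturally."
                ),
            }],
        )
        return response["choices"][0]["message"]["content"]

    essays = {style: generate_essay("school uniforms", style) for style in STYLES}
    # Each essay would then be submitted to the detector under test, via some
    # hypothetical detector_score(essay) call, and flag rates compared by style.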
Thank you very much, Sisi. So, it's important to have fun. All right. Maybe we'll start with this: I'm curious what each of you picked up on or thought was interesting about what the others were saying. What stood out to you, what ideas did it give you for possible future applications or additional things, or questions to each other about the projects you've been working on?
So I really love the way that Sisi framed the whimsy, also because I was having trouble thinking about how to talk about what we did without making it seem inaccessible. And so I just want to give a couple of other examples. For instance, I've used ChatGPT to simulate conversation from different perspectives on an issue. I've also used it to walk me through things: we were thinking about building a really small Slack bot, and I had never done that before. I'm very comfortable coding, and I'm sure if I started looking at the documentation I could figure it out, but I was like, I don't even know how to think about building it. And so I explained to ChatGPT what I wanted to do and asked it to explain to me what the steps would be to do this thing. And that's, let's just say, a genre of ChatGPT use that I've turned to on countless occasions for other things, like putting together a presentation, and maybe I should have used it for this presentation. It's a way of testing my knowledge and also helping me create a meta-structure around some of the things that I'm unfamiliar with, to even get a sense of where to start.
I'll add one more, just because I'm inspired by this. You know, all of you probably understand that headline writing is definitely an art. And while I'm not willing to let ChatGPT read our articles ahead of time and then tell me what headline it would write, as we are writing headlines for investigative pieces, you know how tricky it is, right? Exactly the right word needs to be used. And I'm often like, could we have done better? So after we publish, I will run the published piece through ChatGPT and ask it to write like 10 headlines. And so far it's really just become an affirmative therapy tool, because it can't do better than what we did, and I feel great about that. But the moment that it is able to provide something that I will learn from, that would be valuable too, right? To be like, oh, we could have been creative in this way. And I find it pretty affirming when I'm like, ah, ChatGPT can't get the nuance of why I can't say that word over and over, you know. So I'll just throw that out there.
I can add one. So, for the projects we fund at our institute, there's usually a poster attached every year, like a thumbnail. And this year I wanted to do monochrome landscapes: satellite images that are basically one color, because from the standpoint of generative image work, these are like null spaces, right? There's no feature to them. But satellite imagery is not organized according to color; you can't say, give me all the blue areas or whatever. But you can ask ChatGPT, just for the hell of it (heck, sorry, heck of it). And so: please give me monochrome landscapes, you know, places that are orange, yellow, black, green, white, pink. And it found a place for each of them. Even the pink: there's Lake Hillier in Australia that is entirely pink. And it gave the lat-long, and it explained why it happens. I just want to emphasize here, by the way, that this moment where we're sharing something and trying to come to terms with the technology is an important thing. At Columbia, we're starting to organize get-togethers. Initially they were supposed to be the second Thursday of every month, but it has to be Wednesdays instead, so the name doesn't quite work. But it's just where we get together, and people have their own projects, or maybe we'll suggest one for the evening. And everyone has that moment: you'll be sitting next to someone, and you'll have done something, and you'll turn the laptop and go, look what this just did. And the point, because I'm not some crazy techno-positivist, the point is to start to unlock some of that implicit knowledge, to share something, to ask, well, why did that work the way it did? There's something about socializing that programming process that I think is important.
Well, so here's where I wanted to go with that. My sense so far is that all of the uses that seem useful for LLMs in journalism are, I guess, I don't know if I want to call them incremental, but they're improvements to workflow; they help with a specific bit of a process. Do any of you see something, I don't know, more transformative, that really is a force multiplier for journalists in a serious way, or allows us to do things that we couldn't have done before?
Well, I mean, going back to the example that I gave: I think if you scale that up a bit, it is a force multiplier. So, for example, taking documents that are already publicly available and then translating them for a specific audience, at, hopefully, a reading level that's accessible to them. For example, we're doing coverage in Cleveland, and there's a very high rate of illiteracy there. And so we've been thinking, too, about what it means to create content for that audience. What would it mean to have an audio version of some of these things? Because that's the way in which some people consume news. So anyway, I think translating, the phrase I've been using is "decoding bureaucracy" for people, is a really tremendous application of this. It doesn't force us to put in information that the model might then absorb or ingest and that OpenAI is going to use in ways that we're skeptical about. I think there's tremendous territory for us to cover just in terms of publicly available information that could potentially be a force multiplier.
Yeah. Well, that makes me think of a little frustration I had a while ago. I was trying to figure out what the recycling policy was in Berkeley, California, where I was spending some time. And it's actually surprisingly hard to find out: what do you do with a milk carton in Berkeley? There are dozens of sites that will claim to tell you the answer, and you can't really trust any of them. And to what Mark was saying earlier about unstructured data: there is a huge amount of unstructured data out on the internet, and it is all in different formats and styles on different websites. But something like recycling policies across all of the country's municipalities, if you could find a way to extract that data, that could be a really useful tool. Does that feel like something plausible?
Mark? Yes. Not knowing anything about the contours of that question: sure. I want to underscore that there's a little bit of minimizing going on in the way you phrased your previous question, and I'm not sure that's right. I mean, there are certain types of code now that I will not write again. There's fiddly NLP stuff that I will just not write again, because ChatGPT does it well enough that I don't have to worry about it. I've seen our students, you know, just last year: they started the year in September, Colab notebooks open, or Jupyter notebooks or whatever it was, learning Python. And then over the course of the year, either Codey or Copilot or one of these other coding assists cropped up, and the places that they could get to, education-wise, because they were able to ask questions about errors. Error reporting is horrible, right, the error messages that you get, and so having something clarify what an error means is amazing. And I also think, to Sisi's point about the headlines: there's also something that happens when you play our professional practice off against what this thing does. It throws us back to thinking about how we're doing our jobs. It throws us back to thinking about labor in kind of a fundamental way: what does it mean when a human's doing it versus a computer? Where is inspiration coming from? I think it forces us to stop and think about some of those processes, like you had done when you set WIRED's publication standards. We have to come back and think about how we do things.
Has anything that any of you have seen in the experiments you've been doing given you pause, or made you say: oh, this is something we could do, but we really shouldn't?
I mean, many things. I think, and this is related to the question you just asked as well: there are transformative things that I can imagine, and rather look forward to, that I don't trust LLMs to be able to do today, but that I think they could, right? They just have all sorts of issues. So we're not willing to use it for output currently, meaning the things we actually publish, because it's not good enough for that, and it's proven that over and over again. But it's really good at the ideation phase, the helping-you-process phase, and that is powerful and accelerating. In terms of what I imagine would be really nice: you know, ChatGPT is known for all of the bias issues it has because of what it gets to train on, right? It's just a human problem; we caused that problem. But imagine a world in which you could do at scale the types of things that human reporters have tried to do recently. For example, a couple of years ago, the LA Times published a piece in their food section with the assumption that as a reader, you already knew what certain types of Chinese food were and didn't need that explained to you, and then added the explanation as a second annotation layer, versus the original default of, let's explain this to everyone because our assumption is that our readers won't get it. Imagine if you could do that at scale for all of your journalism: write it for exactly the person reading it, have those iterations of it, have plain-language versions of all of our articles so that they're accessible. Those are things that we could do now, but they are still manually intensive if we wanted to do them for everything, in every way. I can't trust ChatGPT to automate that yet. But I think there is a world in which, as it gets better, as we decide to train it better and create less of that bias and give it more to work from, our journalism could suddenly become way more accessible than it is now.
Yeah. A use that I would be excited for, and that I can sort of conceive being possible, would be a way for a layperson to query a specialized body of knowledge. Meaning, for example, if I want to ask a simple, dumb question about Supreme Court precedent around some kind of issue, or I want it to teach me Kant's concept of the categorical imperative in simple terms, like for high school. Things like that, where instead of having to read through a whole body of knowledge to get familiar with it, you can ask a specific question and learn. Does that feel like something that we might expect?
I think so. Yeah, we've been talking about that at the Marshall Project. Not to say it's something we're ready to do by any means, but just as we've brainstormed, that's something that we've thought about: because the criminal justice system is so fragmented and so vast and so localized, what would it mean to make some of these very complex policies accessible to people and queryable, and to really make it actionable for someone to ask questions? And just to take a step back, because I wanted to add something: with my example of the policies and making them comparable, the reason that even came up was because somebody who's a prison educator said, what would be really great is if I could go to the warden and say, Alabama is doing this, or New York, Alabama, and North Carolina are doing this, why don't we have an appeals policy, why don't we have this caveat built in? And if they could point to where in the policy that's happening, it creates all kinds of opportunities for accountability, just by the sheer fact of making it easier to work through these documents. Not to hammer on the point, but just...
I know Ithar has some questions for us from the audience, but I actually want to ask just one more before we go to that, which comes back to what you said earlier, Mark, about learning to give prompts that make it more accurate, and learning, basically, the implicit coding language of prompt engineering. The question I have is: how do you see a community of practice and of knowledge emerging around using LLMs for journalism? What would be the prompt-engineering equivalent of GitHub, of having code repositories that anyone can tap into to understand how to get results?
I mean, there's a lot of work already happening on GitHub. And it's sort of fun to read through, like scrapeghost, for example, or any of these things that have a call to OpenAI's API at the core, because you search around the repo for a little bit and eventually you find the prompt, right? And it's: you are something-or-other, because you should give it a personality; you would like to do this; I would like you to... And where there's all the Python and all that stuff around it, eventually there's just the thing that gets sent to ChatGPT or to GPT-4 or whatever it might be, and it's just a really beautifully written thing. I will say that there have been a number of attempts by HCI researchers to put some structure, like some programming structure, on this, to make templates for how you would talk to a large language model to make sure that it performs well, so that it starts to be maybe something in between a formal coding language and just natural language. But I'm not sure how that will play out. All right.
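As a minimal sketch of the persona-plus-task prompt structure Mark describes finding at the core of open-source tools, here is one hypothetical template; the field names and wording are illustrative assumptions, not a standard.

    # A reusable persona / task / format prompt template, the shape you often
    # find at the core of open-source tools that call a language model.
    PROMPT_TEMPLATE = """You are {persona}.
    I would like you to {task}.
    Respond only with {output_format}.

    {input_text}
    """

    def build_prompt(persona: str, task: str,
                     output_format: str, input_text: str) -> str:
        """Fill the template; the resulting string is what gets sent to the model."""
        return PROMPT_TEMPLATE.format(
            persona=persona,
            task=task,
            output_format=output_format,
            input_text=input_text,
        )

    print(build_prompt(
        persona="a careful data journalist",
        task="extract every date and case count from the text below",
        output_format="a CSV with the header date,confirmed_cases",
        input_text="As of July 14, 2022, there are 414 confirmed cases...",
    ))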
I believe you have some audience questions.

Yeah, a couple of questions, and feel free, we still have 15 minutes, to add more questions if they've popped up for you. One question from Anna J. at the Thomson Reuters Foundation: do you ever crowdsource a solution from multiple LLMs, like Google Bard, ChatGPT, and others? Or is ChatGPT the only one they should be bothering with for now?
I think you should definitely try it. Not looking at you, because you didn't ask the question; I know that there's a person somewhere else who owns the question, so it's a very disconcerting setup. Hello. I think you should definitely try them, because they have different characteristics. Like, the new version of Claude has a huge context window, right? And so you can try things out that are much longer than you could with the others. So that's important. And I think there was a rumor for a while that GPT-4 isn't, in fact, just one large language model; it's like eight of them that are voting. So it's like a mixture of experts: instead of having one gargantuan model, it's sort of eight smaller models that... yeah, sorry.
Okay, another question, from someone who did not want to be identified, about using ChatGPT at work: smaller things, SEO summaries, subject lines, but leadership still not being on board, and having to hide it from them, "like I'm watching porn or something." What's the elevator pitch to get more legacy, more traditional leadership on board?
That's a good question. Okay, well, I'll tackle this one way, because the core part of this is trying to get at the answer of: why is leadership not on board? Is it because they're worried about publishing inaccuracies? Is it because they don't know enough about the tool? That plays a part. But I'll pick just one thing, which is that a great way to convince people is to show them a way in which it could actually be really useful for them as leadership. So I'll give you an example, and this is something that I can't take credit for. Siri Carpenter, who is here, from The Open Notebook, I'm on her board, and I got to hear her talk about a way she used ChatGPT that I thought was brilliant. She was looking into fundraising and who she could ask to be a funder of The Open Notebook, which is all about creating resources for science journalists. She knows who her current funders are. And she asked ChatGPT to basically go look for all of the funders that have a high chance of funding The Open Notebook, and then rank them by how close a fit they might be. And then she could validate it by seeing whether the people who already fund her were at the top, and that was the case. So then she could go down the list and research everybody, right? That's a brilliant use of it. And I think if one of the blockers for leadership is that they don't understand it, etc., you can figure out how to get past that, but having a good example of how leadership could use it for their own job might also do that same piquing of curiosity that allows people to open up a little more.
Also, just to plug: we put together a resource doc that's designed around needs, and there's a specific section for exactly this use case, with a bunch of links that we've thrown in there. I don't know where to find the link to that resource doc, but it exists.
I think we will post that link up. I think it's on the Q&A page, right? Yes.
The other thing is, you have to remember that the stuff you put into one of these systems is going out of your control, right? It's going to a company, and that company has published rules about how it's going to use your data, or whatever; it may use your data to improve its functions and what have you. And so employers may have really good reasons why they don't want you using these things for work applications, because you're sending valuable data off. That's something to consider.
Yeah, 100%. One way, potentially, is to help figure out how to define those things. For example, I would never let our reporters put our drafts into ChatGPT, right, because I have no guarantee about what's going to happen once that is out there.
Yeah, I mean, I had a little experience recently where I wrote a piece, and then, just for kicks, I put it through GPT-4 and asked it to edit it down, because I wanted it to be shorter. And the first thing I learned is that it can't count. I said, edit this down to 525 words, and it gave me 730. Or actually, I think it was the other way around: it edited it down to 330, and I said, okay, I want this longer, and it gave me about the same length. And I said, okay, edit it down to 700 words, and it gave me 450. So I had to approach the number by guesswork. And the second thing was that the edits that it gave me, first of all, had mistakes in them, because it misinterpreted some of what I was saying, and also it wasn't quite in my voice. So I didn't accept any of the edits. And yet it was useful, and the reason it was useful was that it caught places where my phrasing was flabby or used too much passive voice, and I was like, oh, you're right, I would change that. Let me just think what I would change it to, rather than what the machine would change it to. So maybe have leadership run their things through it, and then use that as a self-critique.
If I could also just add one more strategy. So I wasn't dealing with this culture problem at the Marshall Project at all, but the way that we got to the point where we now have a working group and we're talking about guidelines is that I would just collaboratively work with reporters and other people in ChatGPT, and we would do reporting exercises together, like generating lists of sources. And so part of how we got to where we are now is that there was enough excitement from the different people that I was talking to, who were part of the conversation, that it was something that needed to be addressed at a larger level in the organization, and guardrails needed to be set around it for experimentation, etc. And so that could potentially be a strategy.
A good segue, I guess, to go beyond reporters, to students. From Laila, a student at university: what's the use case for them? How would you recommend students learn how to use or experiment with ChatGPT if they're still early on in learning about the principles of journalism to begin with? And should they be?
I mean, I think this was mentioned before, but: having a project to start on. I'm not the words person, so I'm not going to be the one to say, oh, have it rewrite my, what did you call it, flabby prose, a horrible thought. So I'm not likely to lead with one of these editorial or language-based tasks, but instead: here's some data that I would like to think about; here's a set of regulations about recycling; how would I deal with that? Come to it with a project, and then see how it proves useful or not. Even asking it: how would I approach this project? It'll give you a set of things. It's usually not very inspired when you get to that point, but it at least shows you stuff you would have thought of, and then you feel a little better about it, and you get to see how it works. And I would also play with the language a little bit. These sorts of non-programming programming languages, like natural language, are a really horrific interface, right? Small changes in words can have big changes in output, and that's just awful as a thing. So figuring out where there are sensitivities, figuring out that sort of thing, is another exercise. And sometimes even asking it, how would I make this better, and seeing if you can get something that falls more in line with what it's expecting. And I'm not anthropomorphizing; it doesn't actually expect anything. It is a constructed, nonliving thing.
So I brought it into my classroom almost as soon as it was launched, or 3.5 was. And it reminds me of what you were saying about play. The class is about design and community engagement, and everyone had to have a final project at the end. But initially, the introduction was just: here's this thing, sign up if you want to, because some people felt really uncomfortable about it, and I wanted to respect that too. And then: just play around with it. And then as a group, we just talked about what people were putting in and what they were getting. There were 18 people, and by the end there were about four or five who felt so excited about it that they started figuring out ways to build it into their projects, for ideation, for coding help. And so they themselves saw the utility and thought about the problems that they could potentially throw at it, or sometimes I would nudge them if they were excited about it.
Yeah, I'll say one more thing for students, which is: I highly recommend using ChatGPT to do things that you currently use other tools for. For example, instead of using thesaurus.com, use ChatGPT. I recommend this because when I'm futzing with language, I frequently want something specific to happen, and I'm looking at the thesaurus or something like that to try to find the right word, because I'll know it when I see it. And what I've started using ChatGPT for is: okay, I'm writing a sentence, maybe it's for an email or something, and what I really want is three words that alliterate, but I'm missing the last one, and I know what that word is supposed to mean. And I'll ask ChatGPT for like 10 options for what word could go there but starts with M, right? There's something about the plainness of it that I think will help, and for students especially: across all the random things that tech already does for you, just try using ChatGPT for some of them, because you'll get different ways to customize it, and different funny things come out of it. For my staff as well, one of the most fun pastimes is just purposely getting ChatGPT to make mistakes. It's really easy if that's your goal, and they love posting the results. Like, it's just, yeah.
Do we have one more question? We have many, but I know we're coming up on time. So maybe one more. Can you recommend a few specific ways to analyze data in a spreadsheet if you're particularly looking for outliers, patterns, or any other helpful reporting leads that a human could then further explore?
So, I had mentioned the ChatGPT plugin Code Interpreter. It allows you to ask questions of a data set through just basic natural language: you know, make a plot for me, or what is the average of this, or do a Wilcoxon rank test. Anyway, there's a variety of things that you can just ask it for. And the nice thing about Code Interpreter is, as I mentioned, it produces some Python. So it's taking your request, there's some text that comes back that says, I think you're looking for this, and then it produces some code. And it's always better when you have this thing produce some code, because then you know: all right, I can see what that's supposed to do; I can work through it. So there's Code Interpreter; that's going to cost something, you have to at the moment sign up for ChatGPT Plus. There's also, if you're a Python programmer at all, PandasAI, which allows you to use the pandas formalism as a base on which to ask questions, again in natural language, about what does this look like and what's happening here. PandasAI is nice because it stays wherever your computation is, as opposed to Code Interpreter, where you have to, again, ship your data off to OpenAI. Then again, if you're using Colab, you've already shipped your data off to Google, so I don't know who's better at this point, right? It's a massive roshambo.
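As a minimal sketch of the kind of outlier pass a tool like Code Interpreter typically generates for a question like this, here is a plain pandas version; the file name and column names are hypothetical.

    import pandas as pd

    # Hypothetical file with columns: date, confirmed_cases
    df = pd.read_csv("monkeypox.csv", parse_dates=["date"])

    # Flag days whose day-over-day change is more than three standard
    # deviations from the mean change: a common first pass at outlier hunting,
    # surfacing leads a reporter would then go explain.
    df["change"] = df["confirmed_cases"].diff()
    z = (df["change"] - df["change"].mean()) / df["change"].std()
    outliers = df[z.abs() > 3]

    print(outliers[["date", "confirmed_cases", "change"]])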
Okay, well, we are out of time. But I want to thank Mark and Sisi and Andrew very much for this panel.
The resource list is not in the app right now, but if you go to ona23.journalists.org and look for the session, you will see a link to the resource list, which has a bunch of links that these folks have put in, with all sorts of useful information if you want to put ChatGPT to work in your newsroom. Thank you. Have a good day.