Beyond Words: Mastering Data Storytelling for Impactful Reporting
6:00PM Aug 23, 2023
Speakers:
Keywords:
data
charts
flourish
story
project
crime
newsroom
questions
put
reporting
reported
database
data visualization
case
visualization
journalism
bit
journalists
people
understand
Good afternoon, everyone. Welcome to the Beyond Words: Mastering Data Storytelling workshop. I'm Randy Bennett, the Executive Director of External Relations at the University of Florida College of Journalism and Communications. We are one of the sponsors of the session, along with Flourish. The workshop today is intended to provide insights on using technology to tell data-driven stories. In the first half of the workshop, we will share case studies on how professional and student investigative journalists identify, collect, mine and present data to tell powerful stories. The second half of the workshop will be an interactive presentation focused on data visualization tools from Flourish and Canva, which make it easier to create interactive charts, maps and more without the need to code. They will be using real datasets, including one from a case study today. The University of Florida also sponsors the ONA's data investigative journalism award. Our panelists in this first hour include an award winner from last year, a nominee from this year and one of our students. We will take questions at the end of the first hour of the workshop. Our first presenter is Steven Rich, database editor for the investigations unit at the Washington Post. The Post won the ONA data investigative journalism award last year in the large newsroom category for an investigative series examining how much cities pay to resolve police misconduct allegations. At the Post, Steven has worked on investigations on virtually every beat, including stories on the National Security Agency, policing, tax liens, civil forfeiture, school shootings and college athletics. Please welcome Steven Rich.
Hi, everybody, my name is Steven Rich. I'm the database editor for investigations at the Washington Post. I've been at the Post for a decade now doing data work for investigations. This past year, we won the ONA data journalism award, and I want to walk you through what the story was, how we did it and how we approach these things. If you have any questions, please hold them for the end. I will make a personal request there too: if you do have questions, please come up to the microphones here. I am hard of hearing. I can't hear you unless you do that. So the story that we worked on came from this broad idea: I want to know how often police departments settle alleged misconduct lawsuits, and for how much money. The story itself has been done before. People in various individual cities have collected this data for those cities and said, oh, they paid out this much in this many lawsuits. But the question that I wanted to answer started with: who was the most costly cop in an individual city? And I came to the realization pretty quickly that it's not a really good measure, because the most costly cop is the guy who did one awful thing and cost the city $20 million. The reality is you have some police officers who have these settlements all the time, and those are the people we really wanted to get at. So we started this project because we had been doing this forever. We started with collecting a database of every fatal police shooting across the country, a project that is now in its ninth year. We moved on to cops who were fired and forced to be rehired through arbitration. And then we did a project on the geography of unsolved murders in major cities. And so this was a continuation of this long project.
And I really looked to a story that I did in the first year of the police shootings project. We wanted to know how many officers who shot and killed someone in 2015 had previously shot and killed someone, and we found 55 in just that year alone. And so we were like, repeaters is kind of our business. Keith Alexander, who's the lead byline on this story, was the reporter on the story that won last year. He and I get together every few years for a story like this, which is just about how cops are doing bad things over and over again. So the major findings in this: we found $3.2 billion of settlements across 25 large police departments over a decade, and the big finding was that half of that money was spent on officers who had been involved in multiple claims during that time. We found 7,600 officers in those 25 departments who had been involved in multiple payouts, 1,200 of whom had been involved in at least five payouts over the course of 10 years. There was one police officer who, at the time that we reported this, had 143 settlements made on his behalf. He works for the Philadelphia Police Department. You won't run into him; he is on permanent desk duty as they resolve the final 50 cases that are still pending against him. So I want to go over how we did this and give you an idea of what it takes to do something like this. I filed 50 open records requests with the 50 largest police departments in the country, and then just kept following up. I set reminders for myself. I put things on my calendar to make sure that folks were hitting deadlines. We didn't get all of them. Some of them denied us. Some of them wanted way too much money. At the end, we were able to get 25, so we were able to get half of the datasets and standardize them. The hard part is that most police departments don't actually track who is involved in the payouts that they make.
They just know that there was a payout, and some of this is because they don't want to know. If you don't know that the officer has five payouts, you don't have to fire him. You don't have to punish him. And so I had to go through 22,858 court cases by hand to pull the information, most of which were on PACER, so I couldn't scrape it. It was just a lot of elbow grease. And then I basically found the top officers and sent them to the reporter, who then did a bunch of reporting, and we reached out to every department to get comment from the officers who had the most in each city. I want to talk about the data issues, and this is one of the things that you will come up against no matter what kind of dataset or story you're dealing with. Obviously the biggest one I've already talked about, which is that the cities didn't give us the names of officers. But the hardest thing I have to deal with on pretty much every project is that I get data in all sorts of formats. So I got some Excel files. Those are great. I love to get Excel files. CSVs, those are great. I get most of them in PDFs. They're awful, but they're not as awful as they used to be. There's a lot of really good software to get data out of PDFs, unless they send you an image PDF, which you can't just pull the data straight out of. And then my favorite was a department that printed out 655... 605 pages of a spreadsheet and mailed it to me. Getting that into a usable format was fun. Some of the data was missing fields. When you're trying to build this Frankenstein data monster, you end up with that, especially with police departments. There's almost no standardized data from department to department. Even the data that the FBI collects is terrible, at least in part because a lot of cities don't put data into it.
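The repeat-officer tally Steven describes (officers with multiple payouts, and the share of settlement money tied to them) can be sketched roughly as below. This is only an illustration under stated assumptions: the record layout, the field names `case_id`, `officer` and `amount`, and the sample rows are all hypothetical, not the Post's actual data or method.

```python
from collections import defaultdict

# Hypothetical hand-coded records: one row per (case, officer) pair.
# Field names and values are made up for illustration only.
payouts = [
    {"case_id": "A1", "officer": "Officer 1", "amount": 50_000},
    {"case_id": "A2", "officer": "Officer 1", "amount": 20_000},
    {"case_id": "A3", "officer": "Officer 2", "amount": 1_000_000},
    {"case_id": "A4", "officer": "Officer 3", "amount": 10_000},
    {"case_id": "A4", "officer": "Officer 1", "amount": 10_000},
]

def repeat_officer_summary(rows, threshold=2):
    """Return (repeat officers, money in their cases, total money paid)."""
    cases_by_officer = defaultdict(set)
    amount_by_case = {}
    for row in rows:
        cases_by_officer[row["officer"]].add(row["case_id"])
        # A case with several named officers is still one payout,
        # so its amount is stored once per case_id.
        amount_by_case[row["case_id"]] = row["amount"]

    repeaters = {
        officer
        for officer, cases in cases_by_officer.items()
        if len(cases) >= threshold
    }
    # Collect every case that involves at least one repeat officer.
    repeat_cases = set()
    for officer in repeaters:
        repeat_cases |= cases_by_officer[officer]

    repeat_total = sum(amount_by_case[c] for c in repeat_cases)
    total = sum(amount_by_case.values())
    return repeaters, repeat_total, total

repeaters, repeat_total, total = repeat_officer_summary(payouts)
print(sorted(repeaters), repeat_total, total)
```

Counting each case once, even when several officers are named in it, is a design choice in this sketch to avoid double-counting settlement dollars. In the sample data the single costliest officer (one $1,000,000 case) is not a repeater, which mirrors the point that "most costly cop" is a poor measure.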
One of the things we ran into is that some people sue and win money from police departments without ever knowing the name of the officer, and without the police department ever having to turn over the name of the officer. So our numbers are low. Beyond the 7,600 officers who were in there, there are almost certainly more who were just John Does in our data. We often needed secondary data. When we got names, it often was not names. It was badge numbers, and then we had to make requests for badge numbers to match up with those things. And then we had to figure out when they were working there, because multiple people can have the same badge number if they're working at different times. If somebody worked a decade ago and somebody is working now, they may have the same one. So you have to figure out who is who and make sure not to mix those things up. And often the cities and the police departments had conflicting data. We'd ask both of them, like the finance department and the police department, and they'd both say, here are our settlements for 10 years, and they'd have different numbers. Having to reconcile that was tough. A lot of this is just that you put your head down and you do the reporting. You figure out which one's right and which one's wrong, and you get there. So I want to talk generally about why we use data and why I find it so good. I think the biggest reason is it adds context. I'm never going to be the person that sits here and says, oh yeah, I want a hard lede with numbers in it. That's just going to bore our readers. I love a good story. I love a good anecdote. What I like to do is have an anecdote and then write one sentence after that anecdote, which is basically: and this is how often this is happening. Because when you can provide your readers with that context, they start to realize that you're not just writing about the one person who is in this really awful situation.
You're writing about something that's systemic, and that gives us scale, and the more scale that you have, the more likely you are to get results from your reporting. I also love data because it helps find anecdotes. I'm able to use the data to say this is the officer you want to go after, whereas we wouldn't know that ahead of time. We never would have known about the officer in Philadelphia, because he's never been written about. So the data really helped, and generally I just think it makes the story better. It differentiates you. Not everybody is using data, so that's good. It helps focus the reporting on what are the trends and what are the outliers. And I don't do the visualization stuff, but I find visualizations very helpful for most readers. I am more of a numbers person. I look at numbers and I can understand them. But I have been told repeatedly by my boss that that's not how other people look at things, and so we visualize them. And it often helps to be the skeleton of the story. The thing that I want to stress here is that the most important part of data journalism is the journalism. If you just put a whole bunch of numbers on a page without any sort of context, without any understanding of why the numbers are the way they are, you end up giving readers an incomplete picture of what's going on. So you need to be doing the journalism around the data work. And I find that the data work itself is basically just journalism.
I use datasets as sources. My job is to ask those sources questions, the questions that get the answers that we need, but also the answers that are accurate. In this case, the data gave us the roadmap for all of this, and it allowed us to ask pointed questions and to do something that no one else had done. It really gave us a lot of extra oomph in the story that we otherwise couldn't have had. My job at the Post has often been being the data Frankenstein maker, as one of my colleagues put it. I like to go to a bunch of disparate agencies that are all collecting the same data and put it together to get a national look at things because, frankly, the federal government does a really poor job of keeping data on most things. Policing is a huge one. And so I find these projects to be really gratifying. I want to leave you with a few resources that I used to do this. PACER, if you don't know, is the federal courts website. It currently costs money to do pretty much anything on there, including searching, but, knock on wood, it will maybe be free at some point in the future. I don't know. I use state and local court sites. Most jurisdictions have their court cases online, and you can search for things. One of the notable places that does not, and if any of you are from there and do courts reporting, you know, is Cook County: Chicago, Illinois, does not have its court cases online, which makes everything harder when we're trying to report about that. I always go to open records portals. You never know what somebody has asked for before you, and you don't need to file a FOIA if somebody has done that and it's been published already. Open data portals are the same way. You can find really weird datasets on open data portals. Not all of them are helpful.
I'm pretty sure the U.S. open data portal has almost no good information for journalism. A ton of the stuff is from NOAA. In this case, we used news sites. Using Nexis and using Google, we could find stories on these things, and they were able to give us a push in the right direction and let us check who had done what on this. But I will leave it there and pass it on to Agnel. I think we're happy at the end of this to answer questions. We're trying to leave space, especially if you have questions about how to do some of this stuff from a technical standpoint.
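One of the data issues raised in this talk, that a badge number can belong to different officers at different times, lends itself to a small sketch. The roster layout, field names and dates below are hypothetical, and this is only one way the matching might be approached, not a description of the Post's actual process.

```python
from datetime import date

# Hypothetical roster obtained through a records request: the same badge
# number is reused by two officers in different periods.
roster = [
    {"badge": "1234", "name": "Officer A",
     "start": date(2005, 1, 1), "end": date(2012, 6, 30)},
    {"badge": "1234", "name": "Officer B",
     "start": date(2013, 1, 1), "end": None},  # None = still employed
]

def officers_for_badge(roster, badge, incident_date):
    """Return the names of officers who held `badge` on `incident_date`."""
    matches = []
    for record in roster:
        if record["badge"] != badge:
            continue
        ended = record["end"] or date.max
        if record["start"] <= incident_date <= ended:
            matches.append(record["name"])
    return matches

print(officers_for_badge(roster, "1234", date(2010, 5, 1)))
print(officers_for_badge(roster, "1234", date(2015, 5, 1)))
print(officers_for_badge(roster, "1234", date(2012, 9, 1)))
```

Returning a list rather than a single name is deliberate: anything other than exactly one match (no officer, or two overlapping records) is a signal to go back and report it out by hand rather than guess.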
Thanks, Steven. Our next presenter is Agnel Phillip, data reporter for ProPublica, who has covered child welfare, Indigenous issues, flight safety and criminal justice, and who previously worked as a data reporter at the Arizona Republic. ProPublica is a finalist for the data journalism award this year and was the winner in the small/medium category last year for an investigation into hot spots of toxic air pollution that the EPA has allowed to take root across America. Please welcome Agnel. Thank you.
Thanks, Randy. So a decent chunk of my presentation is going to be about the story that's nominated, but I also want to run through, conceptually, how we think about approaching a data project, especially from an investigative perspective, because in 15 minutes there's only so many hard skills I can leave you with. So I'm at least giving you a roadmap: when I evaluate a project that is pitched to me, or I'm thinking about a project, this is how I think about the data within it. You'll notice there's a lot of crossover with what Steven just presented. I apologize for that, but it'll be good reiteration. What Steven was saying at the end of his presentation is the takeaway I want you to have from this: data is just like any other source. When you are doing data analysis, you are interviewing a dataset, right? So it's important to be humble and realistic about the potential of a data analysis, because just like sources, data, I won't say lies, but it misrepresents, it confuses, and it doesn't necessarily show you exactly what the truth is on the ground. But it is important, because you can show scope, and in some cases you can uncover impropriety. You can actually find a story in a dataset. It's hard to do, as I'll get into, but you can do it, and it is an important and growing part of investigative journalism. So here are some universal truths that I live by when it comes to data and data analysis. First off, all data is bad. A large part of the work of data journalism is figuring out how bad it is, and whether its badness is relevant to your specific project. Because, like I said, data is like any other source, and it's almost exclusively inputted at some point by a human. A human touches data in some form.
Humans make mistakes, humans lie, humans screw up, and this all translates into the data in some form or another. So you can't just take the results of an analysis at face value, especially if you're doing a very specific analysis. The process of how you figure that out looks a lot like traditional journalism. Like Steven was saying, the journalism part of this work is very important. You have to talk to people. You have to talk to the custodians of the data, whether that's a government agency or a nonprofit, whoever manages and inputs the data, to really understand how a dataset is created. But you also need to talk to experts in the field who use that data or who study it, so they can help you put it into context and help you identify where the shortcomings are. And all of this is to figure out whether the data you have and the analysis you have match the on-the-ground reality, because, like I've alluded to, with data, if it looks like a duck and it quacks like a duck, oftentimes it's not actually a duck. So you do have to really do the work to figure that out. And if you're going to use data, especially in an in-depth project, you have to incorporate it early if you want the best return. So this is how I think about it. I don't know that you'd find this in a data journalism textbook or anything, but these are the two ways you approach a data project. I call them big to small and small to big. Big to small is essentially starting with the data analysis. You have a dataset, you've been given a dataset, you do some sort of number crunching on it and you're like, oh, here's the story. I'm going to do the story about X, Y and Z. That's how I think about big to small.
Small to big, which is more of the stuff that I end up doing in investigative work, is where you start with a tip or a story idea, something a little bit more specific, because it allows you to go out and find the data, and then you try to show how often something happens. You try to assign scope to it. And obviously there's a whole spectrum of scenarios in the middle. Projects will look like 25% of one of these approaches and 75% of the other, or vice versa. But the ultimate goal is to have your analysis and reporting complement each other. So here is a little bit more of a description of big to small: like I said, mining a dataset for stories. I find this very hard to do in an investigative context, because you don't know what you're looking for. If you think about it as a reporter-source relationship, sometimes you don't know the question that you're actually trying to ask. Oftentimes you'll get a dataset and you just do these very high-level summary analyses, and it's not actually that meaningful. Whether because of weaknesses in the dataset or in your analysis, whatever data point you get out of it is not actually that important. But if you are able to ground-truth it, which in this case means you report out those examples, there can be a use for that. Take the project that Randy mentioned that won the award last year, from our publication, about toxic air pollution. That was a novel analysis. It was taking a huge dataset of air pollution throughout the country and saying, we're going to identify all the hot spots and we're going to do a story about that, because that's inherently newsworthy. It's inherently investigative. We're just going to go forward with that. I can talk more about some of the challenges that they ran into on that project.
I didn't personally work on it, but because they started from a big perspective, they did run into some challenges when they actually got to the ground to figure out what the reality was. But that is one way to do it. Another example from my publication that you may have seen is the Secret IRS Files. If somebody comes to you and gives you a database of confidential tax records, pretty much anything you do with it is probably going to be a story, right? So those are the two examples that we've done that follow that approach. The more common one in our line of work is to start with the anecdote. Start with the story idea, because you already know there's a problem to report on. Your journalist instinct has kicked in and you're like, I think this is something that needs to be reported on. This can help you make your analyses much more sophisticated, much more specific and targeted, and it makes it easier to ground-truth everything and make sure the reality matches up with what your analysis shows. Where you run into problems with this approach is when you try to add data in the middle or at the end of a project, which often happens because people think, oh, we're just going to do all this reporting, and then the data will come in and help make the point or whatever. Don't do that. Even in this case, if you want data to be a part of the project, or it seems like scope is something that you're going to want to show, bring the data folks in early. All right, now I'll actually talk about the stories that I was brought here to talk about. The series that is a finalist for the award had a mix of both of these approaches, which is why I wanted to give you that framework to think about. There's the small to big and the big to small, and I thought they were both successfully executed for their individual projects.
I was the data reporter on pretty much all the stories in this. The project was called Overpolicing Parents. It was a collaboration with NBC News about the child welfare system in the United States. I'm going out of order on the bullet points here, but the idea and the frame that we had at the outset was this: there's been a general awakening and awareness about racial disparities and issues in the criminal justice system, and a fresh skepticism about various institutions and people within the criminal justice system, that has really taken off in the last 15, 20 years. But that same awakening has not necessarily happened in child welfare, which is not technically a criminal process but in many ways functions very similarly. So our idea was: can we break this system down into its component parts and use data to say what is going on at each of those levels? Because often when you see a child welfare story, and I know you've all seen this, you'll see the outlier example. You'll see the one that makes the news about horrific abuse, or kids that are in foster care for a really long time, or whatever, but it's not actually indicative of what the reality is for a lot of people who get involved in the system.
So to do this, we got a huge dataset from the National Data Archive on Child Abuse and Neglect. I'm sorry to not put the name of that on there. It is a national dataset where every state reports its child welfare cases up into this huge dataset. The thing is, you have to get a research license to use it. This is a very fruitful area, I think, for data journalism going forward. There are a lot of datasets that have traditionally only been used by academics and researchers, but journalists are functionally not really any different from them, so oftentimes you can get access to those types of datasets. You do have to read the fine print, because, for example, the license agreement for the datasets we used said we had to refer to the datasets in a very specific way, credit them in a very specific way and whatnot. So just read the fine print on that. And they were very big datasets, so without coding knowledge it was impossible to analyze them.
And this was a partnership project. We don't have to get into that. I don't know if you've worked on partnerships. They are great, but sometimes they are challenging. I'm happy to answer more questions about that later. So, the example of small to big within this one: the first story in our series was about mandatory reporters. If you don't know, there are entire classes of professions in a lot of states that are required to report cases of suspected child abuse or neglect, which sounds like a great idea in theory. If you see something, say something. Pennsylvania, which had the Jerry Sandusky scandal in the last decade, decided that they never wanted this to happen again, so they were going to expand the number of professions under this mandatory reporter category and make the penalties a lot stiffer. So we wanted to see: does it actually work? Does mandatory reporting work? This was a good natural experiment. I'm going to move ahead through a couple of these things for time's sake, but I'm happy to talk more about them. Lo and behold, we found through our analysis that when you expand mandatory reporting, and you really make it so that people have to report suspected cases of child abuse and neglect, what you do get is a lot more reports. What you don't necessarily get is a lot more substantiated reports. In this case, for sexual abuse, there were no more substantiated cases in the five years after the reform happened than in the five years before, even though there was a significant increase in the number of reports. So this is small to big, because we started with that idea: how did this law change play out in Pennsylvania, and what effect did it have? And so I was able to target the analyses of those datasets in a very specific way.
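The before-and-after comparison described here (five years before a reform versus five years after) can be sketched as follows. Everything in the example is an assumption for illustration: the yearly counts are invented, the reform year is a placeholder, and the field names are hypothetical. It is not ProPublica's analysis or data.

```python
# Invented yearly counts, for illustration only.
REFORM_YEAR = 2015  # placeholder: first year counted as "after"
yearly = [
    {"year": y, "reports": 100 if y < REFORM_YEAR else 180, "substantiated": 20}
    for y in range(2010, 2020)
]

def window_totals(rows, reform_year, window=5):
    """Sum reports and substantiated cases in equal windows around a reform."""
    totals = {
        "before": {"reports": 0, "substantiated": 0},
        "after": {"reports": 0, "substantiated": 0},
    }
    for row in rows:
        if reform_year - window <= row["year"] < reform_year:
            key = "before"
        elif reform_year <= row["year"] < reform_year + window:
            key = "after"
        else:
            continue  # outside both windows
        totals[key]["reports"] += row["reports"]
        totals[key]["substantiated"] += row["substantiated"]
    return totals

def pct_change(before, after):
    """Percent change from `before` to `after`."""
    return (after - before) * 100 / before

totals = window_totals(yearly, REFORM_YEAR)
reports_change = pct_change(totals["before"]["reports"], totals["after"]["reports"])
substantiated_change = pct_change(
    totals["before"]["substantiated"], totals["after"]["substantiated"]
)
print(f"reports: {reports_change:+.0f}%, substantiated: {substantiated_change:+.0f}%")
```

Using equal-length windows on either side of the change keeps the comparison like for like. A simple sum like this says nothing about why the numbers moved, which is where the reporting comes in.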
The flip side of that was a story we did about termination of parental rights. For those of you who don't know, if you're in foster care and you're going to be adopted, parental rights have to be terminated before the adoption can go through, and it's the end of the legal relationship between a parent and child. We had an idea that this was something we wanted to report on, but we didn't really know what, so we figured we'd just count where TPRs, as they are acronymized, happen most often and where they happen most quickly, and wherever sticks out to us in the data, that's where we'll go. So that's what we did. And West Virginia was an outlier on both of those metrics, so we ended up doing reporting there. It worked in this case because the frame of our story was a little bit more explanatory. The project was, we're examining the child welfare system, not, we are trying to root out a specific case of injustice or whatever. So it was able to work a little bit better in this case. The challenge was we didn't have any anecdotes, so we had to build all that from scratch. We knew we wanted to go with West Virginia, and then we had to go and find those cases, which, if you've ever reported on child welfare, you know is a very difficult thing to do, because you don't really have access to documents and such. But I thought we got two pretty good stories out of it. Here's the nut graf, at least of the national figures, along with some of the findings from West Virginia. So I'm going to hand it over to Zack.
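The "big to small" step described here, ranking places on two metrics and seeing which ones stand out on both, might look something like the sketch below. The state labels, the metric names `tpr_per_1000` and `median_months`, and every number are hypothetical placeholders, not figures from the story.

```python
# Hypothetical per-state summary; all values are placeholders.
states = [
    {"state": "State A", "tpr_per_1000": 1.2, "median_months": 20},
    {"state": "State B", "tpr_per_1000": 3.5, "median_months": 11},
    {"state": "State C", "tpr_per_1000": 0.8, "median_months": 25},
    {"state": "State D", "tpr_per_1000": 2.9, "median_months": 22},
    {"state": "State E", "tpr_per_1000": 1.0, "median_months": 12},
]

def outliers_on_both(rows, top_n=2):
    """States in the top_n for frequency AND the top_n for speed."""
    most_frequent = sorted(rows, key=lambda r: r["tpr_per_1000"], reverse=True)[:top_n]
    quickest = sorted(rows, key=lambda r: r["median_months"])[:top_n]
    frequent_names = {r["state"] for r in most_frequent}
    # Keep the "quickest" ordering, filtered to states that are also frequent.
    return [r["state"] for r in quickest if r["state"] in frequent_names]

print(outliers_on_both(states))
```

A ranking like this only says where to look. As the talk stresses, the anecdotes and the ground truth still have to be reported out afterward.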
Hey, thanks, Agnel. Our final presenter is Zachary Cornell, a journalism senior at the University of Florida specializing in enterprise news, full-stack web development and data engineering. He will discuss his campus crime project, which allows users to explore an interactive map and database of UF crimes, along with any available court records, going back to 2016. Zachary has one class remaining before graduating, and he wants a career in data journalism. So, you know, if you want to see him afterwards... Please welcome Zachary.
Thank you. Thank you, everyone who came today. I'm not used to stage light so I can't really see any of you but I'm assuming there's people here. I I'm very excited and feel very excited and fortunate to be here today that I was asked to talk about some of the projects I've been working on. Honestly, when I chose journalism as a major. I never anticipated that I would get to be doing what I'm doing now. Just because kind of growing up since like sixth grade I've been doing computer programming but it was always kind of more of just like a hobby than a career to me and then throughout high school, I kind of became more of a writer. And so it's been really cool to learn some useful ways to kind of marry my coding skills with my reporting skills at UF. Most recently, that marriage took the form of uf crime.com Which as Randy was saying the interactive map and database of view of crimes going back to 2016. Created by myself and good friend slash fellow student slash future editor of the New York Times Isabelle Douglas, she's great. Before I kind of get into the nuts and bolts of like, what it can really do and how it was made. I wanted to first talk about why it was made, because I think it's a really good story and just kind of how journalists can respond to institutions and keeping them accountable and making data more accessible to the public. The we started with the mean the reason we created it really in the first place, even though us they have their database of their own. With you have crime data going to the past back 60 days or something that yeah 60 days. But the reason we decided to create this is because we you have removed the ability to export historical crime data specifically they like so they removed access online access to years of historical crime data. And that may seem insignificant. I mean, all they did was remove an export button. But it kind of concerned us for a couple reasons. The first one was, you know, sure. 
They weren't legally required to have that and that you could still get that data through records requests. But it begs the question, you know, why take your take the time to remove that to remove that feature, which makes it way more accessible after it was already there? You know, I understand I think as as a lot of us probably do with public records, like institutions, not going out of their way to create documents that don't already exist or that aren't required. But in this case, you know, it was the feature was already there, and it made it way more accessible and then they removed it. And so it kind of it made us wonder like, Okay, what, what's the motivation behind that? And even more broadly, is it indicative of a kind of harmful trend away from transparency and realize you know, where we're kind of, this is in the context of us realizing this in the context of a newly appointed president, who, at that time, just like, was hard, was hard to get an interview with him president SAS. Luckily, that's changed this semester, which is awesome. But yeah, you know, this is just stuff we're thinking about like how can we respond to this? And so, one of the main things is, well, that's one thing is just the issue of kind of checking that transparency and not letting it escalate. The other reason is that this data, probably the bigger reason is that this data is really useful for for students, faculty, parents to make informed decisions about safety on campus. And so if I'm a parent, or student and I see an article that you know, Fraternity Row is a hotspot for sexual assault. Or in that's even true, especially for dorms that live nearby. I'm probably going to least try not to live there at the very least, you know if I mean I'm a guy but I have six sisters who a lot of whom have gone to UF. 
I'm probably not going to take a casual nighttime stroll through Fraternity Row on a Friday night, especially when there weren't any emergency blue lights. And what's interesting about this story in particular is that you can really directly connect the dots between the crime data, the analysis, and the positive change it can have when it comes to safety and helping the public. Five years ago, they put this analysis out, and once students caught wind of it and realized that there weren't any of these emergency blue lights on Fraternity Row, they started protesting. And in the articles themselves, when they were talking to newspapers, they were citing this data specifically. They were pointing to this data. And then less than a month later, they got those emergency blue lights installed. So even on a small level, just on a campus level, that's the kind of impact this can have. That's what we were thinking about when we were creating this: we're trying to make it more transparent and also more accessible, to remove one more hurdle for reporters so they can do this kind of reporting. Now, we were actually lucky. This is a web scraping 101 technique. Even though they removed the export button, all the data was still being loaded in behind the scenes. They were still sending it from the server to the client. This is public data, it was all legal. We were able to go into developer tools, the network tab, Fetch, and all of it was just sitting there for us to take. So we grabbed that and we fed it into a database. We released it last semester, and when we put this out it was kind of a soft launch. We were trying to get it out there as soon as we could.
But we're really working on ways to incorporate some of the visualizations from the lovely people over at Flourish. That stuff was really inspiring. The way it works is we scrape the data. We bought our own web server, which is completely independent from UF, by the way. We do all this ourselves. We bought a domain. We currently use a LAMP stack. All you need to know about that is it's Linux, MySQL, PHP, kind of outdated. This semester, we're working on updating that to more modern technology, like React. But yeah, it's all fully responsive, fully automated. There's a lot you can do with it, and I actually want to show a demo right now.
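The scrape-and-store step described here (take the JSON the page still loads behind the scenes and feed it into a database) might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the field names and sample records are made up, the payload stands in for a response copied from the browser's network tab, and SQLite stands in for the MySQL database mentioned in the talk.

```python
import json
import sqlite3

# Hypothetical payload: a stand-in for the JSON response visible in the
# browser's developer tools. Field names and values are invented.
SAMPLE_RESPONSE = """
[
  {"report_number": "23-0001", "crime_type": "Theft",
   "location": "Weimer Hall", "reported_at": "2023-03-01T14:05:00"},
  {"report_number": "23-0002", "crime_type": "Burglary",
   "location": "Fraternity Row", "reported_at": "2023-03-02T23:40:00"}
]
"""


def store_incidents(conn, raw_json):
    """Parse a JSON array of incidents and upsert them by report number."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS incidents (
               report_number TEXT PRIMARY KEY,
               crime_type TEXT,
               location TEXT,
               reported_at TEXT)"""
    )
    rows = json.loads(raw_json)
    # INSERT OR REPLACE keyed on report_number means an hourly re-scrape
    # refreshes existing rows instead of duplicating them.
    conn.executemany(
        """INSERT OR REPLACE INTO incidents
           VALUES (:report_number, :crime_type, :location, :reported_at)""",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
parsed = store_incidents(conn, SAMPLE_RESPONSE)
store_incidents(conn, SAMPLE_RESPONSE)  # simulate the next scheduled scrape
count = conn.execute("SELECT COUNT(*) FROM incidents").fetchone()[0]
print(parsed, count)
```

Keying the table on the report number is a design choice made for this sketch. It keeps repeated scrapes idempotent, and the same key could be used to look up related records in another source.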
So you might be seeing this and thinking, wait, something's broken, why is there crime showing in Brazil or Greece? But that's actually totally right. Pretty much what happened is there was an international, students abroad, an international fondling case. And that's the kind of stuff that maybe you won't be able to appreciate just in a datasheet, but you can see on a map, like, what the heck is going on there. And we have a student working on that story right now. Beyond that, you can see a lot of basic information like crime type, when it happened, where it happened. And we also joined court records using the report number, so now, if you see something that catches your eye, you can super quickly pop that into the local court records and get the arrest report right away and see what happened. On top of that, we also added the ability to share links to it, if you want to be like, hey, this thing happened here, check this out, on Twitter or whatever. And this is the main part: we made it so you can export the data in multiple formats. That's why we were really in a rush to get it out, because we want to make sure you can get this data. You can search through it, you can sort it. If I want to see how much crime there is at Weimer Hall, which is the journalism college, I can look that up and I can export this. It's almost entirely automated, regularly scraping data every hour and feeding that into the map. This semester we're working on integrating it into our local student newspaper and trying to continue to improve it, add more visualizations and so on. Part of what that will look like is more advanced filtering. I'm sorry, more advanced filtering. This is kind of like a prototype I just made.
Whether it's through keyword or crime type. We want to make some kind of heat map; you'll see something similar in the Flourish presentation. We also, yeah, want more advanced charts. We really want to make it easier for you to see the impact of how this data can manifest. But aside from UF Crime, well, the reason I share that is because I think these kinds of databases can be really helpful, not just on campus, at a very small level, but even for local newspapers. I think this stuff is really important because it can save so much time instead of having to put all these records requests in. I envision this utopia. It's kind of a pipe dream. But the only people that have to go to the government for records requests are the people maintaining these online databases, and the people writing the stories don't have to worry about it. So my passion is trying to get as many public records as possible out of the government's hands and online, where anyone can get them. Something I've been doing over the summer... can I get water real quick? Thank you. Something I've been doing over the summer is maintaining what is probably one of the largest databases of public records at UF, certainly in the College of Journalism. We have voter records, vehicle registration, election data, arrests, bookings, court filings. This is the baby of Brandon Meyer and Ted Bridis. They're awesome. They teach at UF. I love them. And it's just so, so useful because it saves so much time and it really gives reporters an edge. Specifically I want to show, I'm gonna pull it up right now, the voter records. They're probably the most popular database that people use, because you can get contact information from there and you can see voter history. Probably the most famous Florida resident is Donald Trump.
One moment. And, you know, he's smart enough not to include his email and phone number. But a lot of times, you know, there was a story I covered last semester of a UF football player who, we didn't know at the time, died from a gunshot. We found out later it was a self-inflicted gunshot. But we were calling around trying to talk to family, talk to people who were his roommates, and we were able to right away put in his name, put in his family name, and find people who might know what was going on. The other thing that I noticed too is that he seems to be okay with voting by mail. Yeah, there's a lot you can get from that. The other thing you can do with this is put in specific addresses. So with the UF football player, if I just want to see, okay, I put in my old address, I just want to see everyone that lives on this street, I can do that. Or you can actually put in Cardinal Street. Conference Wi-Fi might be annoying, but I can just see everyone who lives on a street. I can talk to all of them. I can talk to neighbors, I can talk to roommates, and that's how we can get in contact with some of these people. I can't cover all these databases today, but they're just so useful. One of the main ones we use a lot, too: you can look up license plates, and that was one of the things we used to talk to one of the family members for that story. The arrest database: we have student journalists who are looking at this every day in the breaking news class, just looking for any kind of outliers, any interesting stories. And we also have Alachua County criminal cases. This is so useful.
You can just look up, again I don't know if conference Wi-Fi is going to let me, but something that's really useful is you can look up attorneys, like high-profile attorneys who generally work with high-profile clients, and they can be a gateway into some really interesting stories. Something else is we also track local government agencies and schools, and specifically we'll scan them for names that match representatives, students, or teachers. If we see a name that's recognizable, we can start working on a story like that. I won't spend too much more time, but I really believe in this stuff. I do worry sometimes about the direction, as local newspapers have kind of struggled to compete. I think this is just one of the ways where we can make it so they can still compete, I don't know, with mainstream newspapers, because it just saves so much time and money, and gets it out of the hands of the government, to where everyone can get it. And so I really enjoy doing this stuff. This is actually my voter record. So yeah, if you have any questions, you can reach me there. And yeah, I mean, honestly I love working on these projects. I love what I do. So if it's an interesting enough project, I'd even be willing to help for free, like I really just enjoy doing it. So yeah, interesting.
I really like it. So thank you so much. You have any questions, please reach out. Thank you so much.
Exactly. So we do have some time for questions. I have the obligatory question, which is: are you guys using AI currently, or do you plan to use AI in any of your data analysis?
I actually think that AI can be really useful, especially in terms of like cleaning data. I don't trust AI as far as I can throw it in terms of like interpreting things out of data. That is just my opinion is that, like the most public facing AI that we see is built to sound correct and not be correct. And so I always sort of urge caution there, but then it does a whole lot of things and there's a whole lot of potential. So I think the answer to that is going to be yes. Just how we use it.
I think on the coding front, too, there's a lot of potential. I know somebody on our staff uses, I think it's Copilot. I forget what the tool is exactly. It basically generates the code for him and he can, you know, query it to do that. So for folks who might have a little bit higher barrier for entry for, thank you, whoever turned that off, who may have a higher barrier for entry for coding, it is a good way in. That being said, there are often times where it'll spit out some code that doesn't work, and if you don't know what you're doing, you might not be able to troubleshoot it very well. So, you know, it's hard to say exactly how this is going to evolve. But for now, it's definitely not like a kid you can leave at home by itself. You definitely have to, you know, really kind of parent it along a bit and guide it.
Oh, definitely, um, when it comes to AI, definitely Google Pinpoint, if you're not using that. Super, super helpful. You can just throw a bunch of PDFs in there and you can keyword search them, so definitely use that. It's really useful.
Thank you. Question here.
Thank you. Can you is this working? Okay, um, anybody can answer this but Steven, something you said had kind of sparked this question about collaboration between writers and people on the data side. Do you have any tips about like, what that really how that relationship can work very well. Maybe even an anecdote about working with a writer that you could share?
Yeah, I mean, I think the number one rule is: I am also a journalist. I think there are sort of antiquated views within newsrooms on what makes a reporter, what makes a journalist. I know that the folks at this conference sort of end up falling on my side of this, but, you know, treat me like a partner. I am here to help. I can do reporting, I can do all of these things. And I don't love being brought in late to projects, because I can't do good work. So bring me in when you're starting. And, like, here's the thing: I feel like I've gained a reputation as a Doctor No. I am the guy that goes out there and is like, this is not possible. And, you know, I think some reporters don't like that, but they also like not being wrong. So I think that having a candid relationship matters, and I try to have candid relationships with the people that I do this work with. I work best with the people who see me as a partner in this and, you know, put me on the byline, do things like that. So yeah, I think that the relationships work when we're partners in reporting, because my data can inform your reporting and your reporting, kind of where my data
is here. Or there. I'm actually not a journalist. I'm a communications director for a nonprofit organization. I'm here to understand how to talk to all of you journalists. We produce a lot of our own data, especially child welfare and education data. How do you, as data journalists, view the data that comes out of nonprofit organizations or, like, a third-party source?
I think it's very context specific. I know a lot of reporters, even in our own newsroom, will often start with those datasets because they're much, much more easily accessible. And there's a fine line to strike between basically providing what is essentially government data, or some sort of other data, with some sort of value-add like analysis, and keeping it granular enough so that we can do our own kind of vetting of the information. So I think it's very context specific. Obviously, the more original the work is that you all are doing, the more we'd have to rely on that data. If you're the only ones collecting it, if it's not actually public data, and it is, you know, representative or comprehensive in some form, that would definitely be an avenue for that. Typically, because there's never been a ProPublica story that's taken less than two months, if it is a public dataset, even if we start with a nonprofit's slice of it, we usually end up going to the raw stuff. But, like, I'm working on a story right now from the Environmental Working Group, where they, you know, collect the tap water database, and the value-add that they had there was they did all the state-level records requests. That's 50 different records requests that I don't think we have the time or inclination to do. So that's an example of where we would use that kind of data.
The two questions that I would ask if I'm getting into nonprofit data, or data that's not collected by the government, although I do ask these of the government as well, are: how was the data collected, and why was the data collected? Because a lot of nonprofits collect data to make a point, even if that point is not true, so they may leave things out. They may do all of that. So I always approach it skeptically and I immerse myself in the methodology of those things, because, I mean, good data can come from anywhere. So, you know, I like to focus on the methodology.
Now just be prepared for the skepticism and I think just take it like head on also.
Second, oh, I was just gonna say, I mean, kind of going back to some of the stuff they said earlier about interviewing the data. Just like, you know, if you're covering some kind of political issue, you're gonna interview both sides of that. Same with data you get from nonprofits. If you're gonna get data from one nonprofit that's on one side of the issue, get data from another nonprofit that's on the other side of the issue and see how they compare. Just make sure you're being mindful of that stuff when you're sourcing data.
So, you alluded to this in your last answer, but I'm really interested in how long each of those projects you presented took. And, you know, how do you get newsroom buy in? For something that might not show anything for possibly a few months?
I'm not well positioned to answer the second part of that question. I will tell you how long it took. We started reporting on this in the summer of 2021, and we published our first story in October 2022. So more than a year. It wasn't a concerted effort the whole time. I would say we were really, really reporting and working on it for maybe six, seven, eight months, something like that, through the end of the year. So yeah, it was multi-month.
I mean, technically my project took two years, but that was because I filed the FOIAs and then just let them sit for a while, which is how I do a lot of projects. I get interested in something and know that the FOIA data is not going to come in quickly. So I sort of let them sit and then they come back. So probably 14 months after I filed was when I actually started the project in earnest, so it took about 10 months. But I also don't work on one thing at a time. I work on a jillion things at a time. So it didn't take 10 months of my time, full time. In terms of getting buy-in for things, I mean, a lot of what I do is a lot of pre-reporting, so that after like two weeks of reporting, we can give a minimum and maximum story based on what we have. Because if there's not a good minimum story there, our editors don't want us to spend all the time in the world on this, because we could go down the path of spending several months and end up nowhere. So we sort of make that calculus, and if there's a good minimum story, we'll pursue it. But, you know, I also am the person working on the dumbest long-term things that you can think of. Like, I'm in year nine of our project on fatal police shootings. I'm in year eight of a project on opioids. You know, you chunk it away and you make it a big project by doing things slowly and over a long period of time.
Yeah, I mean, they're gonna have a lot more experience to speak from, a lot more wisdom there. But specifically with the projects I talked about, the UF Crime website, I mean, I pretty much took it like a class. Not officially, but I approached it like that, because one of the main reasons I wanted to do it was also to get better at back-end development, and really just build something from the ground up, install all the libraries and go through all that. So that probably took a little under three months, while juggling other classes and work and stuff like that. But yeah. Hi,
I started one of my investigations with the approach of big to small, so I analyzed like 1,200 cases in Egypt, and we have done a lot of work on that database. But when it comes to working on the ground, I didn't get much information, because victims are afraid to talk to journalists because of the political situation in Egypt. So at what point do you decide to move forward in your investigation and give it more time?
I think ultimately, there could be a story just on the analysis, depending on what your bar is. I think the main thing is making sure that your analysis is accurate, right? That there's not some component of this that's missing. And a big part of the beauty of the journalistic process when it comes to data reporting is that you're forced to do that by finding the people in the field that were actually affected, right, and seeing what that actually means on the ground. I'd say keep plugging away at it. I think I've been lucky and or blessed in my career to work with reporters who are fairly well sourced in the communities that we're discussing. So, like, finding the different levers you can pull, whether that's local groups that might have connections or, you know, in that case, since you have a compelling reason, potentially offering confidentiality to some extent to at least get your foot in the door and understand what's going on. I think there are ways into that. But the main thing is, at the end of the day, we don't want to put anything out that isn't true, right? So figure out how to meet that bar. And if you can, then there might be different versions of the story you can do. If you can't, then it might be a little bit tougher to figure out a path forward.
So if you have, like, you read the files and the cases and you have stories in the files, can you use those stories instead of real people or not?
What kind of files are these, or cases?
It's like the Supreme Court in Egypt. So it's the kind of thing where judges use some articles to violate the law to make the sentences shorter.
Yeah, I mean, certainly in a lot of investigative work you can use primary source type documents like that to quote from. I think usually you'd still want to talk to the people, or at least make an effort to talk to the people behind it, but it's not always possible, obviously. But yeah, you can theoretically, depending on how your story is set up.
Thank you. Well, that concludes the first part of our workshop. Please join me in thanking the panel this afternoon.
So we now turn to the hands-on portion of the workshop with Flourish, the world's leading data visualization and storytelling tool. Flourish is part of the Canva family and features a vast library of templates, empowering users to create interactive charts and unique visualizations that are easy to embed and require no coding. Today's presenters from Flourish are Luisa Bider and María Fernanda Callejón. Sorry, let me try one more time. Callejón. All right. Luisa is a product manager at Flourish, where she drives the development of new visualization templates and features. She works closely with journalists and organizations to create compelling data narratives. Maria is a data visualization specialist at Flourish. Her work focuses on teaching data visualization and best practices to broad audiences through different formats. She has worked in newsrooms in Venezuela, the US and Spain. We'll take a quick minute to make the transition and then we'll start the second part.
all the music's back
It switch to the other. Yeah. Nice and then I need to go make it maximized.
Note that's for center view. So we actually need to drag it backwards.
A Oh my god. A little bit higher. Higher up there. Yeah. Okay. We're sending out some errors. You see where I put them
all right. Where are the presenter notes
Okay. I'll let you start. Okay. Hello, everybody. If you're wondering why we're not standing up, it's actually because I am very short and that podium eats me up whole. So we're gonna be sitting down here. Sorry,
Hold the mic close.
Sure. Is this better? Yeah, okay. Amazing. Well, welcome everybody to our session, Beyond Words: Mastering Data Visualization for Impactful Reporting. We're going to try to make this as fluid a continuation as possible of the first half of the session. So we're going to be more focused on data vis specifically and, more importantly, why you may want to include it in your reporting, whether you're working in a newsroom, as a freelancer, or at an independent organization: why charts are important and how they can help you tell your story better. So, yeah, well, hello, everybody. Glad to see you here today. My name is María Fernanda Callejón. I'm a data vis specialist at Flourish. Randy already introduced this, but both Luisa and I have a background in journalism. So we're not only talking from a product perspective or an industry perspective, but also from experience working with charts. I started with infographics and data visualization around 2018 and it was my passion. So I really do believe in the power of charts and visuals to communicate.
And I'm Luisa. I work as a product manager at Flourish. And yeah, as Maria mentioned, I also have a background in journalism and studied data and digital journalism in my master's, which led me to work at Flourish eventually, via a couple of newsrooms. And yeah, super excited to be here and to be talking about the power of data vis and data storytelling today. Before we dive in, a few housekeeping bits. If you already attended our Canva session earlier today, you might have seen this slide already. But basically, we would love it if you could keep your questions until the end. We will be around for the rest of the conference. You can find us at the juice stand out there. And we also have a fun quiz that you can do while at the juice stand, around which chart type you are, which chart type matches your personality. So come find out and tell us what you are. We'd love to see. So we thought we would start by taking a look at the attendees of this conference, visually, through the power of Flourish. We got some data from ONA around the attendees, based on the survey that you all filled out when signing up. And we have over 1,600 people here from over 30 countries, like Canada and the UK, which would be us. Obviously, most of you are here from the US, as we can see here. And if we dive into the US in a bit more depth, we can see that it's mostly the East Coast that's represented, but also California.
Yeah, and we have a lot of first timers, but actually most of you are ONA veterans who've been here more than once, or even 17 times, and amongst you, you amass over 1,000 years of experience. So there's a wealth of knowledge to be tapped after this session. So do connect. Also chat to us. We're super happy to connect after this session. But without further ado, it's time to get started with some facts. You're all journalists. We're all journalists. So we enjoy this, and I thought it would be a nice way to get us started. Fact number one is that we are visual creatures, and what I mean by this is that the way in which we consume the world is predominantly visual. 65% of the general population are visual learners. This is based on a study, and that just means that we process information best through visual stimuli: images, sometimes text with, you know, different hierarchies, charts as well. More specifically, we process information over 60,000 times faster via images than via text. And this is just a setup to the idea that we're living in a visual economy. Again, the way in which we experience the world, the things that we produce are mostly visual, the things that we consume are mostly visual. And this also means that we communicate effectively, predominantly in a visual manner. So we are not only in a visual economy but also in a visual culture.
Fact number two that we brought along with us today is that we are surrounded by data: everything from the weather on your apps, to the steps that are being counted by your Apple Watch, to your social media likes, but also the number of conference attendees or numbers on police officers, like we heard about earlier. Data is everywhere. To give you a bit of an idea of how much data is everywhere, it is actually 175 zettabytes' worth of information that we will see by 2025, which is a lot. One zettabyte is a trillion gigabytes, so it's kind of hard to put that into perspective. But if you tried, you could imagine it would take almost 2 billion years to download it all. So lots of data. Obviously not all of it is worth a story, but there's a lot to get stuck into for you as journalists, so a massive opportunity for journalism. But it's important to understand how data can be used. We see it as twofold: as a source and as the story itself. You can either use data to back up an existing story, kind of go beyond traditional reporting methods, enrich it, add some more context. Or data can also be the story itself, where you would not have been able to find the story if you didn't have access to a large dataset, and you can actually only spot those trends by having access to that data.
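The "almost 2 billion years" figure can be checked with a few lines of arithmetic. This is a rough sketch that assumes a 25 megabit-per-second connection. That speed is not stated in the talk and is chosen here only because it reproduces the quoted figure.

```python
# One zettabyte is a trillion gigabytes: 10**12 * 10**9 bytes = 10**21 bytes.
ZETTABYTE_IN_BYTES = 10**21
total_bytes = 175 * ZETTABYTE_IN_BYTES

# Assumption (not from the talk): a steady 25 Mbit/s download speed.
ASSUMED_SPEED_BITS_PER_SEC = 25 * 10**6
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

seconds = total_bytes * 8 / ASSUMED_SPEED_BITS_PER_SEC
years = seconds / SECONDS_PER_YEAR
print(f"{years / 1e9:.2f} billion years")
```

Under that assumption the result comes out at a little under 1.8 billion years, which is consistent with the figure quoted in the session. A faster or slower assumed connection would scale the answer proportionally.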
And as Luisa mentioned, with the abundance of data that there is and those two approaches, there are also some questions to ask when you're working with data. At this point, and with the first half of the session, I assume that you are all going to start working with data, if you're not already doing that. We've split it into four main questions. The first question that you should ask yourself is: how can you use this data? Right? How can this data actually serve you? Then, how can you put it into context? Information for that matter, not just data, any information without context really is not worth that much. You really need to make sure that you make it understandable for others, that you actually focus on it from a specific angle and make sure that it makes sense for other people. Then, how can you make it actionable? Meaning, how can you make sure that the data you're putting out into the world, or the data you're working with, ends up having some value, some return, to make it useful in any way? And then, how can you make it engaging? We can be talking hours and hours about charts and data journalism, but if we don't make it appealing to other people, to the general population, if we don't make sure that the data we're presenting is actually interesting for others, we're really failing at communicating with others, and that's really what we're here to do. There are many ways to answer these four questions, but the angle that we're gonna focus on today is data visualization and data storytelling. So, giving meaning to data through data visualization. What data visualization is, this is like the textbook definition: it's just the graphical representation of information and data. The visuals help us identify patterns and outliers, and share them with others. They're both communication and exploration tools, and I'm going to explain this further down, but keep that idea in mind.
Now, data vis is also a tool, because it allows you to achieve a specific result. It is a skill, because it can be taught, it can be learned. You have to practice it. It can get rusty if you don't do it, you can forget it. Sometimes you can pick it back up, but you can build upon it and you can actually learn it. It's a language, because even if you don't work with charts right now, or if you see a chart and you're very confused by it, they're not arbitrary. The rules that govern charts and the basis of data visualization are actually quite stiff in a way. So once you learn them, you're able to produce charts over and over again. It's just like a language, like the grammar. And my favorite one is that data vis is also a superpower, because it allows you to visualize your data or see your data from a different perspective, to understand issues differently. I think it was Steven who mentioned this in his part of the session, but not a lot of people are doing visualization, right? So it's actually something that can differentiate you as a reporter, as a journalist. It can be the X factor that separates you from other people in your field. If you're still not convinced, we also have more information around why you may want to use charts in your reporting.
One of those reasons is high data density. A single chart can encapsulate vast amounts of information. You've probably heard the saying "a picture is worth a thousand words." Well, in the world of data vis, it's probably more like a million. Quick question, or quick show of hands: how many of you are familiar with Hans Rosling? Because I want to show him as an example. One, two, three. Okay, a couple, or more than a couple. We are, and we're big fans, and we encourage you to check him out. He's basically a doctor and professor from Sweden who's really passionate about data and the use of data and data vis to talk about development issues and economic progress. And he popularized an animated scatterplot super similar to this one; we just recreated it using Flourish. What this chart shows is countries' GDP plotted against their life expectancy over a number of years. The countries are shaded by region and they are sized by population. And what this chart tells us is that when countries' economies improved, so did their life expectancy, and we can see this beautiful trend of all the countries in the world heading towards the top right corner. What's so brilliant about this chart, and why I think it's a great example of that first point of high data density, is when you think about how much data is going into this one visualization that we are looking at in the 30 seconds that it's playing. It's quite insane. It's life expectancy over loads of years. It's GDP, it's population, it's region. It's so much information that trying to distill it into an article would be quite difficult, and if you gave people a spreadsheet of this information, it would be quite difficult to grasp the story as well.
The second reason why you may want to use charts in your reporting is because they actually drive engagement up. Some of you will be very excited to actually have a figure around the topic. According to the book The Infographic: A History of Data Graphics in News and Communications by Murray Dick, a great read, news stories that use infographics have 30 times more page views than stories that don't. This book was published in 2020, if I'm not mistaken, and I believe the research that he's citing here is actually a bit earlier than that. So after 2020, with COVID, we got a surge of charts on the front pages. We all were dealing with, you know, flattening the curve, or ascending curves and logarithmic scales, so we were all much more exposed to charts than ever before. So one can only think that this figure is much higher now than it was back then. And in a world with shorter attention spans, I believe we have a short attention span like a goldfish, it's really important to know how to not only catch people's attention but retain it. So it's good to know that charts are a really strong mechanism to achieve this.
The third reason to incorporate charts or data visualization in your reporting is that they appeal to our visual nature. As we already touched on, images and graphics are often quicker to be understood, and they can both inform and captivate your audiences. So again, quick example: here's a data set. We can maybe discern some trends, or not really quite trends; we can compare individual numbers between countries, between years. We can't do too much at once. We can't see any bigger picture trends. If we take this and put it into a visualization, we can see how this speaks to that visual wiring of our brain. We can see an actual story taking place. We can use color and shape and legends and text to inform that story, to guide our audience and help them take away the key message of our chart, which in this case is that five countries received almost half of the world's asylum requests in 2022. Something that would have been really difficult to see if we just had the table, at first glance at least.
And the fourth reason why you may want to use charts in your reporting, and include them in your practice, is that charts have authority. Now this is going to be probably the most technical bit of the whole talk, so bear with us for a sec. In 2017, Nyhan and Reifler conducted a study in which they were trying to understand whether people could change their beliefs once they were based on misperceptions. So I'm not talking about subjective beliefs, but rather when somebody actively believes something that is factually mistaken, for instance, that climate change is false. Throughout their studies, they researched different mechanisms to deliver correct information, so that if somebody believes something that's false, you try to convince them of the reality by delivering the factual information. They tested text and charts, and they found out that delivering correct information in graphical form successfully decreases reported misperceptions, and that a graph reduces misperceptions more than equivalent text. So in plain English, charts are actually more effective than text in changing people's minds, but also in correcting misinformation effectively, which is something incredibly important to take into consideration going into an electoral year, but also living in an age of misinformation and fake news and, you know, all the deepfakes that we're all experiencing. So understanding how you can actually change people's minds for the better is a really powerful mechanism. And this is one of the charts that they included in their study. I'm going to break this down very quickly, but I think it's a really, really powerful example, so bear with me for a sec. In their third study, they delivered the statement "global temperatures have decreased or stayed the same over time." They tested this with self-identified Republicans here in the US.
And in the chart to the left, you have the group that identified as not strong GOP, so maybe more on the light side of Republicans, and then to the right, more strongly identified Republicans. The control group is the people that didn't get corrected information, so they only got the statement. The higher the bar, the higher the proportion of participants that believed the statement or agreed with the statement, and, I can just quickly use my cursor, you can see how these bars are just higher over here. Now the second group is the people that actually got the information corrected by a text: information and data that actually showed that temperatures have been increasing over time. We all know that climate change, hopefully, is not something to be contested but to be understood at this point. And you can see how those bars are actually shorter than the previous ones, which means that there was some sort of correction in the beliefs, but not that much. And the third group is the people that got the corrected information in chart form. Those are the shortest bars of all the groups, meaning that these are the people that actually changed their minds or disagreed with the statement the most. So again, this is just the synthesis of that idea that charts are really powerful mechanisms to influence people, to change people's minds, and to deliver facts to the general population. Now, up until now, we've only discussed why charts are great and all that jazz, but now it's a question of how they can actually help you tell your story. In a nutshell, data visualization allows you to transform your data, really complex spreadsheets, into insightful visualizations like this one on the screen. They just allow you to really distill all that information into a mechanism that is very engaging for general audiences. More so than that,
charts allow you to explore your data, because you're able to simply plot it and understand what it's actually telling you, right, what those numbers are effectively telling you; then to discover patterns that you may not be able to see from the get-go, as Lisa mentioned; and lastly, to communicate. Charts are really effective ways to showcase vast amounts of data to people. But communication is not always the end goal, meaning that you're not always going to showcase your charts to others. Much like in other areas of journalism, you will have to kill your darlings in data visualization too. You're going to work with a bunch of charts, you're going to create a bunch of charts, hopefully, and you're going to have to discard lots of them in favor of the one that actually conveys the most important insight. I think you could see it in the first half of the session: there weren't really that many visualizations, but the ones that were used were the most compelling ones, the ones delivering the main message. And when you start working with charts in the newsroom, hopefully this is what it's going to look like, roughly. You're going to find some data that you find interesting. You're going to want to analyze it to understand it better, and then visualization is going to come in like any other tool in your toolbox, right? It's just another mechanism to do your reporting, to understand your interviews or data. Maybe you plot it into a chart to understand it better, to see trends over time. And in that process of visualizing, you may find a different angle, you may find one of those hidden patterns, something you were not able to see before, that interests you. Maybe you pursue that angle further or maybe you discard that angle; that really depends on you, your story and what you're doing. But in the end, it's just rinse and repeat. This cycle can go on and on and on until you stop it.
And the point is that in the end you just have to learn how to follow all these steps, create your charts, and learn when it's a good time to showcase them to the world. Now, hopefully the question about whether you should use charts in the newsroom is no longer an if but a when. So you're asking yourself, when can you get started?
And we hope that the answer is now; you want to get started straight away. Before you get started, though, let's take a quick walk down memory lane, back to the days when design, and also data vis, was complicated and hard. Tools did not really speak to one another: you used one for photo editing, one for image processing, one for vector design, etc. Data visualization was no different. It had high entry barriers, and you had to be a bit of a coding wizard to create anything worth showing or understanding. Thankfully, those days are behind us, because we bring to you, if it loads, Canva and Flourish. We've lowered the entry barriers to empower anyone to get started with data vis and data storytelling in just a few clicks, making it super easy for anyone to create visualizations with their data. Getting started with Flourish is super easy. You can select one of our templates from our template chooser. We have hundreds of different starting points across about 30 different visualization templates. You can upload your data and customize the look and feel of your chart. And then you can publish it and embed it on your website, or in your CMS, or in a Canva design. Our mission is to democratize design and data storytelling. And through that we have risen to become the world's leading data storytelling platform, empowering 1.5 million users that have created more than 15 million visualizations that have amassed over 32 billion views, which is largely thanks to organizations like yours, newsrooms that have large viewerships coming and engaging and looking at those visualizations. Flourish is designed to be incredibly versatile, so that we have something for everyone, from a complete novice to a more advanced user. And I wanted to demonstrate this with an example of a newsroom that we work with at Flourish. The example is Tortoise. They're a UK-based news organization. They focus on slow news, so the information behind the main headlines.
And they became a Flourish customer a couple of years ago, and I really like the way that they use Flourish, because they use it for very basic day-to-day things, for some a bit more advanced things, and then for some super advanced things, and that really showcases how broad our offering is as a tool. So on the no-code end of the spectrum, we have their usage where they create daily charts for their homepage, and most days you'll be able to find one. Here's an example where they used our area chart template to visualize the number of refugees fleeing Sudan. This chart could literally be created in five minutes or less. They click on a starting point and upload their data. Because they're on a business plan, it already comes in their colors, fonts, etc. They hit publish, they're done. This is something that their team can do quite easily and quickly. Then on the bit more involved part of the spectrum, we have this low-code example where they created this in-depth scrollytelling piece, where they are visualizing the results of their Global AI Index, which they collect data on themselves. They're explaining the results of that through this really engaging piece where, as you scroll through, you get parts of the charts explained to you and you can contextualize them. It's like someone's telling you what to look at and how to understand it. Scrollytelling is also something that we offer as a tool, but it can be a bit more involved, and this whole piece is a bit more of a production. And then on the most advanced end of the spectrum, we have projects like the Westminster Accounts, which is where Tortoise teamed up with Sky News to collate, analyze, and put together data from various sources to try and answer the question: where is the money coming from in British politics? That was splintered across loads of different data sources, so they had a huge amount of work on the data side.
And then when it came to visualizing it, they really wanted to have a visualization type that was specific to this story. So they designed and coded that and plugged it into a Flourish template, which allowed them to repurpose it. So we also have a software development kit: if you do have developers, they can actually build specific templates for your newsroom. To sum that up, here's a nice quote from Katie, who's Tortoise's data editor: "Flourish is an integral part of how we present our journalism to the world. Everything we do, from simple everyday charts and interactive maps to completely custom visual storytelling pieces, would be a lot harder without it." And Tortoise is just one of many, many newsrooms around the world using Flourish. Here are a couple more logos, and I would love to hear from some of you if you've had any experience with Flourish. So I hope maybe we can continue that conversation at the juice bar.
Okay, Colin, can we... Yeah. Now it is time to go into the more hands-on part of the session. You may have heard the saying "show, don't tell." So let's actually bring some data to life right now. Now, this is a quick recap. Zachary very kindly lent us his data from the UF crime project. He just sent me the Excel file with the raw data, and we're going to visualize it live on the stage right now. So as a very, very quick recap: this is crimes reported at the University of Florida campus between 2016 and 2023. There's some partial data right there, but nothing to worry about too much right now. And some of the variables that we're going to visualize or see in our data are the type of crime, the location of the crime and the date. So again, and you can tell I was actually looking for the Google Pinpoint tab from his session, so I'll take a look at that later, the goal of this part of the session is not so much to go in depth in the analysis. That is something that you as reporters will have to do, and that is not what we're here to show you, but rather how you can bring that data to life using Flourish. So first things first: welcome to Flourish, if you've never checked it out. It's the name of our tool, the data visualization tool from Canva. And you can access it by going to flourish.studio online. I'm simply going to sign in. I'm already signed into my account, or I should be. Well done. And this is going to be your projects page. This is where all your charts are going to live when you start visualizing. If you are new and just creating an account, this is probably going to be empty. In my case, I just have a bunch of folders and a bunch of projects right here. I'm just going to go to my UFL folder right here for this particular demo, and I'm going to hit New visualization. A bit of a spoiler there with the charts. But now that takes us to the template chooser.
So the template chooser is the beating heart of Flourish, because this is where all of our starting points exist. This is all the possible chart types that you can create with our tool, and really it's just as simple as clicking on one of the starting points and plugging in your data. So I'm just going to quickly show you all the possibilities that we have. Over here we have everything from electoral charts, sports templates, pictograms, different options for maps, racing line charts, racing bar charts, and so on and so forth. Now, just for a second, back to my deck. We asked ourselves questions, let me just quickly hide this, yeah, there we go, questions to be able to know what we wanted to visualize. So question number one was: what is the most common crime type on campus? And this is what we call a magnitude question, because basically what we want to know is, in the data set, how many crimes were there of type A, type B, type C, and then which one was the predominant one? So after running some analysis, we just got a quick list of the top crimes right here. So when going into Flourish, I can just take a look at this. And of course, once you're working, as I said, data visualization is a skill, it's a muscle, you work it, and this process is going to be super quick in the future. But once I've looked at my data, and I know what kind of question mine is, it's a magnitude question, I know that I want to visualize this in a bar chart format. So back in Flourish, I just go in and select the starting point that I want; in my case it's a bar chart. And you're first presented, if the Wi-Fi allows, there we go, with a visualization that has preloaded data. We always do this so that anybody creating a chart from scratch is able to see what the final product roughly looks like.
Obviously, with your data, and with my data right now, it's going to look a bit different, but this should give you a good guideline of what you're going to end up with. So very quickly, I'm just going to go to my data tab, and this is where the data is being pulled from. I can either upload a CSV, an Excel file, anything, by clicking on this button, or I can just copy and paste it from my Google Sheet, and that is exactly what I'm going to do. I'm just going to grab all of this, Command-C, and paste it right here. Clear it and paste. And that's it. And here in my little preview window, in a second, again, the Wi-Fi is struggling a little bit, there we go, that has changed from what it used to have, and my data is now reflected. And here we have our data bindings. This is simply how the data is communicating with the different settings of the tool. It speaks to how it's going to interpret each of the columns and rows of my data set and how it's going to plot them into the chart. So if I go to my preview, I already have a bar chart that's sorted from greatest to smallest in this case, and we can see how larceny is the most prominent crime type on campus. Now, this is a very rough graph, but with a few styling decisions that I'm going to apply right now, we can make that a bit better. I'm going to hide this legend because we don't need it. I'm going to quickly make some tweaks to the axes. And again, because I've run this demo like a thousand times, my brain already knows what I want. This might take you a bit longer the first time you're charting with it, but it's just to show how quick this process really is. And now this is looking okay. But if you were in a news organization with Flourish, you would be able to apply your own theme. It actually would be applied automatically to your charts. For this one, I just created a specific theme for this particular chart.
We can see how that's pulling in my colors, my font and my logo. And I can just do a little bit more styling. I can do some very, very quick editing here just to take this chart even further. And this is all going to make sense in a second. Like that. I made a couple of edits, and now we have a much clearer story, right? I'm highlighting the main bar, which is the topic of the story I'm trying to tell. I have everything else grayed out, so it's not too overpowering. I can add labels to my data points, so it's a bit more legible, easier on the eyes for my readers. And lastly, I can add a quick header that says "Larceny most common crime type on campus." And I can turn on one of our nice features that highlights legend elements, not labels but legend elements, to match the same color in my data set. Now, I did this very quickly. I appreciate that a lot of you are not going to be able to get the same result in as little time, but in the interest of time I had to. This is just to show how quickly and how easily you can actually do this in your own time for your own projects. The last thing I need to do is simply add a name to this. I'm just going to call it "demo." And I just hit Export and publish, and I'm ready to go. Once I hit publish, I can share, once it loads, there we go, the public URL, which has the same look and feel as my chart. This is a link that you can share with anybody you're interested in showing your chart to. But I also have my embed code, and this is what you want to put in your CMS, on your website, wherever you want. Now, even better is the fact that Flourish is seamlessly embedded in Canva. Once you're in your Canva presentation, or even your Canva website, all you have to do is go to the apps and type "Flourish." You select the Flourish app, and this is automatically going to be connected to your account. And once it loads, you're going to see all your projects right here. And here I can see my demo.
So I click on the demo, and that is now embedded, there we go, into Canva. And it keeps all the interactivity. Again, fully flexible: I can resize it to match the size of my screen or my dimensions, and if I double-click on it, I have all the interactivity that I had in Flourish right in a Canva presentation. Super easy. And we're moving on to question number two.
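For anyone who wants to reproduce the "quick list of the top crimes" step before pasting into a charting tool, here is a minimal Python sketch of a magnitude question. It is not the analysis used in the demo: the column name `crime_type` and the sample rows are made up for illustration, since the talk does not show the real column names.

```python
import csv
import io
from collections import Counter

# Hypothetical rows: the talk does not give the real column names or values.
reports = [
    {"crime_type": "Larceny"},
    {"crime_type": "Fraud"},
    {"crime_type": "Larceny"},
    {"crime_type": "Narcotics violation"},
    {"crime_type": "Larceny"},
    {"crime_type": "Fraud"},
]


def count_by_type(rows, key="crime_type"):
    """Return (crime type, count) pairs, most common first."""
    return Counter(row[key] for row in rows).most_common()


def to_csv_text(pairs):
    """Format the pairs as a two-column CSV string ready to paste or upload."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Crime type", "Reports"])
    writer.writerows(pairs)
    return buffer.getvalue()


ranked = count_by_type(reports)
print(to_csv_text(ranked))
```

The output is one row per crime type, already sorted from greatest to smallest, which is the shape a bar chart needs.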
Nice, thank you, my friend. That was fantastic. So as the next question, we asked ourselves: are there any trends over time visible in this crime data? We decided to visualize it as a line chart or an area chart, because that's a classic chart type choice for time series data; it allows us to see the shape, whether something has been going up or down. We decided to focus on larcenies again, because in our previous chart those were the highest reported crime type. And different to the chart that we saw before, which said larcenies are the most reported crime type, we can actually see a different angle in this chart. It's a slightly different story, which says larcenies may be the most reported crime type, but actually, they're going down. So this is just to demonstrate how, depending on the chart type choice, you might be telling a completely different story. So it always depends on what your story is; choose your chart type based on that. And we'll be sharing some resources on how to make those decisions at the end of the session. But to break this down even more, we decided to animate in some other crime types, some other categories, just to see how those compare to larceny, so that we can contextualize that a bit and have a bit of an understanding. How does something like fraud compare? And has that been going up or down? It's a bit hard to see these other lines, though, because they are all quite small in magnitude compared to larceny. So in this next view, we're actually looking at a percentage, a stacked percentage area chart, which lets us see the makeup of the crime types as part of a whole. So again, depending on the story that you're trying to tell, you might choose a slightly different chart type. In this one, we can see the percentage of larcenies has also been going down, and other crimes are actually taking up more of the current crime happening compared to last year, for example.
Another story you might want to tell is on each crime type individually, so we could break this out into a grid of charts. We adjusted the y-axis here to be unique to each of the series, so that we can actually see the trend in the individual category. So larceny is going down; narcotics violations are actually going up. Something that was difficult to see in this view, because narcotics violations are just quite small. And then we might decide to highlight those that are going up and slightly fade out those that are going down. So as you can see, five completely different charts with exactly the same data, five different angles, five different stories. It completely depends on what you want to tell. And the way that we combine these, by the way, is with a Flourish story. We offer this feature so that you can stitch together different views of a visualization, or different visualizations altogether, and walk through them step by step, or also turn them into one of those scrollytelling pieces that we saw. So that was question number two. We have one more question, I believe, to get through.
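The difference between the line chart view (raw counts) and the stacked percentage view (share of the whole) comes down to a small reshaping step. Here is a minimal Python sketch of that step, assuming the data has already been reduced to (year, crime type) pairs; the years, types and numbers below are invented for illustration and are not the UF figures.

```python
from collections import defaultdict

# Hypothetical (year, crime type) pairs, one per reported crime.
reports = [
    (2021, "Larceny"), (2021, "Larceny"), (2021, "Larceny"), (2021, "Fraud"),
    (2022, "Larceny"), (2022, "Larceny"), (2022, "Fraud"), (2022, "Fraud"),
]


def yearly_counts(rows):
    """Count reports per crime type within each year (line or area chart)."""
    counts = defaultdict(lambda: defaultdict(int))
    for year, crime_type in rows:
        counts[year][crime_type] += 1
    return {year: dict(by_type) for year, by_type in counts.items()}


def yearly_shares(counts):
    """Convert counts to percentages of each year's total (stacked % chart)."""
    shares = {}
    for year, by_type in counts.items():
        total = sum(by_type.values())
        shares[year] = {t: 100 * n / total for t, n in by_type.items()}
    return shares


counts = yearly_counts(reports)
shares = yearly_shares(counts)
for year in sorted(shares):
    print(year, counts[year], shares[year])
```

In this made-up sample the larceny count falls from 3 to 2 while its share falls from 75% to 50%, so both views agree; with real data the two can tell different stories, which is why the choice of view matters.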
Thanks, Lisa. And lastly, as I mentioned, we also had the location of the crime, so we had coordinates to pinpoint exactly where a crime had been reported. With that in mind, we asked ourselves: where do most crimes happen? And again, emphasis on the fact that we didn't analyze the data; we were just exploring it and trying to see it visualized. So in this case, maps were a very useful tool for us to understand what was happening, and in Flourish we do have a really wide array of map options. In this case, we built a heat map to understand where crimes actually happened over time in different areas of campus, and we're able to zoom in and just interact with it and see some of the trends that were mentioned. For instance, I believe this is Fraternity Road, and you can see how it becomes sort of a hotspot, but also how other areas are kind of hot zones on campus, at least according to the data set. And one of the things that I noticed was that the area surrounding Shands teaching hospital was constantly on fire in terms of reports. So again, if I were a journalist and I got this data set, and I didn't know much about it, but I built this map really quickly just to see underlying trends, I'd be interested in understanding what's happening around here, right? What's going on in the surrounding areas of the hospital? Why is it constantly getting reports? Maybe there's not a story there, but just being able to see it plotted in this way gives you some perspective and makes you ask different questions, perhaps, than if you were only looking at the data. In another type of map that we built here, we added some annotations to pinpoint different locations that might be relevant. But here we just plotted the different locations of reported crimes for the top six most popular crime types in the data set. And with this interactive legend, you're able to select one at a time and understand them a bit further.
Once again, we chose larcenies because they were the most popular crime type, just to keep the trend going, and you're able to see hotspots, again, in the neighborhood where the sororities are, or Fraternity Road. Also some residence halls and, I think, private student accommodations. And then one interesting insight that we found with the liquor law violation crime type was that it was the single crime type that saw quite a hotspot of cases at the edge of campus, which is where restaurants and bars are. Again, this is not a surprising insight. I'm not saying that we've discovered anything, you know, shocking; you're seeing reports where pubs and bars are around campus. But that is the power of seeing it plotted on a map, right? It's going from assumption or just belief to actual factual confirmation by the visualization. And on top of that, not only is this map fully customizable and fully interactive, each of the dots has the information that was in the data set. So if that data set had had more data, like perhaps the person involved in the crime and so on and so forth, this would all be in these interactive pop-ups, which are fully customizable within Flourish.
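The "where do most crimes happen" question can also be roughed out before any map is drawn, by grouping coordinates into small grid cells and counting reports per cell. The Python sketch below illustrates that idea only; it is not how the heat map in the demo was built, and the coordinates are invented points rather than rows from the UF data set.

```python
from collections import Counter

# Hypothetical (latitude, longitude) pairs, one per reported crime.
points = [
    (29.6401, -82.3431),
    (29.6402, -82.3433),
    (29.6398, -82.3429),
    (29.6512, -82.3405),
]


def hotspot_cells(coords, precision=3):
    """Group points into grid cells by rounding, then rank cells by count.

    Three decimal places of latitude is on the order of 100 meters,
    so each cell is roughly a city block.
    """
    cells = Counter(
        (round(lat, precision), round(lon, precision)) for lat, lon in coords
    )
    return cells.most_common()


ranked_cells = hotspot_cells(points)
for cell, n in ranked_cells:
    print(cell, n)
```

The cells at the top of the list are the places worth zooming into on the map and asking questions about.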
And that is hopefully enough to inspire you to get started with Flourish or with your data vis tool of choice. It doesn't have to be Flourish, but we hope that we sparked that interest and that you will start visualizing your data in your newsroom. And yeah, come find us at the juice stand. I think the next slide is a picture of us where I removed the background using Canva. They have a great background remover feature. We'd love to hear more about your use cases and help you with Flourish and Canva.
But as promised, we do have some time for questions. So feel free to head to the microphones and, yeah, we're happy to answer any questions.
Hi, my name is Ken. I work for RFA, the Vietnamese service of RFA. I've been using Flourish for around six months now. One thing that popped up during my time working on my previous project was that I used the map function of Flourish, and I had to make a custom map using GeoJSON. So that was another big learning curve. And I felt at the time that if I had had a preset, or at least access to the work of other people before me who had made the exact same type of map that I was working with, it would have really helped. So is there any plan in the future for Flourish to create a community hub where people who are willing to share their previous work can put it?
Yeah, you are picking up on a big pain point for many of our users, which is our maps, and that you often have to upload your own GeoJSON. We are working on making that a lot more intuitive and basically just giving hundreds of maps as starting points. At the minute there are just about 20. That should be coming very soon; it's currently being worked on, so that we have the option to add more starting points to maps. In the meantime, we do have a site where we upload all the GeoJSON resources that we've used or that we know our users have used. We've cleaned them up. I think it's flourish.studio slash datasets. Okay. If that's not it, come find us at the juice stand and we will help you find it, and let us know if any are missing as well.
Yeah, we do have resources in the slide deck, and we're going to share that with you. That's one of the resources that's there.
Now another minor point, and correct me if I'm wrong: do we currently have the ability to embed the link to the source in the source text? For example, if I put the New York Times as the source, do we have the ability to embed that link yet?
Yeah. You mean like make it clickable? Yeah, yeah. Okay, gotcha. In the footer settings, you have a source name and a URL, and then it uses the text and makes it a clickable link. Awesome. Thank you. Thank you.
Kind of a similar question. I googled Hans Rosling, and I saw that posts pop up that say how to make a data viz like Hans Rosling. Do you have similar kinds of recipes on your site? And if so, where do you find them?
Yes, we do. I mean, you're all getting ahead of the resources we're going to share, but yes, we do. We share inspirational content, meaning teaching people how to recreate charts or reach a specific goal with a chart, in our blog. We also have our help docs. And on our blog we have what we call the Master series, where we recreate really famous charts, and I believe Hans Rosling's should be there.
More than the blog posts, we actually have a starting point here, Hans Rosling's famous scatterplot, that you can just click, and then it duplicates it into your profile and you can take it from there. But we definitely need a Master series on that. Yeah, we have loads of links and really helpful resources that you can get started with, which we'll be sharing as well in the slides.
I mean, I might as well just go over them very quickly. Again, if you really do have questions and we run out of time here, come to the creative juices bar and reach out to us. We're super happy to chat about anything charts: data viz, charts in Canva, charting in Flourish. In terms of resources, I'm just going to go very quickly over this, but we're still taking questions if you have any. On Flourish specifically, we've added the most helpful links for you to get started. The Flourish Help Center is where all of our hundreds of help docs live. This is for when you want to achieve a specific result or troubleshoot a chart. Our webinars: we host them monthly. We're currently on a summer break, but we'll be back in September. These are one-hour sessions going over different topics: hot tips on data visualization, topical sessions on how to cover certain areas in journalism, or other areas like business reports and so on. Our Flourish blog is where we publish pieces, and again, we recreate charts or write about interesting topics. We did one about heat waves and climate change just recently, and football as well with the Premier League, so you can really find anything and everything there. Our training series, which is completely free: you can get started with Flourish and learn the basics of our most basic templates through there. Our SDK, which is our advanced software development kit, is there if you are interested in developing your own templates. Moreover, we have other tools and resources that are external to Flourish but that we swear by, like the Financial Times Visual Vocabulary, a wonderful resource if you're starting in data visualization and want to learn more. Highly recommend. And there are other things, where you can source your GeoJSONs, colors for maps, etc.
And lastly, we did include this in case any of you were super curious about data visualization and wanted to take a deep dive: some of our book recs, just to get started. We tried to make it as beginner friendly as possible, meaning we added some different levels. But yeah, these are some of the recs that we actually have in the library in our London office. So it's always nice. And again, prompting you guys to go and figure out which chart type matches your personality with our fun little quiz.
We are both scatterplots, apparently. Yeah, we want to be a map but we're not. So if
you're a map, do let us know. But yeah, if you're interested and just have two minutes to spare, you can scan the QR code, complete the quiz, and let us know which type you are. But lastly, I'm going to leave this slide here for a second. I can see a bunch of people scanning the QR code now.
If you can't scan it now, you can scan it at the juice stand.
You can see a common trend going on here. But no, lastly, just thank you so much for your time, for your attention, for being with us here today. We really enjoyed ourselves, and we hope you enjoyed yourselves too.
Thank you