EMTECH 171109_0042 technology ai sns : state sponsored online content campaigns
6:17PM Mar 16, 2021
Speakers:
Keywords:
people
isis
internet
question
technology
ai
campaigns
challenges
state sponsored
online
content
conversations
google
comment
web
model
understand
yasmin
tweet
temporal dimension
Most of the World Wide Web is kind of living outside of the West. Nearly two thirds of the world is living under censorship; they don't have access to a free and open internet. And the implications of that are actually really severe, in terms of people's livelihoods, in terms of their ability to think freely, speak freely, and protest. And we wanted to see if we could use the incredible resources of a massive technology company to help promote freedom around the world.
So there are these sort of closed areas of the web, but generally the benefit is that it is open. But of course, as I noted earlier, there are outfits like ISIS and Russian hackers who are experts at leveraging social media to achieve their goals. What are the most important lessons we've learned from recent experience in dealing with these challenges?
These challenges are so broad. On ISIS specifically, I think what they represent is the first real 21st-century terrorist organization. They were, until very recently, thriving both in the physical domain and in the digital domain, and we saw an ability to recruit and evangelize that we hadn't seen in terrorist groups before. And, you know, while ISIS's influence is waning, the model they have set is a kind of perverse best practice for other extremist groups who want to organize and radicalize online.
And can we unpack a couple of those elements? What is it that makes them so good at what they're trying to do in terms of recruitment, and so bad as an influence on global security?
So, you know, honestly, it's the micro-targeting. They appeal to so many different elements of society with a message that resonates with them. Part of our model at Jigsaw is to actually go to the front lines of different conflicts and challenges and repressive societies and speak directly to the people who are affected: victims, but sometimes actually perpetrators. So we took a team to Iraq to speak to defectors from ISIS, people who had left home and signed up to go and live in the caliphate. Some of them had trained as jihadi fighters, people who were willing to be suicide bombers, and had been disillusioned on arrival and then defected. And we said, tell us why you joined, what was appealing? And they said, I wanted to come and fight for the rights of Muslims. We interviewed a girl in Europe who tried to leave at the age of 13 to sign up, you know, to go and live under ISIS's rule. And she said, well, I thought I was going to go and live in the Islamic Disney World. And you start to understand that we see the beheading videos and think that's what ISIS's proposition is, when actually, to these groups, it's something very different. One more really interesting anecdote, or factoid rather, is that they create content in so many languages: English, Arabic, Russian, French. But go down the long tail of languages in which ISIS has created content and you find Hebrew and Mandarin and even sign language. So when you talk about being the equal-opportunity recruiter that all of our organizations want to be, we're outclassed by the ISIS recruitment campaign.
And so what can a company like Google do to use technology to try to counter this growing threat?
So, sitting in front of a kind of failed suicide bomber, somebody who thought they were going to go and die for ISIS, and then, you know, saw the brutality and corruption and devastation of the caliphate and had risked their life to leave, I say: well, if you knew everything that you know now, if you knew that before you left, would you still have made the same decision? And the guy sitting in front of me says yes. And I ask why. He said, at that point I was so brainwashed, I was so completely convinced, I wasn't accepting any information that contradicted my conviction that I needed to go. This is a consistent answer: at the point that people decide to leave, it's really too late to reach them. So you say, well, what if you knew everything that you know now six months before you left, would you still have made that decision? And he said, no, then I think I would have changed my mind. So the takeaway for us from doing that kind of field work, you know, as a technology company, is that, first, access to information can be a game changer here, and second, don't go after people when they're sold; it's too late. You have to reach them when they're sympathetic but not yet committed. So we came back and...
With certain kinds of content, like beheading videos, obviously you want to take those down. But how do you deal with more subtle forms of propaganda? And is it the responsibility of a Google to take that off the web when you're actually campaigning for an open web? What's the tension there?
Yeah, that's a really good question. So there's a category of content that's so egregious it should not be allowed to stay on the internet: beheadings, bomb-making tutorials, graphic violence. But what about the radicalizing narratives that actually convince fairly normal people to sign up and go? You know, 40,000-plus people left their homes to go and fight for ISIS. They're responding to narratives that speak to them. So our goal was: can we use targeted advertising, which millions of organizations use to find their slice of the online community, to reach people who are sympathetic? Can we design a set of keywords, thousands of keywords, that could target people who might have positive sentiment towards ISIS but are not yet sold? They're not planning their trip to Raqqa, but they're engaging with ISIS supporters' slogans and ISIS media channels and the nitty-gritty of the conflict that only somebody who's really committed to participating would be interested in. And if we can reach them with targeted advertising, and this is the kind of thing that built Google, we're pretty good at it, Google realizes $100 billion from targeted advertising with pretty powerful algorithms, can we then redirect them to content online, videos of defectors, of credible imams, of citizen journalism, that will give them better information so they can make better choices?
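To make the targeting idea concrete, here is a minimal sketch of mapping sympathetic-but-uncommitted queries to counter-narrative playlists, in the spirit of the Redirect Method described above. The keyword list, playlist names, and matching logic are all invented placeholders for illustration, not the real campaign data or Google's ad-serving code.

```python
# Illustrative sketch only: map search queries that suggest sympathy (but not
# yet commitment) to counter-narrative playlists. All names are placeholders.
from typing import Optional

SYMPATHETIC_KEYWORDS = {
    # the "nitty-gritty" queries a sympathizer, not a casual reader, would make
    "life in the caliphate",
    "caliphate governance",
    "isis media channel",
}

COUNTER_NARRATIVE_PLAYLISTS = [
    "defector_testimonies",      # videos of people who left ISIS
    "credible_imams",            # religious scholars rebutting ISIS claims
    "citizen_journalism_raqqa",  # footage of daily life under ISIS rule
]

def match_redirect_ad(query: str) -> Optional[str]:
    """Return a counter-narrative playlist for a sympathetic-but-uncommitted
    query, or None (no ad shown) if nothing matches."""
    normalized = query.lower().strip()
    for i, keyword in enumerate(sorted(SYMPATHETIC_KEYWORDS)):
        if keyword in normalized:
            # Rotate playlists by matched keyword; a real system would A/B test this.
            return COUNTER_NARRATIVE_PLAYLISTS[i % len(COUNTER_NARRATIVE_PLAYLISTS)]
    return None

if __name__ == "__main__":
    print(match_redirect_ad("what is life in the caliphate really like"))  # a playlist
    print(match_redirect_ad("weather in boston"))                          # None
```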
So is there any evidence of success with this?
Right, so we designed this program based on targeted ads and YouTube videos, and we called it the Redirect Method. We piloted it last year, for eight weeks, in English and Arabic. Looking for people sympathetic to ISIS, it reached 300,000 people in eight weeks and achieved really high click-through rates on the ads, which means the ads are engaging, and then half a million minutes watched on YouTube, which is pretty powerful, because no one watches YouTube content they don't want to; you press the back button if you're being shown content you don't like. And the goal is that we're showing people actual answers to the questions they have. You're asking questions about religious legitimacy, about good governance, about the military successes of ISIS? All of those questions you ask have answers that are different from ISIS's answers, and we're going to show them to you.
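For reference, the engagement figures quoted here reduce to simple ratios. In this tiny sketch, only the reach (300,000 people) and total watch time (half a million minutes) come from the talk; the impression and click counts are made-up placeholders.

```python
# Back-of-envelope engagement metrics of the kind cited for the pilot.
impressions = 1_200_000          # hypothetical ad impressions
clicks = 96_000                  # hypothetical ad clicks
people_reached = 300_000         # stated in the talk
minutes_watched = 500_000        # stated in the talk

click_through_rate = clicks / impressions                 # fraction of impressions clicked
avg_minutes_per_person = minutes_watched / people_reached # average watch time

print(f"CTR: {click_through_rate:.1%}")
print(f"Average watch time per person reached: {avg_minutes_per_person:.2f} min")
```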
That's great. Switching gears a bit: the web is also being used as a vehicle for hate speech and other forms of abuse, and some web companies have been criticized for being too slow to react to this and to use technology to tackle it. Is that criticism fair, do you think?
Yes, good.
Okay, moving on. So, as you rightly point out, you've been trying to deal with this problem. What challenges have you encountered in using technology to address it?
So again, Jigsaw usually comes at challenges through this geopolitical vantage point. A few years ago we went to Myanmar and observed that there was this tension: the country was liberalizing, people were coming online, and there was a spillover onto the internet of the tensions between the Buddhists and the Rohingya Muslims. That was several years ago. We came back and said, you know, the next billions of people to come online are going to bring their physical-world realities with them, and we are going to need to do better at helping create inclusive and empathetic conversations. It's interesting, because that situation has since turned into a genocide. But the reason we were hopeful that technology could do something about it was because of advances in natural language processing and the ability of machines to understand speech and context.
And so what was that doing? How was that natural language processing being applied?
Right. So the vision for the internet was to democratize information and connect us to one another. And we found that, actually, comment spaces are not what we thought they were going to be. I don't think anyone imagined the amount of toxicity and incivility that we'd see in comment spaces. I mean, think about yourself: really important conversations which you decided not to participate in, because it's just not worth it. And we found that publishers have been turning off comments: the Chicago Sun-Times, Reuters, NPR, Vice. There's actually a contraction of the space online for us to meet each other and exchange ideas. And that's because, until fairly recently, technology hasn't been able to assist moderators in dealing with the scale of the challenge of online comments. So we invested in a team of mostly AI software engineers, research scientists, and product people to see if we could create the right size and substance of dataset with which we could teach models to understand context.

To give you a little example of the difference between a keyword-based system for making decisions about speech and a machine learning one, which uses patterns to understand what was meant by speech: take a word like "kill". You might think that if somebody is talking about killing in a comment, that's probably one we would want to moderate out. But of course, "you're killing it today, Martin" versus "I'll kill you today, Martin" are similar words meaning something very different. You understand the difference, but that's quite a difficult problem to expect machines to solve. What we know is that there are very encouraging results from our early technology development showing that we can teach machines to understand context, which means we can empower moderators and online communities with really powerful, scalable tools to help manage online conversations. And the goal, of course, is to expand the space we have online to meet each other, to create more inclusive conversations. Because the totality of knowledge and information on the internet extends beyond what you agree with, and if we're really going to realize the promise of the internet, we have to allow people to meet and disagree and understand one another's positions.
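To make the "you're killing it" versus "I'll kill you" contrast concrete, here is a rough sketch of a naive keyword filter next to a call to the public Perspective API's Comment Analyzer endpoint. The blocklist is a toy example, the API key is a placeholder you would need to supply, and the live call is left commented out so the snippet runs on its own.

```python
import requests

BLOCKLIST = {"kill", "die", "hate"}   # toy keyword list

def keyword_flag(comment: str) -> bool:
    """Naive keyword filter: it flags both 'you're killing it today, Martin'
    and 'I'll kill you today, Martin' because it ignores context."""
    words = {w.strip(".,!?'").lower() for w in comment.split()}
    return bool(words & BLOCKLIST)

def perspective_toxicity(comment: str, api_key: str) -> float:
    """Score a comment with the Perspective API's TOXICITY attribute,
    following the publicly documented request/response shape."""
    url = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           f"comments:analyze?key={api_key}")
    body = {
        "comment": {"text": comment},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(url, json=body, timeout=10)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

if __name__ == "__main__":
    for c in ["You're killing it today, Martin!", "I'll kill you today, Martin."]:
        print(c, "-> keyword flag:", keyword_flag(c))
        # print(c, "-> toxicity:", perspective_toxicity(c, "YOUR_API_KEY"))
```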
Right. And that's kind of a segue into where I wanted to get to next, which is this whole issue of fake news and the rise of Russian troll farms churning out videos, churning out fake articles. How far do you think we've got to grips with this problem? I mean, initially Facebook said 10 million people had seen this stuff; now it's 126 million. There were thousands of videos uploaded to YouTube by the Internet Research Agency, which was a front for Russia. Do we really know the full extent of this yet?
The interesting thing about what has been reported, and what we understand about what happened in the case of the Internet Research Agency, is that a lot of the activity was actually not terribly technically sophisticated. The challenge was that there was a campaign to exploit the open internet and to exploit the vulnerabilities of social networks. The issue was not necessarily the content that was being shared, because it wasn't materially different from a lot of other content we've seen; it was that it was being shared covertly. It was inauthentic. There was an infiltration campaign of social networks based on the fundamental understanding that you trust people in your small social graph more than you trust someone you don't know or have no connection to. So I think the bigger picture is that there is something about the vulnerability of the internet which we need to pay attention to and solve if we want to stop this kind of disinformation from spreading.
And what kind of technological solutions should we be looking at? I mean, what are you working on?
Right. So firstly, there is no law on the internet that says you can't lie, and I personally would not want there to be one. But setting aside the actual content of the material, there is another question about how we can spot the kinds of coordination we see in state-sponsored propaganda. The way I describe this is as a seed-and-fertilizer campaign: the goal of these state-sponsored network campaigns is to plant a seed in social conversations and have the unwitting masses fertilize that seed so it becomes an organic conversation; the goal is to influence the masses. So the question, from a technology-response point of view, is whether there are technical markers that distinguish organic activity from the coordinated activity we know states are engaging in. And there are a few dimensions we think are promising to look at for these technical markers.

One is the temporal dimension. We've looked at coordinated and organic campaigns across contexts and across countries, and we found that coordinated campaigns tend to move together. They tend to outlast organic campaigns. And often there's a little bit of a delay: I don't really like the term troll, but these state-sponsored actors online move together, and they wait to get their hymn sheet, they wait to get the instructions on what to do, so you'll see a little delay before they act. Can we crystallize what those generalizable signals are on the temporal dimension, to use in automated detection?

A couple more. Network shape: if we all want to tweet about EmTech, as we all are doing, and you look at how we're connected, you'll find that some of us are connected to each other, you're probably connected to some people in this room, but we're not all connected to one another in a very tight-knit way. When we've looked back at coordinated campaigns, we found that they have tended to be clustered in a way that's anomalous, and again, that we might be able to detect.

And one more, if I may: semantic. If I said to this group, let's all tweet now about how stunning Martin's blazer is, then for those who haven't yet tweeted about it, just based on my direction and the language I used in it, the content of those tweets would be very similar, unusually similar, versus people who had tweeted about it before I gave that instruction and genuinely held that view, where you would see much greater semantic diversity. So again, can we look for markers in the semantics?
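Here is a rough sketch of how the three markers mentioned (temporal synchrony, network shape, semantic similarity) might be computed over a set of posts. The data model, thresholds, and detector names are entirely invented for illustration and are not Jigsaw's detection system.

```python
# Illustrative detectors for temporal synchrony, anomalously tight network
# clustering, and unusually low semantic diversity. Thresholds are made up.
from dataclasses import dataclass
from itertools import combinations
from statistics import pstdev

import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

@dataclass
class Post:
    author: str
    timestamp: float   # seconds since the seed content appeared
    text: str
    follows: set[str]  # other authors this account follows

def temporal_synchrony(posts: list[Post]) -> float:
    """Low spread in posting times suggests accounts 'moving together'
    after receiving the same instructions."""
    return pstdev(p.timestamp for p in posts)

def clustering(posts: list[Post]) -> float:
    """Average clustering coefficient of the follow graph; coordinated
    accounts tend to be clustered in an anomalously tight-knit way."""
    g = nx.Graph()
    g.add_nodes_from(p.author for p in posts)
    for a, b in combinations(posts, 2):
        if b.author in a.follows or a.author in b.follows:
            g.add_edge(a.author, b.author)
    return nx.average_clustering(g)

def semantic_similarity(posts: list[Post]) -> float:
    """Mean pairwise cosine similarity of TF-IDF vectors; organic posts
    show more semantic diversity than posts copied from one direction."""
    vectors = TfidfVectorizer().fit_transform(p.text for p in posts)
    sims = cosine_similarity(vectors)
    n = sims.shape[0]
    return (sims.sum() - n) / (n * (n - 1))   # exclude self-similarity

def looks_coordinated(posts: list[Post]) -> bool:
    # Placeholder thresholds; a real system would learn these from labeled campaigns.
    return (temporal_synchrony(posts) < 60
            and clustering(posts) > 0.5
            and semantic_similarity(posts) > 0.7)
```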
That was really interesting. And, I mean, Alphabet is, you know, fundamentally an AI company. We had a great presentation yesterday from one of our Innovators Under 35, Ian Goodfellow, who was talking about generative adversarial networks, which is sort of AI trying to beat AI, creating videos that are as good as real but are fake. That's a new kind of dimension to this challenge. How is Google using AI tools more generally? And do you see the bad guys using them more actively?
I mean, I should probably speak on behalf of what we see at Jigsaw. When we look especially at terrorist groups who are exploiting social media, and when we look at state-sponsored efforts to influence, they're actually using really, really powerful algorithms that are at everyone's disposal. If you're using, you know, targeted advertising, that's a half-a-billion-dollar business represented right there. So I would say that you don't need to develop new AI and new algorithms to have them at your disposal and actively exploit them.
Just very briefly, I'm going to open it up to the floor for questions in a minute, so start thinking of questions for Yasmin. But one very brief question: we heard earlier today about the threat of algorithmic bias. What can we do to minimize it? How do you think about dealing with that issue?
That's a very good question. We have a project looking at understanding speech and helping moderators that I didn't mention by name, which is Perspective, a tool that we've made public. People can go to perspectiveapi.com, play with it, and see where it does well. Interestingly, when you offer a group of smart people an AI to use, their instinct is to see if they can trick it, so please do try to trick it; that's actually very helpful feedback for us. Can you find examples of really bad speech that it won't catch, or good speech that it thinks is bad? We found that the models aren't perfect, and indeed they are biased. And you have to ask yourself, when you're developing technology, what kind of bias is okay. If you're trying to build AI to lead to more inclusive conversations, then a racist or sexist AI is not acceptable. We found, for example, particularly when we launched (and we've made several releases to address this), that it exhibited certain types of bias. As an example, if I were to say "Yasmin is a researcher", it would not think that was toxic. If I said "Yasmin is a feminist researcher", the model thought that was likely to be a toxic comment, likely to be a comment that would make people leave a conversation. Why is that? Because we train these models on data from the internet, so they start to mirror the patterns in online comments. And words like feminist, like gay, like Muslim, were so disproportionately skewed towards comments that have a negative effect on people that the model started to assume they themselves intrinsically had some negative properties. So we've taken news articles that mention these terms in a neutral way and retrained on that data, and we've published some of our techniques.
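A small sketch of the kind of template-based bias probe described here ("Yasmin is a researcher" versus "Yasmin is a feminist researcher"). The `score_toxicity` argument stands in for whatever model or API you are testing; the dummy scorer at the bottom is invented only to show the probe running end to end, and none of this is Jigsaw's code.

```python
# Bias probe: score minimally different sentence pairs that differ only by an
# identity term. A large score gap for the identity-marked version is the kind
# of unintended bias described in the talk.
from typing import Callable

IDENTITY_TERMS = ["feminist", "gay", "Muslim"]
TEMPLATE = "Yasmin is a {}researcher."

def bias_gaps(score_toxicity: Callable[[str], float]) -> dict[str, float]:
    """Return, per identity term, the toxicity-score gap between the
    identity-marked sentence and the neutral baseline sentence."""
    baseline = score_toxicity(TEMPLATE.format(""))
    gaps = {}
    for term in IDENTITY_TERMS:
        marked = score_toxicity(TEMPLATE.format(term + " "))
        gaps[term] = marked - baseline
    return gaps

if __name__ == "__main__":
    # Dummy scorer that (wrongly) treats identity terms as toxic, to show the probe working.
    def dummy_scorer(text: str) -> float:
        return 0.8 if any(t.lower() in text.lower() for t in IDENTITY_TERMS) else 0.1

    for term, gap in bias_gaps(dummy_scorer).items():
        print(f"{term}: score gap vs. neutral baseline = {gap:+.2f}")
```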
right. I'd like to take a couple of questions from the floor. There's one back there, can we And then I'll come to you in a second.
Yeah, I think in the Senate testimony by Facebook's attorneys, they said that some of the Russian banner ads had been seen by at least 126 million Facebook users. But then down in the text of that testimony it said that, on average, those 126 million people had seen 5,000 banner ads a day. So what's the best practice today for measuring the effectiveness of that kind of activity?
There's something rhetorical in your question, which is that I think you recognize we don't know how to measure it. And I really appreciate the question. I think there's a conflation between attempt and impact, and at the moment that's why the numbers are so large: they have to round up. But as technology companies we have to start thinking about really doing offline research, really doing more work with affected communities, to try to make the link: what is the impact of malicious activity, and of our efforts to mitigate against it?
Great, thank you. Great question. Question over here? Yes, no? Okay, over here.
Thank you. Great perspective. I think the approach you outlined, using targeting to counter propaganda, makes a lot of sense. So I have two questions. One: should a company like Google be doing it? Because what would prevent Google from deciding to target something to serve its own political aims rather than ordinary citizens' aims? And the second: Google already uses indexing to decide relevance. Why can it not develop measures like veracity or ethical uprightness as well?
Pretty good questions. Hopefully we'll run out of time to answer them. Have we run out of time? All right. On the first question: I think when you're taking information off the internet, the bar has to be really high, for accuracy and also for whether you should be doing this at all. Search is supposed to be a reflection of what's on the web at large, so we make that bar really high. When you're talking about giving people more information, it's a different calculus. So in the case of the Redirect Method, it really is about showing people, through targeted advertising, an invitation to look at different perspectives on a challenge. And that program doesn't utilize any kind of Google secret sauce. It's open source, and anyone can use it. So if you wanted to use it to address, say, misinformation about vaccinations, or other kinds of organized misinformation, anybody here could also deploy it. To your second question... does anyone remember it? It was a tough question, and I don't know what it was.
It was around, kind of, the veracity measures.
Ah, right, okay. So, you know, most of the internet is organized around two fundamental principles: popularity and personalization. And the assumption has been, until now really, that veracity comes hand in hand with those things, with popularity and personalization. That's not an illusion we can afford to keep operating under. So the challenge for any internet platform that believes fundamentally in an open web is finding the signals that do a good job of filtering out spam and misinformation while still promoting free speech. And we clearly haven't found the full set of signals.
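As a toy illustration of that point about signals: a ranking score built only from popularity and personalization implicitly assumes veracity comes along for free, whereas adding even a crude quality signal changes what surfaces. Every number, weight, and field name below is invented purely for illustration and does not describe any real ranking system.

```python
# Toy ranking sketch: popularity + personalization alone vs. adding a
# separate quality/veracity-style signal. All values are made up.
items = [
    # (title, popularity 0-1, personalization fit 0-1, assessed quality 0-1)
    ("Viral but misleading post", 0.95, 0.90, 0.10),
    ("Accurate but niche explainer", 0.30, 0.70, 0.95),
]

def score(pop, fit, quality, w_quality=0.0):
    # Baseline intuition: popularity and personalization only (w_quality=0).
    return 0.5 * pop + 0.5 * fit + w_quality * quality

for title, pop, fit, quality in items:
    print(f"{title}: baseline={score(pop, fit, quality):.2f}, "
          f"with quality signal={score(pop, fit, quality, w_quality=1.0):.2f}")
```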
Those were great questions, and I would love to take some more, but unfortunately we have to move on. Thank you, Yasmin, for that wonderful presentation. It's now my pleasure to introduce our next group of Innovators Under 35.