For end users, participants, the people whose data is being collected, analyzed, reported, shared, and so on, we've done a few studies. And a lot of this started with Casey Fiesler, who's a professor at the University of Colorado Boulder. In the very beginning of this project, she had done some work with Nick Proferes, who's a faculty member at Arizona State. They were looking at Twitter users, because Twitter is probably the most common social media platform for scraping: the terms of service allow it, there are tremendous amounts of data, and it's so easy that you don't need any special skills to pull it at this point. And so what they did is interview and survey Twitter users about whether they had any knowledge of this practice, that researchers were regularly scraping the platform, and how they felt about it.
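To give a sense of how low that barrier is, here's a minimal sketch of pulling public tweets, assuming API access via a bearer token and the third-party tweepy library; the query string and token are placeholders for illustration, not anything from the studies themselves.

```python
# Minimal sketch of pulling public tweets with tweepy (hypothetical
# query and placeholder token; assumes you have Twitter API access).
import tweepy

client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")

# Fetch up to 100 recent public tweets matching a search query.
response = client.search_recent_tweets(query="research ethics", max_results=100)

# Each returned tweet carries an id and its text, ready for analysis.
for tweet in response.data or []:
    print(tweet.id, tweet.text)
```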
One of the things that we hear again and again is this argument that if the data is public, it's a free-for-all: sure, you post publicly on Twitter, so you basically lose all rights to it, and you're assumed to accept that it will be used in various ways. And Casey's work found that is not how users feel about it at all. They were surprised; they didn't actually have expectations of their data being used in that way. Their frame for thinking about tweets was rather narrow. And they wanted some kind of notification, even if that fell short of consent. They wanted to be alerted.
So something that Katie, myself, and our postdoc at the time, Sarah Gilbert, did is run a series of four studies looking at four different platforms: Facebook, Instagram, Reddit, and dating apps. Now, we used a method called factorial vignettes, which is really useful for ethical questions because it allows us to present people with multiple vignettes and vary small things in them. That lets us start to see which individual contextual factors shape people's attitudes, and how small shifts can make something go from being viewed as appropriate to inappropriate. Probably the biggest finding reinforces what Casey found around consent and notification. People very clearly want to learn about this, ideally beforehand, and to have an option for not participating, like an opt-out of research at the platform level. But failing that, they want to be notified at some point, even if it's after the fact. The takeaway here is that, from an ethical perspective, relying on the legal aspect alone does not make a practice ethical, and that users are not thinking in these very simplistic terms: "I'm signing up for a platform, so I acknowledge that everything I share on it can be used in any way."
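To make the factorial vignette idea concrete, here's a hypothetical sketch of how such a design is typically generated: a fixed template is crossed with every level of every contextual factor, so each vignette differs from its neighbors by one small shift. The factor names and levels below are invented for illustration, not the ones from these studies.

```python
from itertools import product

# Hypothetical contextual factors; the actual studies defined their own
# factors and levels for each platform.
factors = {
    "platform": ["Facebook", "Instagram", "Reddit", "a dating app"],
    "data": ["public posts", "profile photos", "private messages"],
    "notice": ["with no notification", "with notification afterward",
               "with consent obtained beforehand"],
}

template = ("A researcher collects {data} from {platform} {notice} "
            "and publishes the analysis.")

# Cross every level of every factor to produce all vignette variants;
# respondents each rate a subset, so attitudes can be attributed to
# individual contextual factors.
vignettes = [
    template.format(platform=p, data=d, notice=n)
    for p, d, n in product(factors["platform"], factors["data"],
                           factors["notice"])
]

print(len(vignettes))   # 4 * 3 * 3 = 36 variants
print(vignettes[0])
```

Because the design is fully crossed, comparing ratings across vignettes that differ on a single factor, say, notification versus no notification, isolates how much that one contextual shift moves judgments of appropriateness.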