Twitter has announced that they're providing more access for academic researchers. As someone who's been studying the ethical implications of Twitter research for half a decade, I do indeed have thoughts! (1) This is probably good, but only if (2) it is done properly. [Thread] https://twitter.com/TwitterDev/status/1354143047324299264

First, if as a Twitter user your (understandable) reaction to Twitter's announcement was WTF SCIENTISTS ARE READING MY TWEETS?! or if as a researcher your (less understandable) reaction was "Ethics?! But the data is public!" then you should read this. :) https://howwegettonext.com/scientists-like-me-are-studying-your-tweets-are-you-ok-with-that-c2cfdfebf135
Twitter has been a HUGE trove of data for researchers for over a decade. In 2014 @zeynep pointed out the "model organism" problem--Twitter is the fruit fly of academic research. Not because it's the best to study, but because it's the easiest to study. https://arxiv.org/pdf/1403.7400.pdf
To be clear, it's not BAD that researchers are using publicly available data. There's been really important work in lots of different fields, including e.g. public health & disinformation. The problem is when "public" is taken as "and therefore ethics and privacy are irrelevant."
So making Twitter data available to more researchers could be a good thing - because it's not like the ability to PAY for data access means you're going to treat that data more ethically.
So a non-monetary hoop where the RESEARCH is what matters I think is an improvement.
Some vetting of the purpose of research, what the outcomes will be, and how it will be used is a good thing. Though I confess, I wish there were more ethical scaffolding in these questions.
A few years ago, @moduloone and I published this paper about Twitter research ethics based on a survey of Twitter users, and it includes some thoughts about the things researchers should keep in mind beyond "is the data public." https://journals.sagepub.com/doi/10.1177/2056305118763366
Last year @BriannaDym and I published this paper about considerations for using public data from a particular community, which has some lessons for thinking about marginalized or vulnerable communities more generally. https://journal.transformativeworks.org/index.php/twc/article/view/1733/2445
There's no way to know how Twitter is vetting this research, but my hope would be that they're giving more scrutiny to, e.g., "we're going to use AI to diagnose Twitter users with a mental health condition and then include a codebook with direct quotes in our paper."
One thing I've appreciated though about Twitter's official dev policies is that they require "re-hydrating" shared datasets via tweet IDs rather than archiving data. This means that Twitter users can delete content and not have to worry about researchers having archived it.
Though I have no idea how this is enforced, and I'm not sure if this policy is tied to ethical reasoning - like, do researchers even realize potential harms here?
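For anyone unfamiliar with what "re-hydrating" looks like in practice, here's a rough sketch. It's a toy under stated assumptions, not Twitter's actual API: `rehydrate`, `fake_lookup`, and `live_tweets` are all hypothetical names, and the lookup function just stands in for whatever tweet-lookup call a researcher would really use. The point is that a shared dataset is only a list of IDs, so anything a user has deleted since then simply doesn't come back.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical stand-in for a tweet lookup call: takes a batch of tweet IDs
# and returns {id: text} only for tweets that still exist and are public.
Lookup = Callable[[List[str]], Dict[str, str]]


def rehydrate(tweet_ids: Iterable[str], lookup: Lookup,
              batch_size: int = 100) -> Dict[str, str]:
    """Rebuild a dataset from shared tweet IDs, keeping only what is still available."""
    ids = list(tweet_ids)
    hydrated: Dict[str, str] = {}
    # Look the IDs up in batches (the batch size here is an assumption).
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        hydrated.update(lookup(batch))
    return hydrated


# Toy "platform" state: the author of tweet "2" has deleted it since the
# ID list was shared.
live_tweets = {"1": "first tweet", "3": "third tweet"}


def fake_lookup(batch: List[str]) -> Dict[str, str]:
    # Deleted tweets are silently absent from the response.
    return {i: live_tweets[i] for i in batch if i in live_tweets}


shared_ids = ["1", "2", "3"]
dataset = rehydrate(shared_ids, fake_lookup, batch_size=2)
missing = [i for i in shared_ids if i not in dataset]
```

Here `missing` ends up holding the deleted tweet's ID, and its text never makes it into `dataset`. That only protects users if researchers really do share IDs instead of archived content, which is exactly the enforcement question.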
There could have been such useful scaffolding in those questions to make researchers think about these things. :(
Though it's also worth mentioning that there's the potential for "vetting" research to go awry. My hope/expectation is that "will the outcomes of this research make Twitter look bad" is absolutely NOT part of the process.
I'm not sure how to mitigate this risk, except to say that if a researcher has their request denied and suspects it might be for this reason, please make a BIG stink about it.
Anyway, this is one of those rare times when I feel like I am actually a serious expert on something. @TwitterDev if you'd like some help thinking about how you'll do research vetting, ask me. Or our whole crew. :) @pervade_team @michaelzimmer @jvitak @undersequoias @moduloone
