Let's talk about the #MedBikini study (paper now retracted) and research ethics. Problematic analysis/findings aside (bikinis, really?!), the paper also highlighted data sources and ethics review processes that make people uncomfortable. Guess what? This is super common. [Thread]
If you follow me, you probably already know this, but "public" data doesn't fall under the legal definition of human subjects research in the U.S. b/c you're not interacting directly or collecting "identifiable" info. Therefore, many IRBs don't consider this under their purview.
Consider the many, many datasets that consist of tweets, reddit posts, etc. Many of the researchers who work with this "data" (because it's research with DATA not PEOPLE!) might not have ever even interacted with an IRB or thought of their research as involving humans.
The #medbikini paper states that all of the data analyzed was "public." I assume this means, e.g., not friends-only Facebook posts or private Instagram feeds. (That said, the fact that the researchers created fake accounts to collect the data seems squicky and unnecessary.)
People tell me ALL THE TIME that "if you put something online you should expect people to see it, for it to be used in datasets, etc."
But we all know that, intellectual understanding aside, when you tweet, your expectation is that you are tweeting to your followers.
How many of you have photos on a personal instagram account of an awesome cocktail you got at a restaurant?
But do you really expect (a) your colleagues to be creeping on those pictures for science; and (b) to be labeled "unprofessional" in that context for a published paper?
But here's another point that doesn't come up often for research using public data: researcher positionality. How do we feel about the researchers collecting data all being men, and then judging bikini photos as unprofessional? https://twitter.com/DrChowdharyMD/status/1286475918823825415
Or here's another example: Do you think it is appropriate for an all-white research team to scrape a million BLM tweets and then analyze them and publish a paper about it? This is an ethical issue that comes up in human subjects research; shouldn't it here as well?
This article (h/t @IEthics) brings up the "IRBs don't know how to deal with this" problem like it's something new, which it definitely isn't. But the more attention to this, the better. https://news.bloomberglaw.com/pharma-and-life-sciences/medbikini-backlash-exposes-research-ethics-boards-digital-gaps
As in the article above, @moduloone and I analyzed ethical considerations for using public data (specifically, tweets) in terms of the Belmont Report principles, including justice. The #medbikini case provides some interesting nuance to that. https://journals.sagepub.com/doi/10.1177/2056305118763366
Also for further reading: @jvitak et al.'s paper on how IRBs handle this kind of thing in practice and the challenges they face. https://journals.sagepub.com/doi/full/10.1177/1556264617725200
Final thought: How might reaction to the #MedBikini article have been different if they had included an "ethical considerations" section in their methods that included the steps they took to maintain privacy, and why they thought the RQ was important enough to justify the method?