Just came across a really great paper about bots and misinformation. "Middle East Twitter bots and the covid-19 infodemic", by @abulkhaezuran @nouraaljizawi at @citizenlab. I think it's a preprint?

PDF: http://workshop-proceedings.icwsm.org/pdf/2020_17.pdf 
The researchers looked at retweet networks promoting 149 covid-related hashtags & key terms on Middle Eastern Twitter, and they found that only 4-7% of the accounts were plausibly bots. They identified bots using context and metadata rather than an automated solution like Botometer.
They found that even among the retweet networks there was actually very little misinformation! That is, most of the information posted by these networks was true or at least not verifiably false. Opinion perhaps, certainly expressing a point of view, but not full-on lies.
They DID find thousands of accounts created in a short time span that were behaving strangely. Important to note that these anomalous accounts represented a mere 3.6%-7.1% of all posts on the covid keyword terms, so the proportion of abnormal activity was pretty low.
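For anyone curious what "created in a short time span" detection looks like in practice, here's a rough sketch of that kind of burst heuristic. To be clear: the field names, window size, and threshold are all my own hypothetical choices, not the paper's actual pipeline.

```python
from collections import Counter

def flag_creation_bursts(accounts, window_days=7, threshold=100):
    """Flag accounts whose creation dates cluster in an unusually dense window.

    `accounts` is a list of dicts with a "created_at" datetime.date --
    a hypothetical shape, not the researchers' actual data format.
    """
    # Bucket each account into a fixed-width window by creation date.
    buckets = Counter(
        a["created_at"].toordinal() // window_days for a in accounts
    )
    # Windows with an abnormally high count of new accounts are suspicious.
    hot = {b for b, n in buckets.items() if n >= threshold}
    return [
        a for a in accounts
        if a["created_at"].toordinal() // window_days in hot
    ]
```

Flagged accounts would then go to manual review, like the researchers did, rather than being labeled bots automatically.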
Next they did what I want all researchers to do: they manually inspected the flagged accounts and tried to divine the context of what was going on.

It continues to blow my mind that data-focused researchers often skip this step. I think it's simply unethical to skip this step.
They picked 3 different networks of accounts that seemed to be coordinating with one another and manually investigated the content. I'll call them networks A, B, and C.
Network A appeared to be an activist campaign by Egyptian high school students to postpone exams until the pandemic ends.
Network B was pushing Saudi government-backed messages, but the network was mostly just advising people to do social distancing along with some support for the royal leadership. Hardly salacious, and I've written about similar behavior in a US context. https://twitter.com/tinysubversions/status/1267180433961177089
Network C was a Persian language network that seemed to be piggybacking off of popular covid hashtags to promote human rights causes related to Bahrain and Palestine. Certainly opinionated, and it's certainly trying to game the trending algorithm for exposure. But not misinfo.
The few accounts that DID seem like bots tended to promote info that had very little uptake in popular discourse. "Parochial hashtags" in the words of the researchers.

My thought: when bot activity is correlated with unpopular stuff, the bots might not be, you know, effective.
Love this paragraph: "The statistical anomalies of some Twitter networks may turn out [...] to be artefacts of authentic coordination by citizens engaging in associational activities." Even state-aligned networks "may be promoting public health".
Anyway kudos to @abulkhaezuran and @nouraaljizawi for the great work here.

I'll repeat myself for any researchers reading: do the hard work. Manually audit and inspect your own data. Don't just trust your statistical analyses and assume "anomalous behavior must be bad".
Oops, sorry, this is not a preprint! My mistake. I meant to edit that out but writing long threads is... weird and hard. The paper is published, as a conference paper with AAAI.