My #SciPy2019 keynote: The New Era of NLP

A few myths about Deep Learning:
- You need “big data”
- You need lots of expensive computing power
- You need a deep learning PhD from Stanford
- Deep Learning only works for very limited problems
None of these are true.
Transfer learning is a powerful technique that doesn't require that you have big data or tons of GPUs. Transfer learning can achieve state-of-the-art results on a range of problems.
Word embeddings are like 1 pre-trained layer for your network- what if you could use many pre-trained layers?
Now you can! Transfer learning is being successfully applied to NLP, beginning with ULMFit ( @jeremyphoward @seb_ruder) and continuing in BERT, GPT-2, XLNet, & more
But more sophisticated language models pose heightened risks of disinformation. In 2017, the FCC received millions of faked comments (and this was BEFORE recent advances in NLP): https://twitter.com/math_rachel/status/1140067404078125057?s=20
Humans evolved as social beings, to have our opinions influenced by others in our group. Extreme viewpoints can be normalized when we think we are around others who hold those views

Advances in NLP have potential to scale up the online radicalization that is already taking place
Comments generated by language models can be combined with GAN-generated photos to create more compelling fake profiles, and to manipulate public opinion.
It is not just about fakes being compelling: people often respond to media that confirms their pre-existing beliefs.

Many more people will see the incendiary, false version of a story, than the fact-checked correction that goes out later.
Dissidents can put their lives on the lines to post a picture documenting wrongdoing, only to be faced with false claims by bad actors that the picture was faked. @zeynep wrote about this in Wired: https://www.wired.com/story/zeynep-tufekci-facts-fake-news-verification/
Recommendation systems amplify & incentivize disinformation. Please follow @gchaslot for his expertise on this topic, and also read: https://www.fast.ai/2019/05/28/google-nyt-mohan/
Also, recommendation systems can be gamed by bad actors: https://twitter.com/gchaslot/status/1121603851675553793?s=20
Oren @etzioni wrote in HBR about the need for digital signatures to address forgery/fakes: https://twitter.com/math_rachel/status/1113260397862088704?s=20
Renee DiResta @noUpside has written that we need to define disinformation as a cybersecurity problem, not a content problem: https://twitter.com/math_rachel/status/1121898808131108865?s=20
You can follow @math_rachel.
Tip: mention @twtextapp on a Twitter thread with the keyword “unroll” to get a link to it.

Latest Threads Unrolled:

By continuing to use the site, you are consenting to the use of cookies as explained in our Cookie Policy to improve your experience.