So many fascinating ideas at yesterday's #blackboxNLP workshop at #emnlp2020. Too many bookmarked papers. Some takeaways:
1- There's more room to adopt input saliency methods in NLP, with Grad*input and Integrated Gradients being the key gradient-based methods (minimal sketch after the links).
See: https://www.aclweb.org/anthology/2020.blackboxnlp-1.14/ https://www.aclweb.org/anthology/2020.blackboxnlp-1.28.pdf https://www.aclweb.org/anthology/2020.emnlp-main.263.pdf
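A minimal grad*input sketch, assuming PyTorch and Hugging Face transformers with a small GPT-2 checkpoint (my own illustration, not the setup of any linked paper): embed the tokens yourself, backprop a prediction score down to the embeddings, then multiply gradient by input.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tokenizer("The keys to the cabinet", return_tensors="pt").input_ids

# Embed the tokens ourselves so we can take gradients w.r.t. the embeddings.
embeds = model.transformer.wte(ids).detach().requires_grad_(True)
logits = model(inputs_embeds=embeds).logits

# Backprop the score of the model's top next-word prediction.
next_id = logits[0, -1].argmax()
logits[0, -1, next_id].backward()

# Grad*input: elementwise product, summed over the embedding dimension.
saliency = (embeds.grad * embeds).sum(-1).squeeze(0)
for tok, s in zip(tokenizer.convert_ids_to_tokens(ids[0].tolist()), saliency.tolist()):
    print(f"{tok:>12s} {s:+.4f}")
```

Integrated Gradients replaces the single gradient with an average of gradients along a path from a baseline embedding to the actual one; libraries like Captum implement both methods.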
2- NLP language models (GPT2-XL especially -- rightmost in the graph) accurately predict neural responses in the human brain. The next-word prediction task robustly predicts neural scores (toy encoding-model sketch below). @IbanDlank @martin_schrimpf @ev_fedorenko
https://www.biorxiv.org/content/10.1101/2020.06.26.174482v1.full
This line of work investigating the human brain's "core language network" using fMRI is helping build hypotheses about what IS a language task and what is not -- e.g. GPT3 doing arithmetic goes beyond what the brain's language network is responsible for https://www.biorxiv.org/content/10.1101/696484v1.full
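A toy encoding-model sketch of what "predicts neural responses" means here, with synthetic data standing in for LM hidden states and fMRI recordings (the real pipeline in the paper differs):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_words, d_model, n_voxels = 500, 768, 50

# Stand-ins: per-word LM hidden states X and per-word voxel responses Y.
X = rng.normal(size=(n_words, d_model))
Y = X @ rng.normal(size=(d_model, n_voxels)) * 0.1 + rng.normal(size=(n_words, n_voxels))

# Fit a linear encoding model on one split, score it on held-out words.
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
Y_hat = Ridge(alpha=10.0).fit(X_tr, Y_tr).predict(X_te)

# The "neural score": mean per-voxel correlation of predicted vs. observed responses.
corrs = [np.corrcoef(Y_hat[:, v], Y_te[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean held-out voxel correlation: {np.mean(corrs):.3f}")
```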
3- @roger_p_levy shows another way of comparing language models against the human brain in reading comprehension: humans take longer to read unexpected words, and that reading time correlates with the language model's probability scores (surprisal sketch below)
https://cognitivesciencesociety.org/cogsci20/papers/0375/0375.pdf https://twitter.com/roger_p_levy/status/1329849700091092996
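A hedged sketch of the surprisal side of this comparison: per-word -log p under GPT-2, which is what gets correlated with reading times (model choice and tokenization details here are my assumptions, not the paper's exact setup):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tokenizer("The old man the boats", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits

# Surprisal of token t is -log p(token_t | tokens_<t); the first token has no context.
log_probs = logits.log_softmax(-1)
surprisal = -log_probs[0, :-1].gather(1, ids[0, 1:, None]).squeeze(1)
for tok, s in zip(tokenizer.convert_ids_to_tokens(ids[0, 1:].tolist()), surprisal.tolist()):
    print(f"{tok:>10s} {s:6.2f} nats")
```

Unexpected continuations ("man" as a verb in this garden-path sentence) get high surprisal, and high-surprisal words are the ones humans linger on.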
4- Causal graphs are slowly trickling in: an effort to empower NLP models with aspects of causal inference (see @yudapearl's The Book of Why). Toy do-operator example after the links.
https://www.aclweb.org/anthology/2020.emnlp-main.612.pdf
https://www.aclweb.org/anthology/2020.emnlp-main.173.pdf
https://www.aclweb.org/anthology/2020.emnlp-main.56.pdf
https://arxiv.org/pdf/2005.13407.pdf
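A toy illustration of the do-operator these papers build on (my example, not from any of them): a structural causal model where a confounder Z drives both X and Y, so the observational X-Y slope overstates the true effect, while intervening on X recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

def sample(do_x=None):
    z = rng.normal(size=n)                       # confounder (e.g., topic)
    x = z + rng.normal(size=n) if do_x is None else np.full(n, do_x)
    y = 2.0 * x + 3.0 * z + rng.normal(size=n)   # true causal effect of x on y is 2.0
    return x, y

# Observational regression slope is biased upward by the confounder Z (~3.5).
x, y = sample()
print("observational slope:", np.polyfit(x, y, 1)[0])

# do(X=x) cuts the Z -> X edge; contrasting two interventions recovers the effect (~2.0).
_, y0 = sample(do_x=0.0)
_, y1 = sample(do_x=1.0)
print("interventional effect:", y1.mean() - y0.mean())
```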