Favorite #NeurIPS2020 presentations and posters this year
PS: heavily biased by what I happened to catch and whom I happened to talk to
PPS: still catching up on talks, so the list is rather incomplete; I hope to grow it
PPPS: with contributions from @ml_collective members
[Talk] No. 1 has to go to the keynote talk by @isbellHFh, @mlittmancs et al. Simply brilliant
https://slideslive.com/38935825/you-cant-escape-hyperparameters-and-latent-variables-machine-learning-as-a-software-engineering-enterprise
[Talk] "Where is machine learning going" by @BachFrancis https://slideslive.com/38938273/where-is-machine-learning-going appeared at http://preregister.science
Spoiler: https://twitter.com/savvyRL/status/1337421773344702465?s=20
[Talk] "Incentives for researchers" by Yoshua Bengio https://slideslive.com/38938274/incentives-for-researchers
appeared at the same workshop
[Talk] "The importance of deconstruction" by Kilian Weinberger https://slideslive.com/38938218/the-importance-of-deconstruction
appeared at @MLRetrospective
[Talk] "Through the Eyes of Birds and Frogs" by @shakir_za
https://slideslive.com/38938216/through-the-eyes-of-birds-and-frogs
Appeared at @MLRetrospective
[Talk] "Pain and Machine Learning" by @shakir_za
https://slideslive.com/38938071/pain-and-machine-learning
Appeared at Biological and Artificial Reinforcement Learning workshop
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_9b8619251a19057cff70779273e95aa6.html
Is normalization indispensable for training deep neural networks? Nope. If you remove BN from a ResNet, training fails, but with some smart tricks, like adding a rescaling parameter to the residual connection, it works again
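A minimal sketch of the rescaling trick as I understood it from the poster (PyTorch; the block structure and zero-init choice here are my assumptions, not the paper's exact recipe):

```python
import torch
import torch.nn as nn

class NormFreeBasicBlock(nn.Module):
    """Residual block without BatchNorm. A learnable scalar rescales the
    residual branch, initialized at zero so each block starts close to
    the identity (a simplified sketch of the rescaling idea)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=True)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=True)
        self.relu = nn.ReLU()
        self.alpha = nn.Parameter(torch.zeros(1))  # rescaling parameter

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(x + self.alpha * out)
```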
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_a851bd0d418b13310dd1e5e3ac7318ab.html
Top-k training of GANs. Simple trick! For each batch of generated images from G, keep only the top-k samples with the highest D scores for backprop, zeroing out the gradients from the rest
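Roughly, the generator update looks like this (a sketch: the paper anneals k over training, and the softplus form below is just the common non-saturating G loss, not necessarily theirs):

```python
import torch
import torch.nn.functional as F

def topk_generator_loss(d_scores, k):
    """Compute the generator loss only on the k fake samples the
    discriminator scores highest; the rest contribute no gradient.
    d_scores: raw D logits on a batch of generated images."""
    topk_scores, _ = torch.topk(d_scores, k)
    # non-saturating loss on the surviving samples only
    return F.softplus(-topk_scores).mean()
```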
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_99f6a934a7cf277f2eaece8e3ce619b2.html
Instance Selection for GANs. Similar to above, but instead of selecting top generated samples, this one preprocesses the data (real images) so that only high-probability/density samples are used.
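A sketch of the idea, assuming embeddings from some pretrained feature extractor and a single-Gaussian density model (the paper studies specific embedders and density scores; these choices are placeholders):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def select_instances(embeddings, keep_frac=0.5):
    """Score each real image's embedding under a density model and keep
    only the highest-density fraction before GAN training."""
    gmm = GaussianMixture(n_components=1).fit(embeddings)
    scores = gmm.score_samples(embeddings)        # log-density per sample
    cutoff = np.quantile(scores, 1.0 - keep_frac)
    return np.where(scores >= cutoff)[0]          # indices to keep
```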
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_a1140a3d0df1c81e24ae954d935e8926.html
Randomly dropping layers in transformers and moving layer norm out of the residual connection helps
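The two ingredients in one sketch block (PyTorch; the dims, drop rate, and block details are my own placeholders, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class PreLNDropLayer(nn.Module):
    """Transformer block with (a) layer norm applied before each sublayer
    instead of after the residual add, and (b) the whole block
    stochastically skipped during training."""
    def __init__(self, d_model, n_heads, p_drop_layer=0.2):
        super().__init__()
        self.p = p_drop_layer
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))

    def forward(self, x):
        if self.training and torch.rand(()) < self.p:
            return x  # skip the entire layer this step
        h = self.ln1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.ff(self.ln2(x))
        return x
```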
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_f0682320ccbbb1f1fb1e795de5e5639a.html
Train-by-Reconnect. This is fun! Perhaps my favorite! Randomly initialize network weights; then all training does is shuffle their locations. Somehow it works!
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_0607f4c705595b911a4f3e7a127b44e0.html
What's being transferred in transfer learning? Turns out the downstream and upstream models live in the same local minimum, and you don't have to transfer from the last epoch
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_ee23e7ad9b473ad072d57aaa9b2a5222.html
"Meta-Learning through Hebbian Plasticity in Random Networks" uses Hebbian rules evolved through ES to train a network that ends up robust to significant weight perturbations. No gradients! @risi1979 @enasmel
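The per-synapse plasticity rule is simple enough to sketch in a few lines (NumPy; the rule form is the standard ABCD Hebbian update, and the coefficient shapes are approximate since the paper evolves one coefficient set per connection with ES):

```python
import numpy as np

def hebbian_update(W, pre, post, A, B, C, D, lr=0.01):
    """One ABCD Hebbian step: each synapse changes as a function of its
    pre- and post-synaptic activity; the coefficients are what evolution
    strategies optimizes, so the network itself is never backpropped."""
    dW = A * np.outer(post, pre) + B * pre[None, :] + C * post[:, None] + D
    return W + lr * dW
```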
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_5c528e25e1fdeaf9d8160dc24dbf4d60.html
Learning convolutions from scratch -- improves generalization! @bneyshabur https://twitter.com/bneyshabur/status/1287936315829313536?s=20
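The training rule, as I read the paper: an l1-regularized SGD step followed by a more aggressive hard-threshold controlled by beta, which lets dense layers sparsify toward local, convolution-like connectivity. A sketch with placeholder hyperparameters:

```python
import torch

@torch.no_grad()
def beta_lasso_step(w, grad, lr=0.1, lam=1e-6, beta=50.0):
    """One beta-LASSO update on a plain weight tensor: l1-regularized
    gradient step, then zero out weights whose magnitude falls below
    beta * lam. Hyperparameter values here are placeholders."""
    w = w - lr * (grad + lam * torch.sign(w))
    w[w.abs() < beta * lam] = 0.0
    return w
```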
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_d8330f857a17c53d217014ee776bfd50.html
"Measuring Robustness to Natural Distribution Shifts in Image Classification" How robust are models to non-synthetic perturbations? TLDR: not very robust
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_405075699f065e43581f27d67bb68478.html
Training is very nonlinear in the first few epochs, but very linear after that. After the first two epochs you can linearize training while maintaining accuracy (CIFAR) https://twitter.com/stanislavfort/status/1322246600320757760?s=20
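Concretely, "linearizing training" means replacing the network with its first-order Taylor expansion around some early-epoch weights w0 and training that model instead. A sketch with torch.func (w0/w are assumed parameter dicts; the logic for when to switch over is omitted):

```python
import torch
from torch.func import functional_call, jvp

def linearized_output(model, w0, w, x):
    """First-order Taylor model around w0:
    f_lin(w) = f(w0) + J(w0) @ (w - w0), computed via a jvp."""
    f = lambda params: functional_call(model, params, (x,))
    delta = {k: w[k] - w0[k] for k in w0}
    out, jvp_out = jvp(f, (w0,), (delta,))
    return out + jvp_out
```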
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_08425b881bcde94a383cd258cea331be.html
SGD is greedy and gives non-diverse solutions. Instead, by branching off at saddle points and following eigenvectors of the Hessian with negative eigenvalues, you can find diverse solutions https://twitter.com/j_foerst/status/1335288274273832966?s=20
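The core move, in a dense-Hessian toy form (only feasible for tiny parameter vectors; the paper scales this with Hessian-vector products and follows each direction over many updates):

```python
import torch
from torch.autograd.functional import hessian

def ridge_step(loss_fn, w, step=0.1):
    """At a saddle point, find the Hessian eigenvector with the most
    negative eigenvalue and step along it; following different such
    directions yields diverse solutions. w is a small 1-D tensor."""
    H = hessian(loss_fn, w)
    eigvals, eigvecs = torch.linalg.eigh(H)  # ascending eigenvalues
    v = eigvecs[:, 0]                        # most negative direction
    return w + step * v
```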
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_ad1f8bb9b51f023cdc80cf94bb615aa9.html
We already know that networks with random weights contain subnetworks that perform well. But did you know they can solve thousands of tasks continually without even knowing the task ID? https://twitter.com/Mitchnw/status/1278711255977492482?s=20
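The underlying supermask trick in sketch form: weights stay at their random init and only per-weight scores are learned, with a hard top-k mask in the forward pass and a straight-through gradient to the scores (layer shape and keep fraction here are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SupermaskLinear(nn.Module):
    """Linear layer with frozen random weights; training learns only a
    score per weight and keeps the top fraction."""
    def __init__(self, d_in, d_out, keep_frac=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in), requires_grad=False)
        self.scores = nn.Parameter(torch.randn(d_out, d_in))
        self.keep_frac = keep_frac

    def forward(self, x):
        k = int(self.scores.numel() * self.keep_frac)
        # threshold = k-th largest score; keep everything at or above it
        thresh = self.scores.flatten().kthvalue(self.scores.numel() - k + 1).values
        mask = (self.scores >= thresh).float()
        # straight-through: forward uses the hard mask, backward reaches scores
        mask = mask + self.scores - self.scores.detach()
        return F.linear(x, self.weight * mask)
```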

[Tutorial]
https://slideslive.com/38935810/deep-implicit-layers-neural-odes-equilibrium-models-and-beyond
Deep Implicit Layers. Very cool! https://twitter.com/zicokolter/status/1335961905429680130?s=20
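If you want the one-screen version of what an implicit layer is: the layer's output is defined as the solution of an equation rather than by a fixed stack of operations. A naive fixed-point sketch (the tutorial covers proper solvers and implicit-function-theorem gradients; this just iterates to convergence):

```python
import torch
import torch.nn as nn

class FixedPointLayer(nn.Module):
    """Implicit layer: output z* satisfies z* = tanh(W z* + x),
    solved here by plain forward iteration."""
    def __init__(self, d, n_iters=50):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d, d) * 0.1)
        self.n_iters = n_iters

    def forward(self, x):
        z = torch.zeros_like(x)
        for _ in range(self.n_iters):
            z = torch.tanh(z @ self.W.T + x)
        return z
```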