Favorite #NeurIPS2020 presentations and posters this year

PS: heavily biased by what I happened to catch and whom I happened to talk to
PPS: still catching up on talks, so the list is rather incomplete; I hope to grow it
PPPS: with contributions from @ml_collective members
[Talk] "Pain and Machine Learning" by @shakir_za
Appeared at Biological and Artificial Reinforcement Learning workshop
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_9b8619251a19057cff70779273e95aa6.html Is normalization indispensable for training deep neural network? Nope. If you remove BN from ResNet, training fails, but with some smart tricks, like adding a rescaling parameter to the residual connection, it's back to working
[Paper/Poster with tl;dr]
Top-k training of GANs. Simple trick! For each batch of generated images from G, keep only the k samples with the highest D scores for backprop, zeroing out gradients from the rest
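A minimal sketch of that masking step, assuming you already have one D score and one generator loss per sample. The names `top_k_mask` and `masked_generator_loss` are made up for illustration and are not from the paper. In an autodiff framework, multiplying the per-sample loss by a 0/1 mask is what zeroes the gradients of the dropped samples.

```python
def top_k_mask(d_scores, k):
    """Return a 0/1 mask marking the k samples with the highest D score."""
    if not 0 < k <= len(d_scores):
        raise ValueError("k must be between 1 and len(d_scores)")
    # Indices ranked by score, highest first.
    ranked = sorted(range(len(d_scores)), key=lambda i: d_scores[i], reverse=True)
    keep = set(ranked[:k])
    return [1.0 if i in keep else 0.0 for i in range(len(d_scores))]


def masked_generator_loss(per_sample_losses, d_scores, k):
    """Average the generator loss over the top-k samples only."""
    mask = top_k_mask(d_scores, k)
    # Masked-out samples contribute 0 to the loss, hence 0 gradient.
    return sum(m * l for m, l in zip(mask, per_sample_losses)) / k


# Hypothetical batch of 4 generated samples.
scores = [0.1, 0.9, 0.4, 0.7]
losses = [4.0, 1.0, 3.0, 2.0]
mask = top_k_mask(scores, 2)
loss = masked_generator_loss(losses, scores, 2)
```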
[Paper/Poster with tl;dr]
Instance Selection for GANs. Similar to above, but instead of selecting the top generated samples, this one preprocesses the data (real images) so that only high probability/density ones are used.
[Paper/Poster with tl;dr]
Randomly dropping layers in a transformer and moving layer norm out of the residual connection helps
[Paper/Poster with tl;dr]
Train-by-Reconnect. This is fun! Perhaps my favorite! Randomly initialize network weights, then all training does is shuffle their locations. Somehow works!
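A toy sketch of the "training = shuffling" idea, not the paper's algorithm: a greedy random-swap search on a tiny linear model. The functions `loss` and `train_by_swapping` and the data are all invented for illustration. The point is only that the weight values never change, just their positions.

```python
import random


def loss(weights, data):
    """Mean squared error of the linear model y = w . x."""
    return sum(
        (sum(w * x for w, x in zip(weights, xs)) - y) ** 2 for xs, y in data
    ) / len(data)


def train_by_swapping(weights, data, steps=200, seed=0):
    """Greedy search over permutations: keep a swap only if the loss drops."""
    rng = random.Random(seed)
    w = list(weights)
    best = loss(w, data)
    for _ in range(steps):
        i, j = rng.sample(range(len(w)), 2)
        w[i], w[j] = w[j], w[i]
        new = loss(w, data)
        if new < best:
            best = new
        else:
            w[i], w[j] = w[j], w[i]  # undo the swap
    return w, best


# Hypothetical setup: the target weights are a permutation of the initial ones.
initial = [3.0, -1.0, 0.5, 2.0]
target = [0.5, 2.0, 3.0, -1.0]
inputs = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1), (1, 1, 1, 1)]
data = [(xs, sum(t * x for t, x in zip(target, xs))) for xs in inputs]
final, final_loss = train_by_swapping(initial, data)
```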
[Paper/Poster with tl;dr]
What's being transferred in transfer learning? Turns out the downstream and upstream models live in the same local minimum, and you don't have to transfer from the last epoch
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_ee23e7ad9b473ad072d57aaa9b2a5222.html "Meta-Learning through Hebbian Plasticity in Random Networks" uses a Hebbian rule evolved through ES to train a network that ends up being robust to significant weight perturbations! No gradients! @risi1979 @enasmel
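A rough sketch of what a gradient-free Hebbian weight update can look like. The exact rule here (per-connection coefficients eta, A, B, C, D) is an assumption for illustration, and `hebbian_update` is a made-up name. In the setup described above, the coefficients would be what ES evolves, which is not shown.

```python
def hebbian_update(w, pre, post, coeffs):
    """One Hebbian step on a weight matrix.

    w[i][j] connects pre[i] to post[j]; coeffs[i][j] = (eta, A, B, C, D).
    Assumed rule: dw = eta * (A*pre*post + B*pre + C*post + D).
    """
    new_w = []
    for i, row in enumerate(w):
        new_row = []
        for j, w_ij in enumerate(row):
            eta, A, B, C, D = coeffs[i][j]
            dw = eta * (A * pre[i] * post[j] + B * pre[i] + C * post[j] + D)
            new_row.append(w_ij + dw)
        new_w.append(new_row)
    return new_w


# Hypothetical single connection with a pure correlation term (A=1).
w0 = [[0.0]]
w1 = hebbian_update(w0, pre=[2.0], post=[3.0], coeffs=[[(0.1, 1.0, 0.0, 0.0, 0.0)]])
```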
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_d8330f857a17c53d217014ee776bfd50.html "Measuring Robustness to Natural Distribution Shifts in Image Classification" How robust are models to non-synthetic perturbations? TLDR: not very robust
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_405075699f065e43581f27d67bb68478.html Training is very nonlinear in the first few epochs, but after that very linear. After the first two epochs you can linearize training while maintaining accuracy (CIFAR) https://twitter.com/stanislavfort/status/1322246600320757760?s=20
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_08425b881bcde94a383cd258cea331be.html SGD is greedy and gives non-diverse solutions. Instead, by branching off at saddle points and following the Hessian's eigenvectors with negative eigenvalues, you can find diverse solutions https://twitter.com/j_foerst/status/1335288274273832966?s=20
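A toy 2D illustration of branching at a saddle, far simpler than the method in the paper. The function f(x, y) = x^2 - y^2 + y^4/4 and all helper names are assumptions chosen for the example. It has a saddle at the origin with Hessian [[2, 0], [0, -2]], so stepping both ways along the negative-eigenvalue eigenvector and then descending lands in two different minima.

```python
import math


def min_eig_2x2(a, b, c):
    """Smallest eigenvalue and a unit eigenvector of [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    lam = mean - radius
    if b != 0.0:
        v = (b, lam - a)
    elif a <= c:
        v = (1.0, 0.0)
    else:
        v = (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)


def grad(p):
    """Gradient of the toy function f(x, y) = x^2 - y^2 + y^4 / 4."""
    x, y = p
    return (2.0 * x, -2.0 * y + y ** 3)


def descend(p, lr=0.1, steps=200):
    """Plain gradient descent from point p."""
    for _ in range(steps):
        g = grad(p)
        p = (p[0] - lr * g[0], p[1] - lr * g[1])
    return p


def branch_from_saddle(saddle, hessian, step=0.1):
    """Step both ways along the most negative curvature, then descend."""
    lam, v = min_eig_2x2(*hessian)
    if lam >= 0.0:
        return [descend(saddle)]  # no negative curvature, nothing to branch on
    starts = [
        (saddle[0] + s * step * v[0], saddle[1] + s * step * v[1])
        for s in (1.0, -1.0)
    ]
    return [descend(p) for p in starts]


# Hessian at the origin given as (a, b, c) for [[a, b], [b, c]].
solutions = branch_from_saddle((0.0, 0.0), (2.0, 0.0, -2.0))
```

Plain gradient descent started exactly at the origin would stay there, since the gradient is zero at the saddle.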
[Paper/Poster with tl;dr]
https://neurips.cc/virtual/2020/protected/poster_ad1f8bb9b51f023cdc80cf94bb615aa9.html ❤️ We already know that a network with random weights contains subnetworks that perform well. But did you know that they can solve thousands of tasks continually w/o even knowing the task ID? https://twitter.com/Mitchnw/status/1278711255977492482?s=20
You can follow @savvyRL.