Excited to share a new paper, Curve Circuits
We reverse engineer a non-trivial 50k+ parameter learned algorithm from the weights of a neural network and use its core ideas to craft an artificial artificial neural network from scratch that reimplements it
https://distill.pub/2020/circuits/curve-circuits/
I'm excited about this because no one expected Circuits — studying weights directly — to scale. But by leveraging neuron families and circuit motifs, it might.
50k weights is a far cry from GPT-3's 175B weights, but I think there's a chance the basic formula is right
I'm grateful to all my co-authors, but particularly to my mentor @ch402
When he was first playing with the ideas that would lead to Circuits I worried he was thinking too small. But now I see I was