New paper: Foundations of a Fast, Data-Driven, Machine-Learned Simulator
https://arxiv.org/abs/2101.08944
with a fun BIG IDEA: instead of using ML to *mimic* slow detector sim, use ML on *real data* to learn the *real* detector. Then maybe we can *replace* slow simulators!
(Led by my amazing student Jessica Howard, with UCI ML folks.)
Most fast simulations these days use GANs, which can learn to mimic the output of an expensive simulation run. But what if you want to simulate something else? You have to first run the slow simulator.
Why can’t we just learn how to simulate the detector… from the real detector? Because we don’t know what the *truth* is — we can’t observe the latent data about the real particles from the collision. So supervised learning is impossible.
But there are places where we know the theory, so we don’t observe the particles directly but we know their distributions very well. Can we do some kind of unsupervised learning, to deduce how the detector transforms things?
Jessica realized that Variational Autoencoders (VAEs) are almost the right tool. They learn to map from observed X to latent Z and back. Her idea was to make Z the *physical latent space*, the unobserved particles.
This required using a new kind of auto-encoder that can handle this setup, a Sliced Wasserstein Autoencoder, inspired by optimal transport theory, and trained in an *unsupervised manner*.
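(For the curious: here is a rough sketch of the sliced Wasserstein distance that gives this kind of autoencoder its name. It is not the paper's code. The function name, the number of projections, and the equal-sample-size assumption are all illustrative choices. The trick is that optimal transport in 1D is just sorting, so you project both samples onto random directions and compare the sorted values.)

```python
import numpy as np

def sliced_wasserstein_distance(a, b, n_projections=128, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    Illustrative sketch only: assumes a and b are arrays of shape (n, d)
    with the same number of samples n.
    """
    rng = np.random.default_rng(seed)
    d = a.shape[1]
    # Draw random directions and normalize them to unit length.
    directions = rng.normal(size=(n_projections, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Project both samples onto every direction: shape (n, n_projections).
    proj_a = a @ directions.T
    proj_b = b @ directions.T
    # In 1D the optimal transport plan pairs up sorted samples.
    proj_a = np.sort(proj_a, axis=0)
    proj_b = np.sort(proj_b, axis=0)
    # Average squared displacement over samples and directions.
    return float(np.mean((proj_a - proj_b) ** 2))
```

(In a setup like the one described here, a loss of this type could compare the encoder's output with samples from the known theory distribution, without ever needing event-by-event truth labels.)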
(As a bonus, you learn the mapping X to Z, useful for unfolding)
The idea is to learn the detector transformation in control regions, and extrapolate. Just like the slow simulation is tuned in data control regions.
Performance isn’t yet perfect, but this paper establishes the foundation of a potentially new approach to simulation.
And it should have applications to *any* field that uses simulations to model how latent data get transformed through observation.