This is only going to appeal to a niche audience, but here we go.
When you fit a simulated (sampled) model using ML estimation, do you ever encounter log(0), and then replace that by log(some small number) so that your code runs?
I used to do that all the time. Bad idea!! 1/
A few years ago, Bas van Opheusden explained to me that this inevitably leads to biases, because log(true probability) can go all the way to negative infinity, whereas log(small number) cannot.
And this is just one symptom. Even if you don't run into log(0) issues,
2/
the log(proportion of samples) estimator ("fixed sampling") is biased no matter what.
To fix the issue, Bas reinvented what turned out to be an old method by the Dutch statistician De Groot (1959), https://projecteuclid.org/euclid.aoms/1177706361.
3/
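A minimal sketch of the fixed-sampling problem, assuming a toy model in which the observed outcome has true probability 0.05 and we draw 20 samples per estimate. The names (`fixed_sampling_loglik`, `mean_estimate`, `floor`) and all numbers are made up for illustration and are not taken from the paper or its code.

```python
import math
import random

def fixed_sampling_loglik(p, n_samples, floor, rng):
    """Estimate log p as log(fraction of matching samples).

    If no sample matches, the fraction is 0, so we clip it to `floor`
    (the "log(some small number)" trick) to keep the code running.
    """
    hits = sum(rng.random() < p for _ in range(n_samples))
    return math.log(max(hits / n_samples, floor))

def mean_estimate(p, n_samples, floor, n_repeats=20000, seed=0):
    """Average the estimator over many repeats to expose its bias."""
    rng = random.Random(seed)
    total = sum(fixed_sampling_loglik(p, n_samples, floor, rng)
                for _ in range(n_repeats))
    return total / n_repeats

p_true = 0.05                      # hypothetical true probability
true_loglik = math.log(p_true)

# Same simulated samples (same seed), two different arbitrary floors.
mean_floor_small = mean_estimate(p_true, 20, floor=1e-6)
mean_floor_large = mean_estimate(p_true, 20, floor=1e-3)

print("true log-likelihood:   ", true_loglik)
print("mean with floor = 1e-6:", mean_floor_small)
print("mean with floor = 1e-3:", mean_floor_large)
```

In this toy setting both averages should land well below the true value, and they differ from each other only because of the arbitrary floor, which is the kind of bias the thread is warning about.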
Basic idea: you sample UNTIL you obtain the observed outcome, and use HOW LONG IT TOOK YOU, plugging that into an equation. This gives a uniformly unbiased estimate of the log likelihood.
4/
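A minimal sketch of the sample-until-you-hit idea, assuming the standard inverse binomial sampling formula: if the first match occurs on draw K, the estimate is -(1 + 1/2 + ... + 1/(K-1)), which is 0 when the very first draw matches. The toy model, `simulate_toy_model`, and `ibs_loglik` are hypothetical names for illustration; this is not the code from the repository.

```python
import math
import random

def ibs_loglik(simulate, observed, rng):
    """Sample until the simulated outcome equals the observed one.

    Returns -(1 + 1/2 + ... + 1/(K-1)), where K is the number of draws
    needed, including the final matching draw.
    """
    misses = 0
    estimate = 0.0
    while simulate(rng) != observed:
        misses += 1
        estimate -= 1.0 / misses
    return estimate

p_true = 0.05  # hypothetical probability of the observed outcome

def simulate_toy_model(rng):
    """Toy stand-in for a simulator: returns 1 with probability p_true."""
    return 1 if rng.random() < p_true else 0

rng = random.Random(0)
estimates = [ibs_loglik(simulate_toy_model, 1, rng) for _ in range(20000)]
mean_estimate = sum(estimates) / len(estimates)

print("true log-likelihood:", math.log(p_true))
print("mean IBS estimate:  ", mean_estimate)
```

Each single estimate is noisy, but the average should sit close to log(0.05), with no floor or clipping needed because the loop always ends on a match.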
Bas worked with @AcerbiLuigi on computational improvements to reduce variance for a given runtime, interfacing with parameter optimization, and applications.
The paper just appeared in @PLOSCompBiol: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008483
5/
Highlights:
- Critique of the fixed sampling method: page 9 and Fig 2A.
- Five reasons why variance is bad but bias is much worse: page 14
- Applications to orientation discrimination, change localization, and four-in-a-row: pages 17-23
6/
Finally: none of this applies if you have an analytical expression for the log likelihood, but sadly, most interesting computational models are not of that type.
Code on https://github.com/basvanopheusden/ibs-development
Hope this is useful!
7/end
Correct link to paper: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008483