Been knocked off twitter for weeks by a COVID project - so I'm super happy to tweet about science and why I love my (non-pandemic) job! This is a story about learning new things, getting immersed in details, and making new friends. You know, a story about being in science! 1/18
Last year, I wrote some code to solve a computational problem with genomic data sets - we found Fu's Fs was a potentially super useful population genetics statistic for understanding *why* an outbreak of infections occurred (maybe kind of relevant now too) 2/18
The main issue was that Fu's Fs relies on numbers called Stirling numbers of the first kind. Stirling numbers in turn involve combinatorial quantities like factorials, which rapidly become too big for even modern computers (171! is too big for a standard double float). 3/18
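The overflow mentioned above is easy to check directly. Here is a minimal Python sketch (not code from the paper or the github project; the helper names `factorial_fits_in_double` and `log_factorial` are made up for illustration). It shows that 170! still fits in a double while 171! does not, and that the logarithm of the factorial stays small enough to work with:

```python
import math
import sys


def factorial_fits_in_double(n):
    """Return True if n! can be converted to a standard double float."""
    try:
        float(math.factorial(n))  # exact integer factorial, then convert
        return True
    except OverflowError:
        return False


def log_factorial(n):
    """Return ln(n!) using the identity ln(n!) = lgamma(n + 1)."""
    return math.lgamma(n + 1)


if __name__ == "__main__":
    print("170! fits in a double:", factorial_fits_in_double(170))
    print("171! fits in a double:", factorial_fits_in_double(171))
    # ln(171!) is a modest number, even though 171! itself overflows.
    print("ln(171!) =", log_factorial(171))
    print("ln(largest double) =", math.log(sys.float_info.max))
```

Keeping everything as logarithms is the general trick: the huge numbers never have to be stored, only their logs.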
Fortunately, there was a logarithmic method published in 1993 to calculate these huge numbers! This solved the problem, and I published a paper and a github project. https://twitter.com/swaine_chen/status/1075819655397281792?s=20
4/18
It was really fun to revisit both programming concepts (numerical underflow and overflow) and some math (limits, series, and (!) Green's theorem). But this is just the prologue. 5/18
The author of the 1993 Stirling number estimator is Nico Temme, a (now emeritus) mathematician at CWI Amsterdam ( https://homepages.cwi.nl/~nicot/ ). He saw my paper citing his method. And he emailed me. 6/18
This was pretty flattering already - but then he started asking questions about how I used his method. And within a few emails, he suggested that he could dramatically improve on what I'd done. 7/18
The basic idea: I was calculating a sum of terms, estimating a Stirling number for each term. Well...one of Nico's research areas was developing numerical estimators for sums of series. So he suggested calculating Fu's Fs (a sum of Stirling nums) using a single calculation! 8/18
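To make the "sum of terms" concrete, here is a rough Python sketch of the term-by-term approach, done in log space. It is not Nico's 1993 estimator and not the single-calculation method: it swaps in the exact textbook recurrence for unsigned Stirling numbers of the first kind, which is simple but slow (quadratic in n). It also assumes the usual definition from Fu (1997), where S' is the sum over k >= k_obs of |s(n,k)| * theta^k / (theta * (theta+1) * ... * (theta+n-1)), and Fs = ln(S' / (1 - S')). All function names are hypothetical.

```python
import math

NEG_INF = float("-inf")


def logaddexp(a, b):
    """Return ln(exp(a) + exp(b)) without overflow or underflow."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def log_stirling1_row(n):
    """Return [ln|s(n, k)| for k = 0..n] via |s(m+1,k)| = m*|s(m,k)| + |s(m,k-1)|."""
    row = [0.0]  # |s(0, 0)| = 1
    for m in range(n):
        new = []
        for k in range(m + 2):
            a = row[k] + math.log(m) if (k <= m and m > 0) else NEG_INF
            b = row[k - 1] if k >= 1 else NEG_INF
            new.append(logaddexp(a, b))
        row = new
    return row


def log_terms(n, theta):
    """Return ln of each term |s(n,k)| * theta^k / S_n(theta), k = 0..n."""
    log_s = log_stirling1_row(n)
    # ln of the rising factorial theta * (theta + 1) * ... * (theta + n - 1)
    log_norm = math.lgamma(theta + n) - math.lgamma(theta)
    return [ls + k * math.log(theta) - log_norm for k, ls in enumerate(log_s)]


def fu_fs(n, k_obs, theta):
    """Fs = ln(S' / (1 - S')), with both sums accumulated in log space."""
    terms = log_terms(n, theta)
    log_upper = NEG_INF  # ln S'
    for t in terms[k_obs:]:
        log_upper = logaddexp(log_upper, t)
    log_lower = NEG_INF  # ln(1 - S'), summed directly instead of subtracting
    for t in terms[:k_obs]:
        log_lower = logaddexp(log_lower, t)
    return log_upper - log_lower


if __name__ == "__main__":
    print(fu_fs(3, 2, 1.0))
    print(fu_fs(500, 50, 10.0))
```

Summing the complementary terms separately avoids computing 1 - S' by subtraction, which loses precision when S' is very close to 0 or 1. Replacing the whole loop over k with one calculation is the part this sketch does not attempt.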
Thus began a collaboration. He sent me a draft manuscript within weeks, in which most of the math had already been worked out. And it speeds things up by another 10-100x or more, depending on how big your data set is. 9/18
We just submitted the revisions for the paper. It's the most (only?) math I've done in decades. And it's been a pleasure to work in detail on a problem, where Nico and I have different expertise, and I am being stretched a lot intellectually in a different way. 10/18
It's online in the usual places - preprint on bioRxiv ( https://doi.org/10.1101/2020.03.12.989392 ) and code on github ( https://github.com/swainechen/hfufs ) 11/18
Fu's Fs, and maybe some other existing or yet-to-be-devised population genetics statistics, may end up being groundbreaking in our ability to use genomics to provide rapid, definitive answers to infection, cancer, and other big problems of our times...*may*, and in the future 12/18
However, the speedup and the new math, to be honest, aren't going to register in today's pageantry of (advertised) hyper-impact. But...it is new work and new math and it was fun! 13/18
So, this story is about the reason many of us do science, which is in sharper focus for me these last few months. It's fun. It's fun to learn. It's fun to learn new things. Even if it's not COVID, or won't make a billion dollars, or get you a startup, or win you an award. 14/18
We do this because science, education, learning, curiosity, and *knowledge* have inherent value. This value is not referential to economics or politics. There may be some referential value, but knowledge and its pursuit have *inherent* value independent of that. 15/18
This is one of the *real* reasons many of us do science; at least, it is for me, and I suspect it's true for many others. This transcends the episodes of gloomy funding, the snipey politics, and the common times of being generally misunderstood. 16/18
We should remember and advocate for the inherent value of science and scientists, which is not (and should not be) primarily defined by economics, politics, or anything not science. To say that science has value because of these peripheral benefits, I think, cheapens us and our field. 17/18
I love doing science and I know a lot of other people do too - please remember why you are doing this when the vagaries of practicalities get you down! Solving a problem, deriving true understanding, and doing good science - these are seen by many of us! 18/18