Tomorrow is a talk by the talented Paul Magwene. He is one of the deepest thinkers among my former trainees.
@lpachter kindly pointed out that Paul, Paul Lizardi, and I conceptualized pseudotime, and I want to use this to talk about method dev in compbio
1/n https://twitter.com/blkwomencompbio/status/1323307261616336898
In the early years of functional genomics, Spellman et al. was a landmark dataset for studying the transcriptome. https://www.molbiolcell.org/doi/full/10.1091/mbc.9.12.3273
My former colleague Paul Lizardi, who was interested in cancer progression, asked me about the possibility of using the transcriptome for time ordering
2/n
Paul L. is an incredibly creative biologist who invented rolling circle amplification, molecular beacons, methylome for long-read assembly, the universal microarray, etc. So, if anybody is to be given credit for conceptualizing time reconstruction from func genomics data, it should be Paul L.
3/n
When Paul M. and I started thinking about the problem, our first thought was to use Hastie's principal curves, which I didn't know about at the time and which Paul taught me. But given the sparsity of the data (~18 time points), it didn't seem reasonable.
http://web.stanford.edu/~hastie/Papers/Principal_Curves.pdf
4/n
Additional inspirations came from:
Traveling Salesman Problem curve reconstruction
Nina Amenta et al.'s work on combinatorial shape
Tenenbaum et al.'s nonlinear dimensionality reduction
https://dl.acm.org/doi/abs/10.5555/338219.338627
https://escholarship.org/uc/item/8pb179vt
https://science.sciencemag.org/content/290/5500/2319.full
5/n
Given the noise and sampling density, we settled on a tree-graph model with the idea of a diameter path. Then Paul and I discussed the modifications we needed to deal with high curvature (like cycling genes). He came up with the key PQ data structure to order the possible paths.
6/n
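For intuition, here is a minimal sketch of the tree-graph/diameter-path idea (not the paper's implementation): build a minimum spanning tree over samples in expression space, then read the tree's longest path (its diameter) as a candidate time ordering. Euclidean distance, the SciPy routines, and all function/variable names are my illustrative choices.

```python
# Sketch: MST over samples, then the tree's diameter path as a rough ordering.
# Assumptions (not from the paper): Euclidean distance, SciPy csgraph routines.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import pdist, squareform

def diameter_path_order(expr):
    """expr: (n_samples, n_genes) array. Returns sample indices along the
    MST's diameter path, i.e. a candidate time ordering."""
    d = squareform(pdist(expr))               # pairwise Euclidean distances
    mst = minimum_spanning_tree(d)            # sparse MST of the sample graph
    # distances along the tree (edges treated as undirected), with predecessors
    tree_d, pred = shortest_path(mst, directed=False, return_predecessors=True)
    # diameter endpoints: the pair of samples farthest apart *on the tree*
    i, j = np.unravel_index(np.argmax(tree_d), tree_d.shape)
    # walk predecessors back from j to i to recover the path
    path = [int(j)]
    while path[-1] != i:
        path.append(int(pred[i, path[-1]]))
    return path[::-1]

# toy example: ~18 samples along a noisy 1-D trajectory in "expression" space
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 18)
expr = np.c_[np.sin(2 * t), np.cos(2 * t)] + rng.normal(0, 0.01, (18, 2))
order = diameter_path_order(expr)
```

On clean data like this toy, the MST is a simple chain and the diameter path recovers the sampling order (up to direction); the PQ structure and delta-shortest criterion in the thread address the harder cases where the tree branches.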
And then we added a delta-shortest criterion to resolve the possible paths. We published our paper in 2003 (below), but it didn't receive much attention until, fortunately, C. Trapnell adopted the ideas into Monocle for single-cell analysis.
7/n https://academic.oup.com/bioinformatics/article/19/7/842/197339
So, my view on the keys to comp methods:
(1) breadth of knowledge of methods/theories (Paul had vast quant breadth);
(2) depth of technical knowledge to engineer a fit to the problem at hand (again, Paul);
(3) numbers are numbers: almost always there is a preexisting body of knowledge.
8/n
New data types call for engineering adjustments, almost never de novo creation. Compbio people should be judged by their breadth and depth of knowledge, not the popularity of their methods. Lots of VHS out there. So, again, I strongly recommend checking out Paul's talk.
/end