Sean Hackett High-Dimensional Data Science of Genomics and Sports


Identifying Causal Regulation at a Genome-wide Scale

My research focuses on developing genome-scale, mechanistic models of metabolism and cellular physiology. I approach these problems using large multi-omics datasets that provide complementary views of a system’s state. Such datasets allow me to break network-level complexity into simpler sub-questions where novel connections among features can be discovered. To discover these connections, I use and extend techniques from statistics and machine learning.

A summary of my employment, education and research can be found in my curriculum vitae, while my resume provides a more succinct summary of my work history.

Data Science and Leadership at Calico Life Sciences

In 2017, I joined Calico Life Sciences, an Alphabet subsidiary focused on improving our understanding of aging and advancing the treatment of diseases of aging. As a Data Scientist, I have developed infrastructure for understanding aging at a systems level by profiling the molecular state of aging organisms and disentangling the causal inter-dependencies among molecular traits. To generate ever larger and more comprehensive profiles, I have worked closely with experimental researchers, particularly those working on mass spectrometry, to streamline and advance our internal workflows.

In January of 2018, my role expanded into managing a team of data scientists. On adopting this role, I received management training so that I could better understand my responsibilities as a manager and a leader in our organization. I soon put this training to work by contributing to a re-organization of the Computing team, and by representing my team to the executive team and in all-company formats. As a manager, I’ve hired six researchers into diverse, specialized roles, and helped them stay effective in a problem-rich environment. As a leader, I’ve helped individuals align their work with their personal goals and the company’s long-term goals, and initiated new directions of increasing strategic importance for the company.

Published papers from my time at Calico include:

  • Sean R. Hackett, Edward A. Baltz, Marc Coram, Bernd J. Wranik, Griffin Kim, Adam Baker, Minjie Fan, David G. Hendrickson, Marc Brendl, R. Scott McIsaac. Learning causal networks using inducible transcription factors and transcriptome-wide time series. Molecular Systems Biology, 16 (3), 2020.
  • Sam S. Schoenholz, Sean Hackett, Laura Deming, Eugene Melamud, Navdeep Jaitly, Fiona McAllister, Jonathon O’Brien, George Dahl, Bryson Bennett, Andrew Dai, Daphne Kohler. Peptide-spectrum matching from weak supervision. ArXiv.

PhD and Postdoc at Princeton

After finishing my PhD, I did a short postdoc with John Storey.

My research focused on:

  • Exploratory analysis of high-dimensional sports data: see
  • Modeling competitive growth of yeast strains that are mosaics of two parental backgrounds (BYxRM) using BarSeq.

Published papers from this work are:

  • Sean R. Hackett and John D. Storey. Mixed Membership Martial Arts: Data-Driven Analysis of Winning Martial Arts Styles. Proceedings of the Sloan Sports Conference, 2017.

I completed my PhD in the program of Quantitative and Computational Biology (QCB) at Princeton University, where I was advised by Josh Rabinowitz.

My research primarily focused on:

  • Integrating multiple types of high-throughput metabolic data (proteins, metabolites and fluxes) using biochemically motivated non-linear reaction equations. Through model comparison of these reaction equations, I was able to predict allosteric regulation. Subsequent experimental work verified three instances of novel regulation of major enzymes.
  • Identifying recurring metabolic changes in primary human tumors by statistical analysis of metabolomics data.
  • Understanding how protein levels are determined quantitatively through regulation at the transcriptional and post-transcriptional level.
  • Developing statistical methods for integrating mass spectrometry-based measurements from multiple peptides into protein-level summaries.

Published papers from this work are:

  • Sean R. Hackett, Vito R.T. Zanotelli, Wenxin Xu, Jonathan Goya, Junyoung O. Park, David H. Perlman, Patrick A. Gibney, David Botstein, John D. Storey, and Joshua D. Rabinowitz. Systems-level analysis of mechanisms controlling yeast metabolic flux. Science, 354, 2016.
  • Jurre Kamphorst, Michel Nofal, Cosimo Commisso, Sean R. Hackett, Wenyun Lu, Elda Grabocka, George Miller, Jeffrey Drebin, Matthew Vander Heiden, Dafna Bar-Sagi, Craig Thompson, Josh Rabinowitz. Human pancreatic cancer tumors are nutrient poor and the tumor cells actively scavenge extracellular protein. Cancer Research, 75, 2015.
  • Robin Mathew, Sinan Khor, Sean R. Hackett, Joshua D. Rabinowitz, David H. Perlman, and Eileen White. Functional Role of Autophagy-Mediated Proteome Remodeling in Cell Survival Signaling and Innate Immunity. Molecular Cell, 55(6), 2014.
  • Jeffrey S. Breunig, Sean R. Hackett, Joshua D. Rabinowitz, Leonid Kruglyak. Genetic Basis of Metabolome Variation in Yeast. PLoS Genetics, 10(4) e1004142, 2014.
  • Cosimo Commisso, Shawn M Davidson, Rengin G Soydaner-Azeloglu, Seth J Parker, Jurre J Kamphorst, Sean Hackett, Elda Grabocka, Michel Nofal, Jeffrey A Drebin, Craig B Thompson, Joshua D Rabinowitz, Christian M Metallo, Matthew G Vander Heiden, and Dafna Bar-Sagi. Macropinocytosis of protein is an amino acid supply route in Ras-transformed cells. Nature, 2013.

Undergraduate and post-baccalaureate research at Cornell

After graduating from Cornell, I worked as a research specialist in the laboratory of Andy Clark for four years. During that time I worked on a quantitative/population genetics project that aimed to determine how heterogeneous metabolic phenotypes are inter-related and how they depend on expression and genetic variation, using D. melanogaster as a model.

While this project is still ongoing, several papers have been published:

  • Jennifer K. Grenier, J. Roman Arguello, Margarida Cardoso Moreira, Srikanth Gottipati, Jaaved Mohammed, Sean R. Hackett, Rachel Boughton, Anthony J. Greenberg, and Andrew G. Clark. Global Diversity Lines - A five-continent reference panel of sequenced Drosophila melanogaster strains. G3, 5(4), 2015.
  • Anthony J Greenberg, Sean R. Hackett, Lawrence G Harshman, and Andrew G Clark. Environmental and genetic perturbations reveal different networks of metabolic regulation. Molecular Systems Biology, 7:563, 2011.
  • Anthony J Greenberg, Sean R. Hackett, Lawrence G Harshman, and Andrew G Clark. A Hierarchical Bayesian Model for a Novel Sparse Partial Diallel Crossing Design. Genetics, 185(1):361–373, June 2010.

During my undergraduate research with Teresa Gunn at Cornell, I investigated the genetic basis of cardiac arrhythmias in dogs:

  • S R Hackett, S W Jung, E Kirkness, J Cruickshank, K L Vikstrom, N S Moise, and T M Gunn. Identification and characterization of canine microsatellite markers in cardiac genes. Animal Genetics, 38(1):89–91, February 2007.
  • W Liu, S R Hackett, J Cruickshank, K L Vikstrom, N S Moise, and T M Gunn. Canine microsatellites associated with genes implicated in cardiac development and function. Animal Genetics, 37(1):87–88, February 2006.