Posted By: HGG Advances
Each month, the editors of Human Genetics and Genomics Advances interview an early-career researcher who has published work in the journal. This month we check in with Matt Bailey to discuss his paper “Intersection of rare pathogenic variants from TCGA in the All of Us Research Program v6.”
HGGA: What motivated you to start working on this project?
![Matthew Bailey, PhD](https://www.ashg.org/wp-content/uploads/2025/02/Photo-2025-Matt-Bailey.png)
MB: Being an early-stage investigator is challenging. You enter a new department with existing models of success and expectations of establishing an independent lab. However, having completed my graduate and post-graduate training in highly collaborative environments, e.g., working closely with analysis working groups in The Cancer Genome Atlas (TCGA), I questioned my place in my department and whether my collaborative science was meaningful enough to keep my job. So, I tried to work alone for the first couple of years to prove my “independence.” However, with a transition to an early-stage investigator at a teaching institution, I found myself overwhelmed with new challenges managing the messy middle of exceptional teaching and running a new research program. So, I returned to my strengths. I opened my mouth and tried to build internal collaborations. After a few meaningful interactions with Mary Davis, PhD, we discovered similar academic interests. We are both passionate about improving human health by making sense of large genomic datasets. With some ideas in mind, we scheduled a late-night hackathon with our undergraduate students to dive into the All of Us Workbench hosted on Google Cloud. I leveraged my experience with TCGA; she leveraged her phenotype and population analytics expertise. With exceptional undergraduate students, we provided some preliminary genomic assessments about rarer pathogenic variants in cancer in All of Us.
HGGA: What about this paper/project most excites you?
MB: Taking a bit from the manuscript, many institutions at NIH, including the National Human Genome Research Institute (NHGRI) and the National Cancer Institute (NCI), contributed millions of dollars and countless hours to build one of the most extensive cancer datasets the world had ever seen. It provided us with exciting insights and an unprecedented gaze into the complexity and heterogeneity of cancer. But do these findings translate to other datasets? Do the mutations we identified in TCGA impact other diseases? Can and should we use All of Us to answer cancer-related questions? How many samples will we need to make new discoveries, and what are those limits? This manuscript is our first attempt to address these questions using a relatively small subset of what will become a 1M genome dataset.
HGGA: What do you hope is the impact of this work on the human genetics community?
MB: I hope for a lot of things that I can’t yet see. I hope this work opens the door to more cancer researchers who want to use the All of Us genomics data. I hope our work lowers the barrier of entry for those who want to study cancer using this database (please ask, we’re happy to share our workflows). I hope this work upholds the All of Us Research Program’s mission to create an improved data-sharing model. I hope this work helps move genomics research out of the strong silos that exist in the human genetics research community. And, I hope this work inspires young scientists, like the undergraduates who participated in this work, to be brave and to do hard things.
HGGA: What are some of the biggest challenges you’ve faced as a young scientist?
MB: As a young graduate student, I struggled to balance my time between using skills I had already learned to address scientific questions and learning new skills to address them. This was especially true in bioinformatics and computer science. For example, I had learned the Perl scripting language before graduate school, but Python was rising in popularity. I knew I could solve a computational problem with Perl but felt pressured to finish tasks quickly. At the same time, I also thought I would be “left behind” if I didn’t pick up Python. I often felt bad for taking time to learn new skills because I wasn’t applying what I already knew to the problem. I had to learn, and I’m still learning, as a young scientist, that developing new skills is always worth the investment. I hope I never stop learning. I want to use my skills to solve our society’s challenging problems.
HGGA: And for fun, what is one of the most fascinating things in genetics you’ve learned about in the past year or so?
MB: This last year, a rotation student in my lab introduced me to the Ig Nobel Prize, a satirical, almost theatrical presentation of unconventional science that makes you think. While researchers don’t yet understand the underlying genetic mechanism, one award went to work by Felipe Yamashita on “Plants that Imitate Plastic.” The Boquila trifoliolata vine has been shown to mimic the leaves of host trees that it climbs. Yamashita’s group showed that the vine also mimics a plastic host or neighbor, suggesting the plant can “see.” This is wild to me and helps me appreciate the diversity of our planet and the central role death and genetic variation have played in creating the beauty we get to observe today.
Matthew Bailey, PhD, is an assistant professor in the Department of Biology at Brigham Young University.