Selfish Elements

The term parasite typically refers to an organism participating in a sustained inter-species relationship (symbiosis) in which this organism benefits at the cost of another. Parasites engaging in extremely asymmetric relationships are often bestowed the higher title of pathogen.

This asymmetry makes pathogens selfish, often so selfish that hosts are under strong evolutionary pressure to evolve defenses against the pathogen or risk population collapse. In response, the pathogen evolves to evade these new host defenses, giving rise to a conflict which rapidly generates genomic novelty and reveals the rules governing a stable ecology.

We endeavor to learn these rules for diverse pathogens ranging from large multicellular eukaryotes down to the smallest RNA viruses and establish conserved features. Relaxing our definition of organism, we may zoom in even further to consider this same conflict playing out within an individual genome among selfish genetic elements living in the blurry boundary of life. In turn, we believe we can begin to better understand the propagation of selfish elements in the complex nonliving systems that compose our public health infrastructure, from social media to government policy.

Comparative Genomics

The growth in the number of nucleotides in publicly available genome repositories beats Moore's Law. Together, advances in computer engineering and next-generation sequencing technologies enable computational groups like ours to study a vast diversity of organisms.

We have a special interest in viruses, and RNA viruses in particular. Almost all "priority pathogens" with recognized pandemic potential are viruses, and the majority of zoonotic mammalian viruses are RNA viruses. Viruses evolve fast, providing the opportunity to correlate genomic variation with environmental change in real time.

A typical workflow involves aligning homologous sequences, inferring phylogeny, and reconstructing ancestral sequences to identify mutations. We focus on understanding the biology rather than on software development, but we build our own tools as necessary.
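As a rough illustration of the last step of the workflow above, the sketch below applies the bottom-up pass of Fitch parsimony to a toy alignment. It is not the group's tooling: the tree encoding (nested 2-tuples), the function names `fitch_sets` and `min_substitutions`, and the example sequences are all assumptions made for illustration, and real analyses typically rely on likelihood-based methods over trees inferred from the data.

```python
def fitch_sets(tree, leaf_states):
    """Bottom-up pass of Fitch parsimony for a single alignment column.

    tree: nested 2-tuples of leaf names, e.g. (("A", "B"), ("C", "D")).
    leaf_states: dict mapping each leaf name to its observed character.
    Returns (candidate ancestral states at this node,
             minimum number of substitutions in the subtree).
    """
    if isinstance(tree, str):  # a leaf: its state is observed directly
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_sets(left, leaf_states)
    right_set, right_cost = fitch_sets(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra substitution.
        return shared, left_cost + right_cost
    # The children disagree: keep both options and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


def min_substitutions(tree, alignment):
    """Minimum substitutions per column for an aligned set of sequences."""
    length = len(next(iter(alignment.values())))
    return [
        fitch_sets(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    ]


# Hypothetical four-taxon tree and alignment, for illustration only.
toy_tree = (("A", "B"), ("C", "D"))
toy_alignment = {"A": "ACGT", "B": "ACGA", "C": "ATGA", "D": "ATGA"}
print(min_substitutions(toy_tree, toy_alignment))
```

Columns with a count of zero are conserved across the toy tree; columns with a positive count require at least that many substitutions somewhere along its branches.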

Mathematical Modelling

The other key element of our work is the construction of mathematical models to explain and, we hope, predict evolutionary dynamics. Evolution is inherently random, and the combinatorial space of all possible sequences long enough to encode even the smallest genomes is incomprehensibly vast. Yet under the same selective pressures, the same mutations are observed to emerge repeatedly.

One area of particular interest is epidemic modelling. Here, we evaluate the predicted effects of public health interventions, including their long-term impact on pathogen evolution. This most often begins with a system of ordinary differential equations (such as the SEIR model), which is reassessed if complex spatial or stochastic effects are substantial over the timescale of interest.
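For readers unfamiliar with compartmental models, a minimal sketch of a standard SEIR system follows, integrated with a simple forward Euler step. The parameter names (`beta` for transmission, `sigma` for the rate of leaving the exposed class, `gamma` for recovery) follow common textbook convention, and all numerical values are placeholders chosen for illustration, not estimates for any real pathogen.

```python
def seir_step(state, beta, sigma, gamma, dt):
    """Advance the SEIR system by one forward Euler step of size dt.

    dS/dt = -beta * S * I / N
    dE/dt =  beta * S * I / N - sigma * E
    dI/dt =  sigma * E - gamma * I
    dR/dt =  gamma * I
    """
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n
    ds = -new_exposed
    de = new_exposed - sigma * e
    di = sigma * e - gamma * i
    dr = gamma * i
    return (s + dt * ds, e + dt * de, i + dt * di, r + dt * dr)


def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days, dt=0.1):
    """Return the list of (S, E, I, R) states from day 0 to the final day."""
    state = (float(s0), float(e0), float(i0), float(r0))
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = seir_step(state, beta, sigma, gamma, dt)
        trajectory.append(state)
    return trajectory


# Placeholder values: a population of 1000 with 10 initial infections.
trajectory = simulate_seir(990, 0, 10, 0, beta=0.5, sigma=0.2, gamma=0.1, days=200)
peak_infectious = max(state[2] for state in trajectory)
print(f"peak infectious: {peak_infectious:.1f}")
print(f"final susceptible: {trajectory[-1][0]:.1f}")
```

An intervention can be explored in this sketch by lowering `beta` and comparing the resulting trajectories. Forward Euler is used here only to keep the example dependency-free; an adaptive solver would be the more usual choice in practice.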

We are very applied mathematicians: rather than specializing in a particular discipline, we seek to learn the techniques best suited to our biological interests. We often employ large-scale simulations relying on straightforward numerical methods, but are always excited to apply new analytical approaches.

Ongoing Projects

In the face of a phage, how optimistic should a bacterium be?

Bacteria maintain diverse immune machinery to protect against phage infection. If that fails, an infected individual may undergo programmed cell death (PCD), an altruistic behavior, reducing the probability that the infection will spread within the community. Building on our prior work focusing on understanding the features which govern the optimal strategy for somatic damage mitigation, we seek to understand how bacteria decide when to take one for the team.

How do you decide if two proteins are "the same"?

Modern sequence alignment tools can establish homology among highly divergent proteins. The resulting deep alignments may then be organized into clusters given an appropriate metric over the sequence space. Percent identity with respect to a reference is often used to designate just two clusters - "the same" and "different". What percent identity threshold represents functional divergence, however, varies based on the genomic and ecological context. Using information theory, we are building a pipeline to segment deep alignments into groups that constitute the most efficient (in bits) representation of the underlying sequence space. It is our hope that this threshold-free approach is able to reproduce manually curated designations of evolutionary divergence as well as predict new functional groups.
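To make the two quantities mentioned above concrete, the sketch below computes percent identity between two aligned sequences and the Shannon entropy (in bits) of a single alignment column. It is not the pipeline described above. The function names are hypothetical, and the choice to skip columns containing a gap when computing identity is one of several conventions in use and is an assumption here.

```python
import math
from collections import Counter


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of compared columns in which two aligned sequences match.

    Assumption: columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def column_entropy_bits(column):
    """Shannon entropy, in bits, of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy aligned fragments, for illustration only.
print(percent_identity("MKVL", "MKVI"))
print(column_entropy_bits("LLII"))
```

A fixed threshold on the first quantity yields the two-cluster "the same" versus "different" split; entropy-style measures instead describe how many bits are needed to represent the variation within a group of sequences.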

Should we be paying more attention to non-human cancer?

Addressing tumor evolution remains a major challenge for cancer treatment. Despite the availability of thousands of human cancer genomes through programs including TCGA and COSMIC, substantial patient heterogeneity makes predictions of individual responses to treatment noisy. Interactions between tumor mutations and germline variations remain poorly explored, and even well-characterized driver genes likely play additional, unknown roles in the process of metastasis. Cancer is also frequently observed in diverse non-human animals, from clams to Clydesdales, but tumors from most species are rarely sequenced. These data could dramatically improve our understanding of how germline variations in driver genes present in other species interact with tumor mutations. Motivated by our prior work, we expect the landscape of these tumor mutations to overlap with that of human tumors, even for distant relatives. We are building a database to collect publicly available tumor genomes across species and hope to identify several organisms for which the observed cancer incidence is high and the germline variations in driver genes will be informative for human health.

How do generative text responses differ from expert answers to clinical queries?

Generative text algorithms trained using much of the internet provide a mechanism to access technical information that can reduce the barrier to entry for non-experts. Clinical queries are of particular interest as effective generative responses may improve access to healthcare and reduce health disparities but ineffective responses may promote confusion and mistrust in the healthcare system. We are collecting an ensemble of questions and answers from the CDC and the NIH and posing these questions as prompts to popular generative text programs. Conserved differences in the structure of these answers will reveal limitations in generative text algorithms which may motivate new safeguards. Conversely, the generative responses may demonstrate useful features which may be incorporated into public health messaging.

How is climate change impacting viral evolution?

Virus ecology and evolution are climate dependent, and successful epidemic and pandemic prevention and response require the incorporation of climate variables into epidemiological models and biostatistical workflows. Free, robust remote sensing data is made available through NASA; however, data accessibility remains a challenge for the epidemiology and biostatistics community. As a part of the NIH Climate Change and Health Initiative, we are building a web portal to help policy makers and public health practitioners utilize this data. In parallel, we are clustering metaviromes based on climate zone classification to determine climate-sensitive trends in viral abundance and evolutionary selection pressures.

Team

Principal Investigator Nash Rochman is an Assistant Professor in the CUNY SPH Department of Epidemiology and Biostatistics; an Institute for Implementation Science in Population Health Investigator; an NIH Special Volunteer; and an editor at Biology Direct. Nash was drawn to a career in biology to help distil complex and confusing data into predictive models for disease. He completed his undergraduate education at Bard College at Simon's Rock and Brown University and pursued his PhD advised by Sean Sun in The Johns Hopkins University Cell Biomechanics Lab. Nash went on to pursue a postdoctoral fellowship centered on pandemic viral evolution advised by Eugene Koonin in the NIH Evolutionary Genomics Research Group. Prior to joining CUNY, Nash was a Principal Investigator at the NIH in the Independent Research Scholar Program. Nash splits his time between NYC and DC where he lives with his wife Anita. In many locations, Nash can be found playing jazz trumpet. email publications LinkedIn

Senior Research Associate Peter Vlasov has over 20 years of experience in computational biology. He completed his PhD in applied physics and mathematics at the Moscow Institute of Physics and Technology. Peter began his scientific career in the area of structural bioinformatics at the Institute of Molecular Biology (Russia). After spending several years applying these techniques in the private sector for the drug-design company Algodign LLC, Peter returned to publicly funded research within the Center for Genomic Regulation (Spain) and the Institute of Science and Technology (Austria). Peter's current research focuses on the development of computational methods in evolutionary and systems biology. In addition to his research, Peter has made advances in computational biology through his contributions to educational initiatives for both university students and children designed to expand and diversify the biomedical workforce. Peter currently resides in Barcelona with his family where he can be found lifting heavy things before heading out to a gallery opening. email publications LinkedIn

Predoctoral Scholar Dmitry Biba's primary interests are evolutionary genomics and population genetics, with a specific focus on microbial phylodynamics and molecular evolution. Dmitry is a visiting fellow at the NIH in the Evolutionary Genomics Research Group (Koonin). His current work centers on exploring bacterial defense strategies against a broad range of adverse entities. Dmitry completed his undergraduate education in the Department of Evolutionary Biology at Moscow State University and went on to pursue his Master's degree at Skolkovo Institute of Science and Technology advised by Georgii Bazykin in the Evolutionary Genomics Lab. Dmitry and his wife Vasilisa both split their time between NYC and DC where they can be found organizing board games. email publications LinkedIn

MPH Scholar Sheetal Chowdhary is a practicing physician additionally engaged in both medical research and undergraduate teaching. Her prior work has involved diverse public health efforts ranging from infectious disease surveillance to opioid overdose safety management. Sheetal's current research interests are focused on the incorporation of modern data analytics methods leveraging artificial intelligence into clinical workflows. email LinkedIn

MPH Scholar Ben Jagt completed his pre-med undergraduate studies at the University of Minnesota. Seeking to shift his focus from improving clinical outcomes to population health, after graduation Ben hopes to find new ways to use "big data" to reduce health disparities. His current research focuses on increasing accessibility for underutilized public data. Ben lives in NYC with his wife where he can be found playing ultimate frisbee (he's a pro!) in the American Ultimate Disc League. email LinkedIn

MPH Scholar Lori Winter completed her undergraduate degree in Molecular and Cellular Biology at San Diego State University. She hopes to use genomic inference to improve public health outcomes. Lori’s current research focuses on understanding the role of horizontal gene transfer in gut microbiome stability. Lori lives in North Carolina with her fiancé and their Labrador Rocky (all Star Wars fans) where she can be found crocheting. email

Let's Meet!

We are always eager to discuss possibilities for collaboration. If you are interested in joining the group, please do not hesitate to email Nash. Opportunities for candidates at all career stages from high school students to senior research associates may be available. Positions are fully remote, or hybrid based in NYC or DC.