Genomic Prediction – Genes to Genomes
A blog from the Genetics Society of America
https://genestogenomes.org

Meet early career scientists working in genomic prediction
Wed, 10 Apr 2019
https://genestogenomes.org/early-career-scientists-working-in-genomic-prediction/

Learn about some of the work that graduate students, postdocs, and early career faculty are contributing to the field of genomic prediction.

Since 2012, the GSA Journals have published a series of papers focused on genomic prediction. We’re excited to announce a newly organized Series page that makes it easy to navigate the extensive collection of genomic prediction papers published in GENETICS and G3. We’d also like to introduce you to a few early career scientists currently working in the field and give you a glimpse into the types of research they do.

Antoine Allier
Graduate student, INRA
Le Moulon, Alain Charcosset Lab

“My current research aims at optimizing the management of genetic diversity in breeding programs using genomic selection. In particular, I am working on the prediction of cross variance and genetic diversity for optimal cross selection.”

Matt Baseggio
Graduate student, Cornell University
Michael Gore Lab

“Large proportions of the US population do not meet the recommended daily intake of several vitamins and nutrients. My research is trying to improve the nutritional quality of sweet corn, the third most consumed vegetable in the US. I conducted genome-wide association studies to identify genes and favorable alleles responsible for quantitative variation of kernel carotenoid (provitamin A, lutein, zeaxanthin), tocochromanol (vitamin E), and nutrient (iron and zinc) levels in a sweet corn diversity panel. I am also developing and validating marker-based prediction models to convert locally adapted sweet corn germplasm to dark orange kernel with high vitamin and nutrient content.”

Anthony Findley
MD/PhD student, Wayne State University
Roger Pique-Regi and Francesca Luca Labs

“My research focuses on gene regulation in varying environmental contexts and cell types. I integrate gene expression and chromatin accessibility data from cells treated with a variety of hormones, environmental contaminants, drugs, and metals to identify regulatory elements which modulate cellular response to each condition. I am particularly interested in linking these in vitro exposures with complex traits and understanding how genetic variation in environmentally responsive regulatory elements can be used to predict disease susceptibility.”

Margaret Krause
Graduate student, Cornell University
Mark Sorrells and Michael Gore Labs

“My research focuses on the integration of high-throughput phenotyping and genomic selection in plant breeding. Advances in remote sensing have enabled plant breeders and geneticists to collect an extensive amount of phenotypic information on large numbers of individuals throughout their growth and development. Integrating these traits into genomic selection has the potential to increase the rate of genetic gain in crop plants.”

Jhonathan Pedroso Rigal dos Santos
Graduate student, University of São Paulo and Cornell University
Michael Gore Lab

“The ability to connect information between traits over time allows Bayesian networks to offer a powerful probabilistic framework for designing genomic prediction models. In our research, we phenotyped (plant height time series, biomass) and genotyped (100,435 SNPs) a diverse panel of 869 sorghum lines. We developed three models: Bayesian Network (BN), Pleiotropic Bayesian Network (PBN), and Dynamic Bayesian Network (DBN). For benchmarking, we used multivariate GBLUP models. The DBN model approached the accuracy of the reference model and allowed us to compute probabilistic indexes to identify optimal time points before the end of the season for earlier indirect selection.”

Fabio Morgante
Postdoc, University of Chicago
Yang Li and Matthew Stephens Labs

“I am interested in statistical and quantitative genetics in a variety of species, with a special focus on prediction of complex traits. While my initial work involved analyzing livestock data, I quickly transitioned to using Drosophila melanogaster as a model system to develop statistical models and analysis strategies to predict complex traits more accurately, by leveraging multiple layers of data (e.g., genomic, transcriptomic, metabolomic). Recently, I have become interested in human genetics and have been working on a method that exploits the sharing of eQTL effects among different tissues to increase the prediction accuracy of expression levels from genotype data.”

Ivone de Bem Oliveira
Postdoc, University of Florida
Patricio Muñoz Lab

“My research has been focused on the intersection between breeding and genomics, particularly in developing solutions to improve the selection process for polyploid breeding. Our pioneering research has proven the feasibility of genomic prediction for blueberry, enabling reductions in breeding cycle times and increasing genetic gain. Now we are optimizing the relationship between genotyping cost and model accuracy for an economically feasible application of genomic prediction for blueberry. I am evaluating the effect of number of markers, sequencing depth, and training population on phenotype prediction. The benefits and pipeline described in our studies can be applied to other polyploid species.”

Blaise Ratcliffe
Postdoc, University of British Columbia
Yousry El-Kassaby Lab

“My research focuses on the integration and use of genomic information in conifer tree improvement programs. Genomic selection tools have the potential to accelerate rates of genetic gain for complex, quantitative traits through early prediction of phenotypes and increased selection intensity. These new tools enable breeding programs to respond rapidly to the changing market demands of forest products as well as emerging abiotic and biotic threats.”

Palle Duun Rohde
Postdoc, Center for Quantitative Genetics and Genomics at Aarhus University
Peter Sorensen Lab

“The accurate prediction of individual disease risk or the response to medical treatment is important for the development of precision medicine. To successfully advance in precision medicine, a better understanding of the genetic architecture of human complex traits and disease is required. In my research, I focus on statistical genetic methods for integration of different types of data to achieve a better understanding of the genetic basis of complex traits and diseases. Currently, my work is focused on how to leverage multi-layered phenotypes and multi-layered molecular data to improve current prediction models, in particular with respect to treatment response.”

Nicholas Schreck
Postdoc, University of Mannheim
Martin Schlather Lab

“I focus on the theoretical analysis of mixed linear models, with special focus on the estimation of the additive genomic variance. I also investigate coefficients of determination in mixed linear models with the aim of efficient variable selection for high-dimensional genomic data sets.”

Gregory Way
Postdoc, Broad Institute of MIT and Harvard
Anne Carpenter Lab

“My PhD focused on developing supervised and unsupervised machine learning approaches to extract knowledge from large publicly available gene expression data sets. I have developed approaches to isolate gene expression signatures in tumors including identifying Ras pathway activation and TP53 inactivation signatures. In my postdoc, I will focus on extracting knowledge from large biomedical imaging data sets. My goals include developing methods that measure subtle responses to drug treatments in cell lines and to integrate imaging and gene expression data to provide additional views towards solving difficult biomedical problems.”

Yvonne Wientjes
Postdoc, Wageningen University
Research Animal Breeding and Genomics

“Genomic selection has revolutionized artificial selection in livestock populations. Compared to classical selection, genomic selection is more accurate, focusses more on genes with large effects, and ignores genes that are rare or have small effects. Therefore, genomic selection has likely increased the change in allele frequencies of genes over generations and may have changed the effects of those genes when non-additive effects are present. I aim to investigate how fast genomic selection methods change the genetic architecture of traits, i.e. the allele frequencies and effects underlying the trait. This will provide information on whether current selection methods limit the potential for long-term genetic improvement.”

Alencar Xavier
Research Scientist, Corteva Agrisciences and Purdue University
David Habier Lab

“My work on statistical genetics is focused on genomic-assisted breeding, with emphasis on theoretical and computational aspects of data-driven plant breeding, such as modeling, prediction, and selection using various sources of information. My research concerns the development and implementation of new quantitative methods using mixed models, Bayesian methods, and machine learning, along with high-performance computing.”

Robert Baker
Assistant Professor, Miami University

“I study organismal evolution of plant form and function. To do so, I examine the connection between genotypes and phenotypes throughout development and across environments at the intraspecific level. My research integrates quantitative genetics, genomics, transcriptomics, anatomy, morphology, and physiology in plants from natural, model, and crop systems. I use these data in part to predict novel, non-linear developmental phenotypes based on genotypes. My work has implications for understanding natural biodiversity, conservation and restoration, and improving agricultural sustainability.”

Helena Oakey
Senior Research Fellow, University of Adelaide
The Biometry Hub, Olena Kravchuk Lab

“My research interests cover the development and improvement of statistical genetic methodology, including genomic selection and association mapping to account for factors unique to agronomic trials such as replication, multiple phases (laboratory and field), treatments and environment.”

“Predicting” the future: how genomic prediction methods anticipated technology
Tue, 09 Apr 2019
https://genestogenomes.org/predicting-the-future/

A landmark paper published in GENETICS founded the field of genomic prediction before the requisite technology was available.


When a new technology is developed, it can allow scientists to make great strides in addressing longstanding questions. Occasionally, however, researchers think so critically about a knowledge gap in their field that they’re able to propose a new methodology that anticipates the technology needed to make it a reality.

This is precisely what Theo Meuwissen, Ben Hayes, and Mike Goddard accomplished with their 2001 paper Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps. In it, they laid out a framework for predicting breeding values from genome-wide marker information, using simulated data to compare different approaches. The catch? There wasn’t a way to do what they were proposing—the technology didn’t exist yet.

Despite this seemingly major drawback, the authors used theory and simulated data to propose methods that would one day revolutionize animal breeding strategies.

“In retrospect, the paper was a bit of a thought piece,” says Hayes. “Imagine if we could do this: what would it look like?”

The central goal of selective animal and plant breeding is increasing the genetic gain—that is, enhanced performance—of economically important traits. This was classically achieved by meticulously recording individuals’ phenotypic information in a population and using these records to estimate breeding values and select the best breeders for establishing the next generation. As the genomic era began to bloom toward the end of the 20th century, researchers began to incorporate genotype data into their selection strategies.

“The prevalent attitude was to try and map individual quantitative trait loci (QTLs) and then incorporate them into decisions about selection of animals,” according to Goddard.

But most of the traits in question were not associated with a small number of genes or markers, as originally anticipated. Instead, the relevant traits were likely controlled by many genes of small effect: hundreds or even thousands of them. Existing methods were geared toward mutations of large effect, which the field was discovering weren’t likely to be found.

As the complexity of the genomic architecture underlying these traits was becoming clearer, genotyping technologies were becoming more advanced.

“It had been predicted that we would get dense marker data, but we didn’t know what to do with it. We were trying to figure out what to do if we were able to get dense marker data in a cost-efficient way,” says Meuwissen.

They explored a genome-wide approach to predicting breeding values without mapping specific QTLs. This type of approach requires a high density of markers across the genome, but since real data of that kind didn’t exist yet, they simulated a genome and marker set and tested a number of statistical methods. After comparing linear regression, Best Linear Unbiased Prediction (BLUP), and two Bayesian methods (termed BayesA and BayesB), they concluded that “selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval.”
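
To give a feel for what predicting from all markers at once looks like, here is a minimal Python sketch. It is not the authors' simulation: the function names, population sizes, and parameters are made up for illustration, markers are simulated independently (a real genome would show linkage disequilibrium), and the estimator is plain ridge regression, which shrinks every marker effect by the same amount in the way a BLUP of marker effects does. The shrinkage value is set from the simulated heritability, which in real data would have to be estimated.

```python
import numpy as np


def simulate_population(n_animals=1000, n_markers=500, n_qtl=50, h2=0.5, seed=1):
    """Simulate centered marker genotypes, phenotypes, and true genetic values."""
    rng = np.random.default_rng(seed)
    # Genotypes coded 0/1/2. Markers are drawn independently here, an assumption
    # that ignores the linkage disequilibrium found in real genomes.
    freqs = rng.uniform(0.1, 0.9, size=n_markers)
    genotypes = rng.binomial(2, freqs, size=(n_animals, n_markers)).astype(float)
    genotypes -= genotypes.mean(axis=0)
    # Only a subset of markers (the hypothetical QTL) affects the trait.
    effects = np.zeros(n_markers)
    qtl = rng.choice(n_markers, size=n_qtl, replace=False)
    effects[qtl] = rng.normal(0.0, 1.0, size=n_qtl)
    genetic_values = genotypes @ effects
    # Scale the noise so that heritability is approximately h2.
    noise_sd = np.sqrt(genetic_values.var() * (1.0 - h2) / h2)
    phenotypes = genetic_values + rng.normal(0.0, noise_sd, size=n_animals)
    return genotypes, phenotypes, genetic_values


def estimate_marker_effects(genotypes, phenotypes, shrinkage):
    """Ridge-regression estimate of all marker effects simultaneously."""
    n_markers = genotypes.shape[1]
    lhs = genotypes.T @ genotypes + shrinkage * np.eye(n_markers)
    rhs = genotypes.T @ (phenotypes - phenotypes.mean())
    return np.linalg.solve(lhs, rhs)


def prediction_accuracy(h2=0.5, n_train=800, seed=1):
    """Correlation between predicted and true genetic values in unphenotyped animals."""
    genotypes, phenotypes, genetic_values = simulate_population(h2=h2, seed=seed)
    train, test = slice(0, n_train), slice(n_train, None)
    # Ratio of residual variance to per-marker effect variance, assuming every
    # marker contributes equally (the simplifying assumption behind this model).
    shrinkage = (1.0 - h2) / h2 * genotypes[train].var(axis=0).sum()
    effects = estimate_marker_effects(genotypes[train], phenotypes[train], shrinkage)
    predicted = genotypes[test] @ effects
    return np.corrcoef(predicted, genetic_values[test])[0, 1]


if __name__ == "__main__":
    print(f"Accuracy in test animals: {prediction_accuracy():.2f}")
```

The test animals contribute no phenotypes to the fit; their genetic values are predicted from genotypes alone, which is the situation a breeder faces with young selection candidates.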

They published their work in GENETICS, noting presciently that “the advent of DNA chip technology may make genotyping of many animals for many of these markers feasible (and perhaps even cost effective).” But since SNP chips weren’t yet in the hands of researchers, the paper didn’t spark an immediate revolution in quantitative genetics or animal breeding. Meuwissen, Hayes, and Goddard had founded the field of genomic selection (also termed genome-wide prediction), but the full potential of their findings wouldn’t be realized for a number of years.

“The paper really sat in the cupboard until the technological advance came along,” says Hayes.

Thankfully, they didn’t have to wait too long: by the end of the decade, SNP chips—which allow simultaneous genotyping of thousands of markers—were available for major livestock species. And with the availability of SNP chips came an explosion of interest in the paper that founded genomic selection.

Figure: Citations to Meuwissen et al. (2001) according to PubMed and Google Scholar. From de Koning (2016).

In the nearly two decades since, the field has grown and changed in a variety of ways. For one, genotyping technology has continued to improve.

“It started off being a relatively small number of SNPs (~10,000 on the first bovine chip), and now you can get 600,000. SNP tech came onstream and rapidly advanced,” notes Goddard.

These methodologies have also been applied well beyond livestock breeding, most notably to plant breeding and to the prediction of disease risk in human genetic studies. For more insight into the similarities and differences in how the methods are applied in different settings, see the new review published this month in GENETICS by Naomi Wray and colleagues.

What’s next for genomic prediction?

Researchers are still working out the best way to use whole-genome sequencing (WGS) data instead of SNP chip data. Although it’s now easier and cheaper than ever to sequence entire genomes, there hasn’t been much advantage to using WGS data over SNP data to date.

There are also challenges related to applying genomic prediction across breeds.

“Doing genomic prediction across breeds really doesn’t work well at the moment,” explains Hayes. “This is a problem because, in some breeds, it’s cost prohibitive to build the populations needed to drive genomic selection. There’s a lot of work going on about borrowing information across breeds.”

And as genomic prediction is being implemented widely and in many different species, it’s important for breeders to keep an eye on genomic diversity within their populations.

“We’re getting increasingly effective tools, but if we run out of diversity, we won’t be able to maintain the selection response we see today into the future,” notes Meuwissen.

Through the intervening years, the methods laid out in the 2001 paper have stood the test of time, with BayesB remaining at the forefront of genomic prediction. The field continues to grow and develop, moving into new species and honing the technologies—goals aided by the Genomic Prediction series launched in 2012 at the GSA Journals. Since then, GENETICS and G3 have collected an exciting body of work, encouraging the exploration of methods and the sharing of data to advance the field.

Genomic prediction is a striking demonstration of how science needn’t be limited by existing technology. In some cases, theoretical advances can even predict the future and help us make the most of technological advances.

CITATIONS

Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps
T. H. E. Meuwissen, B. J. Hayes and M. E. Goddard
GENETICS April 2001, 157 (4): 1819-1829.
http://www.genetics.org/content/157/4/1819

Meuwissen et al. on Genomic Selection
Dirk-Jan de Koning
GENETICS May 2016, 203 (1): 5-7.
https://doi.org/10.1534/genetics.116.189795
http://www.genetics.org/content/203/1/5
