We are funded by the National Science Foundation (DEB-1442148 & DEB-1442113) for the project "The Making of Biodiversity Across the Yeast Subphylum." Over the next five years, we will sequence and analyze the genomes of all known yeast species from the subphylum Saccharomycotina to study the evolution of their diverse metabolic and ecological functions.

Yeasts of the ancient fungal subphylum Saccharomycotina inhabit every continent and every major aquatic and terrestrial habitat, yet little is known about the place that most species fill in nature. A key factor in the ecological dominance of yeasts is their impressive diversity of resource utilization abilities and strategies. This project seeks to understand the rise and diversification of yeasts by deciphering the history of functional diversification that is written in their genomes.

The bread, beer, and wine yeast Saccharomyces cerevisiae is the best known of these unicellular fungi and is also among the chief models of molecular genetics research. In contrast to this inveterate fermenter, the metabolisms of the other ~1,000 known species in this subphylum vary widely. Several are emerging pathogens, some can produce oil, many can metabolize xylose (the second most abundant monosaccharide in woody plant material), and most actually prefer cellular respiration to fermentation. Some traits have evolved a handful of times, while others have evolved independently dozens of times, providing rich systems for the study of historical contingency and convergent evolution, respectively. The yeasts are more genetically and metabolically diverse than the vertebrates, but the genomic and ecological functions of most species remain unexplored.

To illuminate the genetic and ecological factors that drive yeast diversification and generate biodiversity, we will create the first comprehensive catalog of genetic and functional diversity for any high-level taxonomic rank by generating high-quality and richly annotated genome sequences for all known yeast species, inferring their definitive phylogeny and taxonomy, and tracing the evolution of genome content and traits. Overlaying quantitative physiological data will enable a broad integration of genetic and ecological functions across yeast history. The gain and loss of genes and traits will be correlated with each other, with ecological niches, and with rates of diversification. This project will test the relative predictive values of phylogenetic history, genome content, and physiology for traits that are relevant to niche, pathogenicity, and biotechnology. We will create a hierarchical, integrated web-based interface for researchers, clinicians, teachers, and students to explore genome sequence and trait data in an explicit tree-thinking framework that incorporates a stably revised taxonomy.

Despite the importance of many yeasts to biotechnology and biomedical research, many species are represented by a single isolate, and dozens of new species are described every year. The Yeast Exploration and Analysis Science Team (YEAST) will train high school and undergraduate students in microbiology by engaging them in genuine biodiversity research to isolate new genetic and taxonomic diversity from the wild, including samples submitted by citizen scientists. Metagenomic and genome-based approaches will be incorporated to improve computational training and lay the groundwork for genomics to become a central and integrated part of taxonomic and ecological research in the yeasts and beyond.