The lab has three areas of research interest: (1) microbial evolutionary history at deep time scales, (2) evolutionary genetic mechanisms underlying major genomic changes, and (3) microbial population ecology. We employ a variety of genomic and computational methods to address important questions in these areas.

Microbial evolutionary history at deep time scales

(1) We developed a fossil-independent molecular dating method based on the unbiased spontaneous mutation rate and used it to estimate the evolutionary timescale of the Roseobacter group, the most abundant bacterial lineage in coastal environments and a major player in marine carbon and sulfur cycles (Sun et al. 2017. The ISME Journal).

(2) In a study of Flavobacteriaceae, one of the major bacterial groups responsible for global polysaccharide and peptide degradation, we predicted three major ocean-to-land transitions in their evolutionary history, which were associated with repeated losses of marine signature genes and repeated gains of non-marine adaptive genes (Zhang et al. 2019. Environmental Microbiology).

(3) Thaumarchaeota, an archaeal group known to drive ammonia oxidation in both marine and terrestrial environments, underwent an ancient innovation in terrestrial habitats, from non-ammonia-oxidizing ancestors to ammonia-oxidizing descendants, followed by expansions into the photic shallow ocean and then into the dark deep ocean. We showed that these transitions were associated with substantial changes in metabolic potential, and predicted that their timing matches well with the oxygenation events in these major habitats (Ren et al. 2019. The ISME Journal).

(4) Members of the Rhizobiales include those capable of nitrogen fixation in nodules as well as pathogens of animals and plants. We categorized all members of the Rhizobiales with available genome sequences into four lifestyle groups (nodule association, plant association, animal association, and free-living lifestyle), and our careful molecular clock analyses inferred that the emergence of host-associated lifestyles in the Rhizobiales broadly coincided with the rise of their eukaryotic hosts. In particular, we showed that the first nodulating rhizobia lineage arose from either Azorhizobium or Bradyrhizobium 150-80 million years ago (Mya), concurrent with the emergence of legume plants (Wang et al. 2020. mSystems).
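
The idea behind the mutation-rate-based dating in item (1) can be illustrated with a basic molecular clock calculation. This is a simplified sketch, not the published method: it assumes that synonymous divergence accumulates neutrally at the spontaneous mutation rate along both diverging lineages, and the function name, parameter names, and any input values are hypothetical.

```python
def divergence_time_years(d_syn, mu_per_site_per_generation, generations_per_year):
    """Estimate the time since two lineages diverged, in years.

    Assumes a strict neutral clock: synonymous divergence d_syn
    (substitutions per synonymous site between the two lineages)
    accumulates at the spontaneous mutation rate on BOTH branches,
    so d_syn = 2 * mu * (number of generations).
    All three inputs are hypothetical illustration parameters.
    """
    if d_syn < 0:
        raise ValueError("d_syn must be non-negative")
    if mu_per_site_per_generation <= 0 or generations_per_year <= 0:
        raise ValueError("rate and generations_per_year must be positive")
    generations = d_syn / (2.0 * mu_per_site_per_generation)
    return generations / generations_per_year
```

Because the estimate scales inversely with both the mutation rate and the number of generations per year, uncertainty in either quantity carries over directly to the inferred date.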

Evolutionary genetic mechanisms underlying major genomic changes

One interest of the lab is the evolutionary mechanisms giving rise to the highly reduced genomes commonly found in free-living marine bacterioplankton, including Prochlorococcus, SAR11 and SAR86 among many other lineages. Selection has been the dominant explanation for genome reduction in bacterioplankton lineages. While there is ample evidence that streamlined lineages are likely under strong selection in today’s ocean, there was no evidence that selection was similarly efficient in the ancient past, when genome reduction occurred and geochemical conditions differed considerably from those of today’s ocean. We set out to test the efficiency of selection underlying these ancient events using the following procedure. First, nonsynonymous substitutions were classified into conservative and radical changes, which lead to replacements of physicochemically similar and dissimilar amino acids, respectively. Next, a population genomic approach was designed to demonstrate that radical changes are more likely to be deleterious than conservative changes. Finally, a method was developed that compares the rate of radical nonsynonymous substitution to the rate of conservative nonsynonymous substitution while incorporating codon and amino acid frequencies in calculating the rates. Using this method, we showed accelerated fixation of radical changes on the deep branches where massive DNA losses occurred, compared to their sister lineages, in Prochlorococcus and SAR86 but not in SAR11. Since radical changes represent a more deleterious type of mutation, an excess of radical changes suggests chance fixation of deleterious mutations driven by random genetic drift. This is the first time that the hypothesis of genetic drift driving genome reduction in some of the most important free-living marine bacterioplankton lineages was tested (Luo et al. 2017. Nature Microbiology).
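
The first step of the procedure above, sorting amino acid replacements into conservative and radical classes, can be sketched as follows. This is a minimal illustration, not the published method: the charge-based grouping (`CHARGE_CLASS`) is only one possible physicochemical classification and is an assumption here, as are the function names. The sketch tallies raw counts only; it does not compute rates normalized by codon and amino acid frequencies.

```python
from collections import Counter

# Assumed grouping by side-chain charge. The source does not state which
# physicochemical classification was used, so this scheme is illustrative.
CHARGE_CLASS = {
    **{aa: "positive" for aa in "RHK"},
    **{aa: "negative" for aa in "DE"},
    **{aa: "neutral" for aa in "ACFGILMNPQSTVWY"},
}


def classify_replacement(old_aa, new_aa):
    """Label one amino acid replacement as 'conservative' or 'radical'.

    A replacement is conservative if both residues fall in the same
    charge class, and radical otherwise.
    """
    if old_aa == new_aa:
        raise ValueError("not a nonsynonymous change: residues are identical")
    same_class = CHARGE_CLASS[old_aa] == CHARGE_CLASS[new_aa]
    return "conservative" if same_class else "radical"


def count_replacements(replacements):
    """Count radical and conservative changes in (old, new) residue pairs."""
    counts = Counter(classify_replacement(a, b) for a, b in replacements)
    return {"radical": counts["radical"], "conservative": counts["conservative"]}
```

For example, `count_replacements([("K", "R"), ("K", "E"), ("L", "I")])` treats lysine to arginine and leucine to isoleucine as conservative, and lysine to glutamate as radical.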

Microbial population ecology

(1) We isolated 279 closely related Roseobacter strains from nearby ecosystems dominated by a coral species and a brown algal species, respectively. We showed that pseudogenes account for ~16% of the accessory genomes of these strains and that many pseudogenization events are correlated with ancestral niche shifts. These results suggest that gene loss via pseudogenization is an important mechanism contributing to Roseobacter ecological diversification (Chu et al. 2020. The ISME Journal).

(2) We isolated 16 pelagic Roseobacter strains, which vary at only a few thousand nucleotide sites across their genomes, from an undisturbed coastal water sample in which microenvironments are preserved. We showed that this population has been divided into two emergent species at two chromosomal regions and one plasmid region of the core genomes. The underlying mechanism driving speciation is that these three regions were subjected to allelic replacement via homologous recombination with an external Roseobacter species. We further inferred that the initial recombination events at each of these three regions involved long DNA segments, followed by fine-tuning at some loci within these regions through recombination with other external species. Functional analyses of these three regions and physiological assays suggest that microenvironments, likely resulting from trophic interactions with phytoplankton, drive pelagic Roseobacter evolution (Wang et al. 2020. The ISME Journal).

(3) In collaboration with Professor Ferdi L. Hellweger, we quantified the contribution of multiple factors to the change in GC content of Ruegeria pomeroyi DSS-3 using an individual-based genome-scale model. The model simulates 2 × 10^8 cells. Each cell has a whole genome subject to base-substitution mutation and recombination, which affect the C and N requirements of the DNA and amino acid pools. Nonsynonymous changes are functionally deleterious in general. Together, these factors affect growth and fitness under C- and N-limiting conditions. Simulations show that the experimentally determined mutation bias towards GC in DSS-3 is not sufficient to build this GC-rich genome, and that DSS-3 and its ancestors have been evolving in environments primarily limited by carbon (Hellweger et al. 2018. The ISME Journal).

(4) We performed a comparative genomic analysis of a SAR11 population sampled from surface oceans and another from freshwater habitats. We identified a strikingly greater frequency of GC > AT changes at synonymous sites in the marine population than in the freshwater population. Further analyses excluded mutational pattern, recombination rate, population history, and salinity change as possible explanations. Because nitrogen is more limiting in ocean waters than in freshwater systems and because a G:C pair uses one more nitrogen atom than an A:T pair, we concluded that selection for reducing nitrogen requirements drives the more frequent GC > AT changes in the marine SAR11 population (Luo et al. 2015. Molecular Biology and Evolution).
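
The element-cost reasoning in items (3) and (4) can be checked with simple arithmetic on the standard molecular formulas of the four nucleobases. The sketch below counts carbon and nitrogen atoms in the bases of a double-stranded sequence; it ignores the sugar-phosphate backbone, which is identical for all four nucleotides, and says nothing about amino acid pools. The names `BASE_ATOMS`, `base_pair_atoms`, `duplex_atoms`, and `gc_content` are illustrative and not taken from the cited studies.

```python
# Carbon and nitrogen atoms per nucleobase, from the standard formulas:
# adenine C5H5N5, thymine C5H6N2O2, guanine C5H5N5O, cytosine C4H5N3O.
BASE_ATOMS = {
    "A": {"C": 5, "N": 5},
    "T": {"C": 5, "N": 2},
    "G": {"C": 5, "N": 5},
    "C": {"C": 4, "N": 3},
}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_pair_atoms(base, element):
    """Atoms of `element` ('C' or 'N') in a base plus its Watson-Crick partner."""
    return BASE_ATOMS[base][element] + BASE_ATOMS[COMPLEMENT[base]][element]


def duplex_atoms(sequence, element):
    """Total atoms of `element` in the bases of a double-stranded sequence."""
    return sum(base_pair_atoms(base, element) for base in sequence.upper())


def gc_content(sequence):
    """Fraction of G and C bases in a non-empty single-strand sequence."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)
```

Under these counts, a G:C pair contains 8 nitrogen and 9 carbon atoms, while an A:T pair contains 7 nitrogen and 10 carbon atoms. Each GC > AT change therefore saves one nitrogen atom and costs one carbon atom per base pair.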