Thousands of new bacterial and archaeal species and higher-level taxa are discovered each year through the analysis of genomes and metagenomes. The Genome Taxonomy Database (GTDB) provides hierarchical sequence-based descriptions and classifications for new and as-yet-unnamed taxa. However, bacterial nomenclature, as currently configured, cannot keep up with the need for new well-formed names. Instead, microbiologists have been forced to use hard-to-remember alphanumeric placeholder labels. Here, we exploit an approach to the generation of well-formed arbitrary Latinate names at a scale sufficient to name tens of thousands of unnamed taxa within GTDB. These newly created names represent an important resource for the microbiology community, facilitating communication between bioinformaticians, microbiologists and taxonomists, while populating the emerging landscape of microbial taxonomic and functional discovery with accessible and memorable linguistic labels.
The status Candidatus was introduced to bacterial taxonomy in the 1990s to accommodate uncultured taxa defined by analyses of DNA sequences. Here I review the strengths, weaknesses, opportunities and threats (SWOT) associated with the status Candidatus in the light of a quarter century of use, twinned with recent developments in bacterial taxonomy and sequence-based taxonomic discovery. Despite ambiguities as to its scope, philosophical objections to its use and practical problems in implementation, the status Candidatus has now been applied to over 1000 taxa and has been widely adopted by journals and databases. Although they lack priority under the International Code of Nomenclature of Prokaryotes, many Candidatus names have already achieved de facto standing in the academic literature and in databases through description of a taxon in a peer-reviewed publication alongside deposition of a genome sequence, and there is a clear path to valid publication of such names once the organism is cultured. Continued and increased use of Candidatus names provides an alternative to the potential upheaval that might accompany creation of an additional code of nomenclature, and provides a ready solution to the urgent challenge of naming many thousands of newly discovered but uncultured species.
Background
The chicken is the most abundant food animal in the world. However, despite its importance, the chicken gut microbiome remains largely undefined. Here, we exploit culture-independent and culture-dependent approaches to reveal extensive taxonomic diversity within this complex microbial community.
Results
We performed metagenomic sequencing of fifty chicken faecal samples from two breeds and analysed these, alongside all (n = 582) relevant publicly available chicken metagenomes, to cluster over 20 million non-redundant genes and to construct over 5,500 metagenome-assembled bacterial genomes. In addition, we recovered nearly 600 bacteriophage genomes. This represents the most comprehensive view of taxonomic diversity within the chicken gut microbiome to date, encompassing hundreds of novel candidate bacterial genera and species. To provide a stable, clear and memorable nomenclature for novel species, we devised a scalable combinatorial system for the creation of hundreds of well-formed Latin binomials. We cultured and genome-sequenced bacterial isolates from chicken faeces, documenting over forty novel species, together with three species from the genus Escherichia, including the newly named species Escherichia whittamii.
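A combinatorial naming scheme of this general kind can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in the study: the stem lists, the function names and the choice of epithets are all hypothetical, and no check is made against names already in use.

```python
from itertools import product

# Hypothetical word stems chosen for illustration only; these are NOT the
# lists used in the study.
HABITAT_STEMS = ["Avi", "Galli", "Pulli"]       # bird- or chicken-related prefixes
ORGANISM_STEMS = ["bacter", "monas", "coccus"]  # common genus-name endings
# Epithets whose form is the same for masculine and feminine genus names,
# so every pairing below agrees in gender.
EPITHETS = ["faecalis", "intestinalis", "avium"]


def generate_genus_names(prefixes, suffixes):
    """Return every prefix + suffix combination, in a deterministic order."""
    return [prefix + suffix for prefix, suffix in product(prefixes, suffixes)]


def generate_binomials(genera, epithets):
    """Pair every genus name with every species epithet."""
    return [f"{genus} {epithet}" for genus, epithet in product(genera, epithets)]


genera = generate_genus_names(HABITAT_STEMS, ORGANISM_STEMS)
binomials = generate_binomials(genera, EPITHETS)

print(len(genera), "candidate genus names")
print(len(binomials), "candidate binomials")
print("First binomial:", binomials[0])
```

Because the number of names grows as the product of the list sizes, even short lists of stems yield hundreds of candidates. A real workflow would also need to screen each candidate against validly published names and confirm that it is well formed.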
Conclusions
Our metagenomic and culture-based analyses provide new insights into the bacterial, archaeal and bacteriophage components of the chicken gut microbiome. The resulting datasets expand its known diversity and provide a key resource for future high-resolution taxonomic and functional studies.