A workflow for the detection and phylogenetic placement of eukaryotes from metagenomes

Heroen Verbruggen (0), Bobbie Shaban (0)
(0) University of Melbourne

Find me on Tues Nov 24th, 1:40-3pm AEDT in Remo, table 115

Abstract
Microbial eukaryotes (protists) play key roles in global element cycles, ecosystem functioning, food production and disease, but relatively little genome data are available for them, limiting our understanding of their biodiversity, evolution and roles in the environment. As has been the case for prokaryotes, metagenomic data could shed light on this unknown diversity via metagenome-assembled genomes, but most existing tools are geared towards bacteria and do not perform well on protists.

We have designed a workflow for automated download of metagenome reads, data cleaning, assembly, computation of contig statistics (k-mer frequencies, read coverage) and metagenome binning. The workflow is implemented in WDL and its dependencies are packaged in a Singularity container, allowing us to deploy it on a wide variety of platforms. We have also implemented several new methods to extract eukaryote organelle genomes from the assembled data and to place them in the eukaryote tree of life. A large library of publicly available metagenomes was processed, resulting in the detection and phylogenetic placement of a large diversity of eukaryotes, many of which are unknown to science. This improves our knowledge of eukaryote evolution and delivers insights into the occurrence of these eukaryotes in natural environments.
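
The abstract does not say how the k-mer frequencies are computed, so the following is only a minimal sketch of one common form of this contig statistic, not the workflow's actual code. It assumes relative frequencies of fixed-length k-mers per contig, with k = 4 as an illustrative default, and it skips windows containing ambiguous bases. The function name `kmer_frequencies` is hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence: str, k: int = 4) -> dict[str, float]:
    """Return relative frequencies of all A/C/G/T k-mers in one contig.

    Hypothetical helper for illustration; k=4 is an assumed default.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    # Count every window of length k that contains only unambiguous bases.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    # Report all 4**k possible k-mers so every contig yields a vector
    # of the same length, which is what a binning step would need.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {
        kmer: (counts[kmer] / total if total else 0.0)
        for kmer in all_kmers
    }
```

Calling `kmer_frequencies(contig_sequence)` on each assembled contig would give one 256-element frequency profile per contig. Such profiles are typically combined with read coverage when grouping contigs into bins.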