mSystems. 2022 Dec 20;7(6):e0059522.
doi: 10.1128/msystems.00595-22. Epub 2022 Nov 30.

Long-Read Sequencing Improves Recovery of Picoeukaryotic Genomes and Zooplankton Marker Genes from Marine Metagenomes

N V Patin et al. mSystems. 2022.

Abstract

Long-read sequencing offers the potential to improve metagenome assemblies and provide more robust assessments of microbial community composition and function than short-read sequencing. We applied Pacific Biosciences (PacBio) CCS (circular consensus sequencing) HiFi shotgun sequencing to 14 marine water column samples and compared the results with those for short-read metagenomes from the corresponding environmental DNA samples. We found that long-read metagenomes varied widely in quality and biological information. The community compositions of the corresponding long- and short-read metagenomes were frequently dissimilar, suggesting higher stochasticity and/or bias associated with PacBio sequencing. Long reads provided few improvements to the assembly qualities, gene annotations, and prokaryotic metagenome-assembled genome (MAG) binning results. However, only long reads produced high-quality eukaryotic MAGs and contigs containing complete zooplankton marker gene sequences. These results suggest that high-quality long-read metagenomes can improve marine community composition analyses and provide important insight into eukaryotic phyto- and zooplankton genetics, but the benefits may be outweighed by the inconsistent data quality.

IMPORTANCE Ocean microbes provide critical ecosystem services, but most remain uncultivated. Their communities can be studied through shotgun metagenomic sequencing and bioinformatic analyses, including binning draft microbial genomes. However, most sequencing to date has been done using short-read technology, which rarely yields genome sequences of key microbes like SAR11. Long-read sequencing can improve metagenome assemblies but is hampered by technological shortcomings and high costs. In this study, we compared long- and short-read sequencing of marine metagenomes. We found a wide range of long-read metagenome qualities and minimal improvements to microbiome analyses. However, long reads generated draft genomes of eukaryotic algal species and provided full-length marker gene sequences of zooplankton species, including krill and copepods. These results suggest that long-read sequencing can provide greater genetic insight into the wide diversity of eukaryotic phyto- and zooplankton that interact as part of and with the marine microbiome.

Keywords: eDNA; long-read sequencing; marine microbiomes; metagenomics.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIG 1
Summary of the 16S and 18S small-subunit (SSU) rRNA genes extracted in four different ways: three assembly sets run against HMMs {long reads assembled with metaFlye (“metaFlye + HMMs”), long and short reads assembled with hybridSPAdes (“hybridSPAdes + HMMs”), and short reads assembled with metaSPAdes [“Illumina (SPAdes + HMMs)”]} as well as short reads run through PhyloFlash [“Illumina (PhyloFlash)”]. (A) Numbers of SSU genes extracted with each approach. One Illumina metagenome failed to assemble, which is why one more sample is included in the PhyloFlash analysis compared to the SPAdes assembly and HMM extraction. (B) Minimum, mean, and maximum lengths of the extracted SSU genes. The dashed lines show the lengths of reference 16S and 18S rRNA genes in all three domains of life. (C) Confidence values assigned to each taxonomic level of each extracted gene, from 0 to 1. Each level of taxonomic resolution featured a wider range of confidence values than the previous level.
FIG 2
Alpha and beta diversity analyses of (i) unassembled short and long reads and (ii) open reading frames (ORFs) extracted from long-read, short-read, and hybrid assemblies. (A) Shannon diversity values calculated from unassembled reads. (B) Principal-component analysis of metagenome taxonomic composition calculated from unassembled reads. (C) Shannon diversity values calculated from assembly ORFs. (D) Principal-component analysis of the metagenome taxonomic composition calculated from assembly ORFs. Metagenome types are denoted by colors, and sample depth groupings are denoted by shapes. Paired long-read, short-read, and hybrid metagenomes from the same eDNA sample are connected with dashed lines in the PCA plots.
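
As an illustration of the Shannon diversity values mentioned in this caption, the sketch below computes the standard Shannon index, H' = -sum(p_i ln p_i), from per-taxon counts. It is not the authors' pipeline: the function name `shannon_diversity` and the example counts are hypothetical, and the natural logarithm is an assumption, since the caption does not state the log base.

```python
import math

def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    `counts` is any iterable of non-negative per-taxon read (or ORF) counts.
    The natural log is an assumption; the caption does not give the base.
    """
    counts = list(counts)  # allow generators and dict views
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    h = 0.0
    for c in counts:
        if c > 0:  # taxa with zero counts contribute nothing
            p = c / total
            h -= p * math.log(p)
    return h

# Hypothetical per-taxon counts for one metagenome (not data from the study).
example_counts = {"SAR11": 500, "Prochlorococcus": 300, "Bathycoccus": 200}
print(round(shannon_diversity(example_counts.values()), 4))
```

A community dominated by a single taxon gives a value of 0, and a perfectly even community of n taxa gives ln(n), which is why the index is used to compare the evenness and richness of paired metagenomes.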
FIG 3
Summary of annotated and unannotated open reading frames (ORFs) extracted from four assembly types: short read assemblies (“Illumina-SPAdes”), long reads assembled with hifiasm-meta (“Hifiasm-meta”), long reads assembled with metaFlye (“metaFlye”), and short and long reads assembled together (“hybridSPAdes”). (A) Total numbers of ORFs for each assembly type. (B) Percentages of ORFs for each assembly type that were assigned a KEGG gene annotation. (C) Total numbers of annotated ORFs for each assembly type. (D) Average numbers of annotated ORFs per contig for each assembly type.
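
The four quantities summarized in panels A to D can be derived from a simple per-ORF annotation table. The sketch below is a minimal illustration under stated assumptions: the function `summarize_orfs`, its input layout (one entry per ORF, `None` when no KEGG annotation was assigned), and the example identifiers are hypothetical and not taken from the study.

```python
def summarize_orfs(orf_annotations, n_contigs):
    """Summarize ORF annotation statistics for one assembly.

    orf_annotations: list with one entry per ORF; a KEGG identifier string
                     if annotated, or None if unannotated (assumed layout).
    n_contigs:       number of contigs in the assembly.
    """
    if n_contigs <= 0:
        raise ValueError("n_contigs must be positive")
    total = len(orf_annotations)
    annotated = sum(1 for a in orf_annotations if a is not None)
    return {
        "total_orfs": total,                                   # panel A
        "percent_annotated": 100.0 * annotated / total if total else 0.0,  # panel B
        "annotated_orfs": annotated,                           # panel C
        "annotated_per_contig": annotated / n_contigs,         # panel D
    }

# Hypothetical assembly: four ORFs on two contigs, two with KEGG annotations.
print(summarize_orfs(["K00001", None, "K00002", None], n_contigs=2))
```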
FIG 4
Visual representation of the picoeukaryotic MAGs and their source metagenomes, before and after manual refinement, generated by anvi’o. The phylogram branches represent the metagenome contigs clustered by tetranucleotide frequency and coverage. On the left, results from the automated binning program show contigs belonging to the initial MAG. On the right, the final MAG contigs following manual refinement are shown. Completion, redundancy, and total MAG size are provided for each plot. The final taxonomic assignment is provided for the refined MAG. (A) Bathycoccus prasinos MAG before and after manual refinement. (B) Ostreococcus lucimarinus MAG before and after manual refinement.
FIG 5
Short-read metagenome coverage for three contigs generated from long-read assemblies. Each contig was annotated as a different zooplankton species according to 18S and 28S rRNA gene sequences located on the contig. Coverage values were normalized to metagenome sizes, and the y axes for all panels within each plot are set to the same scale. (A) Map showing sample collection sites and table with associated bottom depths. (B) The contig annotated as the krill species Euphausia pacifica recruited short reads across the entire contig length from sites A and J. Short reads from all other metagenomes mapped only to the rRNA gene regions. (C) The contig annotated as the copepod species Calanus pacificus recruited short reads across the entire contig length from about half of the sites, while reads from the other half mapped only to the rRNA gene regions. (D) The contig annotated as the copepod species Metridia pacifica recruited short reads across the entire contig length from sites F and L. Short reads from all other metagenomes mapped only to the rRNA gene regions.
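
The caption states that coverage was normalized to metagenome size and distinguishes contigs covered along their whole length from contigs covered only at the rRNA genes. The sketch below shows one way such a comparison could be set up. The exact normalization used in the study is not given here, so scaling depth to a per-gigabase value is an assumption, and the helper names `normalize_coverage` and `breadth` and all example numbers are hypothetical.

```python
def normalize_coverage(depths, metagenome_size_bp, scale=1e9):
    """Scale per-position read depth by metagenome size.

    Assumption: depth is expressed per `scale` sequenced bases (default 1 Gbp)
    so that metagenomes of different sizes can share one y axis.
    """
    if metagenome_size_bp <= 0:
        raise ValueError("metagenome_size_bp must be positive")
    return [d * scale / metagenome_size_bp for d in depths]

def breadth(depths, start=0, end=None):
    """Fraction of positions in depths[start:end] with at least one read."""
    window = depths[start:end]
    if not window:
        return 0.0
    return sum(1 for d in window if d > 0) / len(window)

# Hypothetical 6-bp contig whose rRNA gene spans positions 2-3 (0-based).
depths = [0, 0, 3, 4, 0, 0]
print(normalize_coverage(depths, metagenome_size_bp=2_000_000_000))
print(breadth(depths))        # breadth over the whole contig
print(breadth(depths, 2, 4))  # breadth over the rRNA gene region only
```

In this toy case the rRNA region is fully covered while the contig as a whole is not, which corresponds to the pattern the caption describes for sites where short reads mapped only to the rRNA gene regions.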
