REVAMP: Rapid Exploration and Visualization through an Automated Metabarcoding Pipeline

Sean M. McAllister; Christopher Paight; Emily L. Norton; Matthew P. Galaska

https://doi.org/10.5670/oceanog.2023.231

Volume 36, No. 2–3
Pages 114 - 119

SPOTLIGHT • REVAMP: Rapid Exploration and Visualization through an Automated Metabarcoding Pipeline


Published Online: October 30, 2023

Article Abstract

The revolution and acceleration in DNA sequencing over the past three decades have driven the development of new biomolecular tools like environmental DNA (eDNA) metabarcoding for characterizing marine biodiversity. To operationalize eDNA approaches for routine NOAA observatories, new bioinformatic programs and improved organismal reference barcodes are needed to provide accurate and reliable biological data in a timely manner. To address these needs, we present Rapid Exploration and Visualization through an Automated Metabarcoding Pipeline (REVAMP), which provides streamlined end-to-end data processing from raw reads to data exploration, visualization, and hypothesis generation. One benefit of REVAMP is the ability to iteratively assess marker gene and reference database performance. Here, we used a filtered reference database that only included sequences uploaded prior to specified date cutoffs from 1995 to 2022 to analyze changes in eDNA metabarcoding taxonomic assignments, revealing patterns of uneven improvement in taxonomic assignment depth and accuracy across time, region, and marker sets. This work highlights the need for targeted reference sequencing efforts for key regional taxa and the importance of such efforts for improving eDNA biomonitoring approaches in the future.

Full Text

Advancing Ocean Biomolecular Observatories

Understanding and predicting ocean ecosystem changes, such as those caused by anthropogenic and climatic influences (Huntington et al., 2020), are central to the missions of both NOAA and the NOAA Pacific Marine Environmental Laboratory (PMEL). Over the past 50 years, such changes have been tracked in both the Arctic and the East Pacific Oceans through the PMEL Ecosystems & Fisheries-Oceanography Coordinated Investigations (EcoFOCI) and Ocean Carbon programs, respectively. These programs harness an array of moored instruments, CTD surveys, Biogeochemical-Argo floats, and net tows to understand ecosystem change (see Stabeno et al., 2023, and Feely et al., 2023, both in this issue). For the biological component of these observatories, traditional manual techniques are time intensive, expensive, and condition dependent. Thus, they are deployed in limited scope out of necessity, often only focusing on key commercial species, and are unable to provide a holistic view of ecosystem health and food web biodiversity. Over the past decade, the complexity of change in the ocean has been testing the capabilities of our ocean-observing platforms to capture and track ecosystem community dynamics. Therefore, there is an urgent need to scale ocean biodiversity monitoring efforts to better capture ecosystem responses to rapidly changing ocean conditions.

Recent advances in environmental DNA (eDNA) approaches, driven by the sequencing revolution, provide a powerful new suite of biomolecular ocean observations to characterize marine biodiversity (Beng and Corlett, 2020). The ability to detect biota, from microbes to mammals, from a single liter of seawater provides a promising tool for scaling up monitoring efforts. In response, the NOAA ’Omics Working Group developed a cross-NOAA strategic plan for ’omics (NOAA, 2021) that focuses on tool development (Goodwin et al., 2020) to advance and operationalize eDNA approaches for marine biodiversity observations in support of core NOAA mission objectives. This plan highlights the need to enhance eDNA metabarcoding approaches for surveying ecosystem biodiversity. Additionally, a national eDNA strategy is being developed to promote the coordination of eDNA use for management purposes across federal agencies (Kelly et al., 2023). Although tremendous advancements in eDNA metabarcoding have been made since its first marine application (Karsenti et al., 2011), a suite of challenges must be surmounted to effectively deploy this tool.

Current Efforts of the PMEL Ocean Molecular Ecology Group

To assess and predict how rapidly changing ocean conditions influence marine ecosystems at scale, the Ocean Molecular Ecology (OME) group at PMEL pairs biomolecular observations with data from EcoFOCI, Carbon, and Earth-Ocean Interaction observing platforms. Such ’omics monitoring efforts (Galaska et al., 2023, in this issue) are important for establishing ecosystem baselines, providing holistic biodiversity monitoring needed to track and predict the effects of changing ocean conditions on species distribution, abundance, and food web dynamics. Metabarcoding, applied to both eDNA and traditional bulk plankton tows, is a critical tool for this effort so that species assemblages can be characterized through the sequencing of select marker genes from a mixed community of organisms.

Improving Biomolecular Approaches to Address Limitations

Recently, substantial advancements have been made in biomolecular approaches to facilitate robust application of eDNA for marine biodiversity observation. Although eDNA metabarcoding provides tremendous promise for scaling up the spatial and temporal resolution of marine surveys, a suite of challenges needs to be addressed to ensure effective implementation.

Over the past decade, NOAA scientists and others have led the development of quantitative eDNA metabarcoding frameworks, improving our ability to delineate signal from noise and generate robust abundance estimates needed to address key NOAA objectives like stock and ecosystem assessments (Shelton et al., 2023). Over the next decade, efforts are needed to improve such mechanistic frameworks by better characterizing both eDNA dynamics (i.e., fate, transport, and species-specific characteristics) and laboratory biases (i.e., subsampling and amplification efficiencies) to more accurately and consistently derive abundance estimates (Rourke et al., 2021; Takahashi et al., 2023). Further, given the effect of methodology on observed biodiversity, the NOAA ’Omics Strategic Plan (NOAA, 2021) highlights the need to harmonize sampling and analysis processes to mitigate inter-lab variability and allow for integrated eDNA assessments. Ongoing intercalibration efforts, with data management guidelines including standards, will allow NOAA to leverage the unique spatial and temporal resolution of marine biodiversity assessments across line offices and ocean basins.

Critical to the efficacy of eDNA approaches are marker gene selection and curation of robust reference databases, which together allow for the accurate taxonomic identification of species. These two challenges are inherently linked because primer design for target genes relies on available reference data. Over the past three decades, systematic efforts have developed and supported microbial reference databases, with the SILVA small subunit rRNA 16S marker database currently representing over 510,000 unique taxa at 99% sequence similarity (Glöckner et al., 2017). In the past two decades, significant progress has been made with the sequencing of a wide diversity of metazoan life (e.g., Census of Marine Life, Barcode of Life Database) and development of a broad array of marker genes to identify target organisms (e.g., review by Takahashi et al., 2023). However, extensive metazoan reference barcoding efforts have been deployed in only a few regions for a handful of marker genes, and are particularly biased to vertebrates (e.g., Bemis et al., 2023). Despite 243 million sequences available in the National Center for Biotechnology Information (NCBI) nt (nucleotide) database covering ~273,000 metazoan species, only a small fraction of the total diversity of life is represented, and there is still a significant need to improve sequence databases through dedicated reference barcoding efforts.

REVAMP

Ensuring accurate and robust data processing, analysis, and visualization of captured sequences is fundamental to operationalizing eDNA for ocean biomolecular observations. Given that disseminating ’omics data in a timely manner is a primary goal of the NOAA ’Omics Strategic Plan (NOAA, 2021) and central to NOAA’s mission, the development of robust and accurate tools to streamline ’omics data processing and visualization is integral for the routine deployment of eDNA approaches in ocean observatories. To relieve the data analysis bottleneck delaying data dissemination, there have been substantial efforts to streamline and standardize the bioinformatics methods for metabarcoding tools, with successful pipelines developed (e.g., Tourmaline [Thompson et al., 2022], Anacapa [Curd et al., 2019]). Although these tools have greatly improved sequence processing, there remains a need for bioinformatic pipelines to provide fully integrated data exploration, visualization, and hypothesis generation capabilities. To address this challenge, we present Rapid Exploration and Visualization through an Automated Metabarcoding Pipeline (REVAMP), which provides streamlined end-to-end data processing from raw sequencing data files (fastq format) to data visualization. This pipeline rapidly explores and analyzes ecological patterns in metabarcoding data in a reproducible and accurate manner.

The REVAMP repository, with extensive documentation and example files, can be found on GitHub: https://github.com/McAllister-NOAA/REVAMP. A detailed explanation of the REVAMP workflow is provided with the REVAMP documentation (v.1.0.5 software release available at https://doi.org/10.5281/zenodo.8195015). Briefly, the REVAMP workflow (Figure 1) recovers unique amplicon sequence variants (ASVs) by processing raw reads through Cutadapt (Martin, 2011) and DADA2 (Callahan et al., 2016). Taxonomy is then assigned based on the common ancestor of the best BLASTn hits in the NCBI nt database (Camacho et al., 2009). Alternatively, REVAMP can integrate the output of SILVAngs, which is highly effective for microbial assignments to a curated taxonomy (Glöckner et al., 2017). Data exploration and visualization are an integral part of the REVAMP pipeline, as these play an important role in pattern observation and hypothesis generation for follow-up testing. Key software for generating the figures produced by REVAMP includes KRONA plots (Ondov et al., 2011), phyloseq (McMurdie and Holmes, 2013), and vegan (Oksanen et al., 2020), among others (see GitHub repository).
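The common-ancestor step can be illustrated with a minimal Python sketch. It is not REVAMP's implementation: the function names, the `tolerance` parameter, the representation of lineages as root-to-tip lists, and the example hits are all assumptions made for illustration. The pipeline's actual identity thresholds and tie handling are covered in its own documentation.

```python
from typing import List, Sequence, Tuple

Lineage = Sequence[str]  # ranks ordered from root to tip (hypothetical format)


def common_ancestor(lineages: Sequence[Lineage]) -> List[str]:
    """Return the ranks shared by every lineage, reading from the root down."""
    shared: List[str] = []
    for ranks in zip(*lineages):
        # Stop at the first rank where the lineages disagree.
        if any(rank != ranks[0] for rank in ranks):
            break
        shared.append(ranks[0])
    return shared


def assign_taxonomy(hits: Sequence[Tuple[float, Lineage]],
                    tolerance: float = 0.0) -> List[str]:
    """Assign the common ancestor of the best-scoring hits.

    `hits` pairs a percent identity with a lineage. `tolerance` (an assumed
    parameter) widens the "best" set to hits within that many percentage
    points of the top identity.
    """
    if not hits:
        return []
    best = max(identity for identity, _ in hits)
    top = [lineage for identity, lineage in hits if identity >= best - tolerance]
    return common_ancestor(top)


# Made-up hits for one ASV: two congeners tie at 100% identity.
hits = [
    (100.0, ["Eukaryota", "Chordata", "Gadidae", "Gadus", "Gadus macrocephalus"]),
    (100.0, ["Eukaryota", "Chordata", "Gadidae", "Gadus", "Gadus chalcogrammus"]),
    (96.5, ["Eukaryota", "Chordata", "Merlucciidae", "Merluccius", "Merluccius productus"]),
]
print(assign_taxonomy(hits))
```

In this toy case the two tied hits share a genus but not a species, so the sketch stops at the genus rather than picking one species arbitrarily.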

FIGURE 1. REVAMP workflow from inputs to table and figure products. Independent checkpoints in the pipeline are indicated with dashed red lines. Asterisks indicate locations where optional inputs can be inserted into the workflow.

We used REVAMP to analyze the eDNA data set collected with EcoFOCI in Alaska and the Arctic (Galaska et al., 2023, in this issue). This data set consisted of 84 samples sequenced for two markers, the universal 16S marker (Parada et al., 2016) and the nuclear metazoan 18S marker (Machida and Knowlton, 2012) (median 51.5k raw reads per marker per sample). The REVAMP run took approximately 3.5 hours on six processors with a maximum of 55 GB memory (for BLASTn against the full nt database); for each marker, it generated 985 hierarchically organized figures (plus legends) relating marine biodiversity patterns to ocean condition observations, allowing for rapid exploration of the data. Historically, this kind of extensive data exploration, using the same bioinformatics tools without an appropriate pipeline, can take two to three weeks. Thus, REVAMP has created a reproducible and rapid solution that can be ported to a cloud environment for ease of use and public access in the future.

Assessments of Marker Gene Choice and Taxonomic Resolution Over the Past 30 Years of Sequencing

Although REVAMP provides a critical tool for processing and analyzing metabarcoding data, the accuracy of its taxonomic assignments is a function of both the comprehensiveness of reference databases and the appropriateness of marker gene choice. Thus, improving reference databases and primer design will enhance our ability to resolve unique sequences (i.e., ASVs) to the species level. One way to assess how these factors affect eDNA and metabarcoding resolution and accuracy is to compare the change in the percentage of unique sequences that can be resolved to species level or higher over time. To accomplish this, we analyzed data sets through REVAMP for three commonly used marker gene regions: Machida nuclear 18S (Machida and Knowlton, 2012), MiFish universal teleost 12S (Miya et al., 2015), and Kelly metazoan 16S (Kelly et al., 2016). Each data set was iteratively analyzed using cutoff dates spaced at six-month intervals from 1995 to 2022, with taxonomic assignments made based on a modified reference database that only included sequences uploaded prior to that cutoff date (Figure 2; see https://github.com/McAllister-NOAA/BLAST_dateFiltering). Accuracy was assessed at each time point based on whether or not the taxonomy matched the current (as of December 2022) assignment for each ASV. Each panel took approximately 18 hours of processing time using six processors.
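The date-cutoff iteration can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in the BLAST_dateFiltering repository: cutoffs are placed on 1 January and 1 July of each year, a reference is kept if its upload date is strictly earlier than the cutoff, and an assignment counts as accurate when it is non-empty and agrees with the current lineage down to its own depth. All function names and the toy data are hypothetical.

```python
from datetime import date
from typing import Dict, List, Set, Tuple


def six_month_cutoffs(first_year: int, last_year: int) -> List[date]:
    """Cutoff dates on 1 January and 1 July of each year (assumed spacing)."""
    return [date(year, month, 1)
            for year in range(first_year, last_year + 1)
            for month in (1, 7)]


def references_before(upload_dates: Dict[str, date], cutoff: date) -> Set[str]:
    """Accessions uploaded strictly before the cutoff date."""
    return {acc for acc, uploaded in upload_dates.items() if uploaded < cutoff}


def score(assigned: Dict[str, List[str]], current: Dict[str, List[str]],
          species_depth: int) -> Tuple[float, float]:
    """Return (% of ASVs resolved to species, % of ASVs assigned accurately)."""
    n = len(current)
    if n == 0:
        return 0.0, 0.0
    to_species = 0
    accurate = 0
    for asv, truth in current.items():
        lineage = assigned.get(asv, [])
        if len(lineage) >= species_depth:
            to_species += 1
        # Assumed rule: non-empty and matching the current lineage to its depth.
        if lineage and lineage == truth[:len(lineage)]:
            accurate += 1
    return 100.0 * to_species / n, 100.0 * accurate / n


# Toy data (hypothetical accessions, dates, and lineages).
upload_dates = {"ACC1": date(1998, 3, 2), "ACC2": date(2011, 9, 15)}
cutoffs = six_month_cutoffs(1995, 2022)
print(len(cutoffs), cutoffs[0], cutoffs[-1])
print(sorted(references_before(upload_dates, date(2005, 1, 1))))

current = {"ASV1": ["Metazoa", "Gadus", "Gadus macrocephalus"],
           "ASV2": ["Metazoa", "Clupea", "Clupea pallasii"]}
assigned = {"ASV1": ["Metazoa", "Gadus"],
            "ASV2": ["Metazoa", "Clupea", "Clupea harengus"]}
print(score(assigned, current, species_depth=3))
```

In a full run, each cutoff would drive a fresh search against the filtered reference set, and the resulting assignments would be scored against the current ones. The toy data also show why depth and accuracy are tracked separately: one ASV is shallow but consistent with its current assignment, while the other reaches species level but names a different species.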

Trends observed in Figure 2 are a function of reference database completeness and resolution of a given marker set. The addition of novel reference species to the NCBI database is a primary driver of increasing power for species identification (Gold et al., 2021; Keck et al., 2022). However, the addition of novel reference sequences can also lead to declines in species-level resolution if a novel reference sequence is identical to a closely related species (Keck et al., 2022). The Machida 18S marker (Figure 2a,b) and the MiFish 12S marker (Figure 2c) exhibit improving species-level accuracy (purple line) from 1995 through 2019, suggesting that these markers continue to be improved as they gain more complete reference databases. In contrast, the effectiveness of the Kelly 16S marker has decreased since 2010, suggesting that this marker has a high degree of sequence similarity across species and that reference database building efforts will not improve its efficacy (Figure 2d).

FIGURE 2. Assessment of depth and accuracy of amplicon sequence variant (ASV) assignments over time. Iterations of cutoff date, excluding newer reference sequences at six-month intervals, were run for four different data sets: (a) Machida 18S marker gene on the Alaska/Arctic data set (Galaska et al., 2023, in this issue); (b) Machida 18S marker on a data set from the Olympic Coast National Marine Sanctuary (OCNMS) in Washington (Paight et al., 2021); (c) MiFish 12S marker on OCNMS data set; and (d) Kelly 16S marker on OCNMS data set.

To consider the regional impact of reference sequencing efforts, we compared the Machida 18S marker between the temperate Pacific and Arctic data sets. Over time, this marker performed comparably between these regions, suggesting that reference barcoding efforts have been similarly effective in both regions (Figure 2a,b). These analyses make clear that further examination of these trends by marker and by region is warranted so that effective markers can be chosen to maximize resolution and accuracy. Furthermore, this work highlights the importance of regularly reevaluating markers with updated reference databases to ensure appropriate marker gene selection.

Building more thorough and regionally focused reference databases for the Northeast Pacific and the Arctic is a primary focus of the PMEL OME group, in collaboration with other NOAA colleagues and the Smithsonian Institution. This work will improve the results presented in Galaska et al. (2023, in this issue), for which taxonomic assignments can be updated with each new iteration of the reference database.

Conclusions

With a rapidly changing ocean environment, the keys to successful operationalization of an ocean biomolecular monitoring framework are the accuracy and speed of knowledge dissemination. Reproducible bioinformatics tools like REVAMP reduce the time lag between raw data generation and production of biological ecosystem information from weeks to hours. Rapidly exploring patterns in eDNA data across multi-stressor gradients allows us to understand and predict ecosystem dynamics in response to changing ocean conditions, providing a key bioinformatic resource that will better serve NOAA objectives. As the PMEL OME group improves reference databases in the Northeast Pacific and the Arctic, REVAMP can be used to examine the efficacy of reference databases and marker choices. As a flexible pipeline generating hundreds of visualizations, REVAMP is a valuable tool for rapid biodiversity evaluation, and will be incorporated into NOAA ’omics bioinformatics toolkits as we operationalize eDNA efforts.

Acknowledgments

We would like to thank Zachary Gold and two external reviewers for their helpful feedback on the manuscript. We also thank Sarah Battle for her logo design and figure streamlining help. This publication represents NOAA Pacific Marine Environmental Laboratory Contribution No. 5528. It was also partially funded by the Cooperative Institute for Climate, Ocean, and Ecosystem Studies (CICOES) under NOAA agreement NA20OAR4320271, Contribution No. 2023-1294.

Data Availability

Data availability for the Alaska/Arctic data set is discussed by Galaska et al. (2023, in this issue), with reads available at the NCBI’s Sequence Read Archive (SRA) under BioProject PRJNA982176. Raw sequence reads for the additional data set from the OCNMS were previously deposited under BioSamples SAMN23524382-SAMN23524565. Additional data and analysis products for marker assessment in Figure 2 were deposited on FigShare (https://figshare.com/projects/REVAMP/170406).

Citation

McAllister, S.M., C. Paight, E.L. Norton, and M.P. Galaska. 2023. REVAMP: Rapid Exploration and Visualization through an Automated Metabarcoding Pipeline. Oceanography 36(2–3):114–119, https://doi.org/10.5670/oceanog.2023.231.

References

Bemis, K.E., M.G. Girard, M.D. Santos, K.E. Carpenter, J.R. Deeds, D.E. Pitassy, N.A.L. Flores, E.S. Hunter, A.C. Driskell, K.S. Macdonald III, and others. 2023. Biodiversity of Philippine marine fishes: A DNA barcode reference library based on voucher specimens. Scientific Data 10:411, https://doi.org/10.1038/s41597-023-02306-9.
Beng, K.C., and R.T. Corlett. 2020. Applications of environmental DNA (eDNA) in ecology and conservation: Opportunities, challenges, and prospects. Biodiversity and Conservation 29:2,089–2,121, https://doi.org/10.1007/s10531-020-01980-0.
Callahan, B.J., P.J. McMurdie, M.J. Rosen, A.W. Han, A.J.A. Johnson, and S.P. Holmes. 2016. DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods 13:581–583, https://doi.org/10.1038/nmeth.3869.
Camacho, C., G. Coulouris, V. Avagyan, N. Ma, J. Papadopoulos, K. Bealer, and T.L. Madden. 2009. BLAST+: Architecture and applications. BMC Bioinformatics 10:421, https://doi.org/10.1186/1471-2105-10-421.
Curd, E.E., Z. Gold, G.S. Kandlikar, J. Gomer, M. Ogden, T. O’Connell, L. Pipes, T.M. Schweizer, L. Rabichow, M. Lin, and others. 2019. Anacapa Toolkit: An environmental DNA toolkit for processing multilocus metabarcode datasets. Methods in Ecology and Evolution 10:1,469–1,475, https://doi.org/10.1111/2041-210X.13214.
Feely, R.A., L.-Q. Jiang, R. Wanninkhof, B.R. Carter, S.R. Alin, N. Bednaršek, and C.E. Cosca. 2023. Acidification of the global surface ocean: What we have learned from observations. Oceanography 36(2–3):120–129, https://doi.org/10.5670/oceanog.2023.222.
Galaska, M.P., S.D. Brown, and S.M. McAllister. 2023. Monitoring biodiversity impacts of a changing Arctic through environmental DNA. Oceanography 36(2–3):109–113, https://doi.org/10.5670/oceanog.2023.221.
Glöckner, F.O., P. Yilmaz, C. Quast, J. Gerken, A. Beccati, A. Ciuprina, G. Bruns, P. Yarza, J. Peplies, R. Westram, and W. Ludwig. 2017. 25 years of serving the community with ribosomal RNA gene reference databases and tools. Journal of Biotechnology 261:169–176, https://doi.org/10.1016/j.jbiotec.2017.06.1198.
Gold, Z., E.E. Curd, K.D. Goodwin, E.S. Choi, B.W. Frable, A.R. Thompson, H.J. Walker Jr., R.S. Burton, D. Kacev, L.D. Martz, and P.H. Barber. 2021. Improving metabarcoding taxonomic assignment: A case study of fishes in a large marine ecosystem. Molecular Ecology Resources 21:2,546–2,564, https://doi.org/10.1111/1755-0998.13450.
Goodwin, K., R. Certner, M. Strom, F. Arzayus, M. Bohan, S. Busch, G. Canonico, S. Cross, J. Davis, K. Egan, and others. 2020. NOAA ’Omics white paper: Informing the NOAA ’Omics Strategy and Implementation Plan, https://doi.org/10.25923/bd7z-zb37.
Huntington, H.P., S.L. Danielson, F.K. Wiese, M. Baker, P. Boveng, J.J. Citta, A. De Robertis, D.M.S. Dickson, E. Farley, J.C. George, and others. 2020. Evidence suggests potential transformation of the Pacific Arctic ecosystem is underway. Nature Climate Change 10:342–348, https://doi.org/10.1038/s41558-020-0695-2.
Karsenti, E., S.G. Acinas, P. Bork, C. Bowler, C. De Vargas, J. Raes, M. Sullivan, D. Arendt, F. Benzoni, J.-M. Claverie, and others. 2011. A holistic approach to marine eco-systems biology. PLOS Biology 9:e1001177, https://doi.org/10.1371/journal.pbio.1001177.
Keck, F., M. Couton, and F. Altermatt. 2022. Navigating the seven challenges of taxonomic reference databases in metabarcoding analyses. Molecular Ecology Resources 23:742–755, https://doi.org/10.1111/1755-0998.13746.
Kelly, R.P., J.L. O’Donnell, N.C. Lowell, A.O. Shelton, J.F. Samhouri, S.M. Hennessey, B.E. Feist, and G.D. Williams. 2016. Genetic signatures of ecological diversity along an urbanization gradient. PeerJ 4:e2444, https://doi.org/10.7717/peerj.2444.
Kelly, R.P., D.M. Lodge, K.N. Lee, S. Theroux, A.J. Sepulveda, C.A. Scholin, J.M. Craine, E.A. Allan, K.M. Nichols, K.M. Parsons, and others. 2023. Toward a national eDNA strategy for the United States. Environmental DNA, https://doi.org/10.1002/edn3.432.
Machida, R.J., and N. Knowlton. 2012. PCR primers for metazoan nuclear 18S and 28S ribosomal DNA sequences. PLOS One 7:e46180, https://doi.org/10.1371/journal.pone.0046180.
Martin, M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17:10–12, https://doi.org/10.14806/ej.17.1.200.
McMurdie, P.J., and S. Holmes. 2013. phyloseq: An R package for reproducible interactive analysis and graphics of microbiome census data. PLOS One 8:e61217, https://doi.org/10.1371/journal.pone.0061217.
Miya, M., Y. Sato, T. Fukunaga, T. Sado, J.Y. Poulsen, K. Sato, T. Minamoto, S. Yamamoto, H. Yamanaka, H. Araki, and others. 2015. MiFish, a set of universal PCR primers for metabarcoding environmental DNA from fishes: Detection of more than 230 subtropical marine species. Royal Society Open Science 2:150088, https://doi.org/10.1098/rsos.150088.
NOAA. 2021. NOAA ’Omics Strategic Plan 2021–2025: Strategic Application of Transformational Tools. National Oceanic and Atmospheric Administration, US Department of Commerce, 14 pp., https://doi.org/10.25923/1c27-w345.
Oksanen, J., G.L. Simpson, F.G. Blanchet, R. Kindt, P. Legendre, P.R. Minchin, R.B. O’Hara, P. Solymos, M.H.H. Stevens, E. Szoecs, and others. 2020. vegan: Community ecology package. R package version 2.5-7, https://CRAN.R-project.org/package=vegan.
Ondov, B.D., N.H. Bergman, and A.M. Phillippy. 2011. Interactive metagenomic visualization in a web browser. BMC Bioinformatics 12:385, https://doi.org/10.1186/1471-2105-12-385.
Paight, C., J. Waddell, and M.P. Galaska. 2021. Utility of occupancy models with environmental DNA (eDNA) from Olympic Coast National Marine Sanctuary. bioRxiv, https://doi.org/10.1101/2021.08.04.455111.
Parada, A.E., D.M. Needham, and J.A. Fuhrman. 2016. Every base matters: Assessing small subunit rRNA primers for marine microbiomes with mock communities, time series and global field samples. Environmental Microbiology 18:1,403–1,414, https://doi.org/10.1111/1462-2920.13023.
Rourke, M.L., A.M. Fowler, J.M. Hughes, M.K. Broadhurst, J.D. DiBattista, S. Fielder, J.W. Walburn, and E.M. Furlan. 2021. Environmental DNA (eDNA) as a tool for assessing fish biomass: A review of approaches and future considerations for resource surveys. Environmental DNA 4:9–33, https://doi.org/10.1002/edn3.185.
Shelton, A.O., Z.J. Gold, A.J. Jensen, E. D’Agnese, E.A. Allan, A. Van Cise, R. Gallego, A. Ramón-Laca, M. Garber-Yonts, K. Parsons, and others. 2023. Toward quantitative metabarcoding. Ecology 104:e3906, https://doi.org/10.1002/ecy.3906.
Stabeno, P.J., S. Bell, C. Berchok, E.D. Cokelet, J. Cross, R.M. McCabe, C.W. Mordy, J. Overland, D. Strausz, M. Sullivan, and H.M. Tabisola. 2023. Long-term biophysical observations and climate impacts in US Arctic marine ecosystems. Oceanography 36(2–3):78–85, https://doi.org/10.5670/oceanog.2023.225.
Takahashi, M., M. Saccò, J.H. Kestel, G. Nester, M.A. Campbell, M. van der Heyde, M.J. Heydenrych, D.J. Juszkiewicz, P. Nevill, K.L. Dawkins, and others. 2023. Aquatic environmental DNA: A review of the macro-organismal biomonitoring revolution. Science of the Total Environment 873:162322, https://doi.org/10.1016/j.scitotenv.2023.162322.
Thompson, L.R., S.R. Anderson, P.A. Den Uyl, N.V. Patin, S.J. Lim, G. Sanderson, and K.D. Goodwin. 2022. Tourmaline: A containerized workflow for rapid and iterable amplicon sequence analysis using QIIME 2 and Snakemake. GigaScience 11:giac066, https://doi.org/10.1093/gigascience/giac066.

Copyright & Usage

This is an open access article made available under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format as long as users cite the materials appropriately, provide a link to the Creative Commons license, and indicate the changes that were made to the original content. Images, animations, videos, or other third-party material used in articles are included in the Creative Commons license unless indicated otherwise in a credit line to the material. If the material is not included in the article’s Creative Commons license, users will need to obtain permission directly from the license holder to reproduce the material.