Abstract
Background: Implementation of clinical metagenomics and pathogen genomic surveillance can be particularly challenging due to the lack of bioinformatics tools and/or expertise. To address this challenge, we previously developed INSaFLU, a free web-based bioinformatics platform for virus next-generation sequencing data analysis. Here, we considerably expanded its genomic surveillance component and developed a new module (TELEVIR) for metagenomic virus identification.
Results: The routine genomic surveillance component was strengthened with new workflows and functionalities, including (i) a reference-based genome assembly pipeline for Oxford Nanopore Technologies (ONT) data; (ii) automated SARS-CoV-2 lineage classification; (iii) Nextclade analysis; (iv) Nextstrain phylogeographic and temporal analysis (SARS-CoV-2, human and avian influenza, monkeypox, respiratory syncytial virus (RSV A/B), as well as a “generic” build for other viruses); and (v) algn2pheno for screening mutations of interest. Both INSaFLU pipelines for reference-based consensus generation (Illumina and ONT) were benchmarked against commonly used command line bioinformatics workflows for SARS-CoV-2, and an INSaFLU snakemake version was released. In parallel, a new module (TELEVIR) for virus detection was developed, after extensive benchmarking of state-of-the-art metagenomics software and following up-to-date recommendations and practices in the field. TELEVIR allows running complex workflows, covering several combinations of steps (e.g., with/without viral enrichment or host depletion), classification software (e.g., Kaiju, Kraken2, Centrifuge, FastViromeExplorer), and databases (RefSeq viral genome, Virosaurus, etc.), while culminating in user- and diagnosis-oriented reports. Finally, to potentiate real-time virus detection during ONT runs, we developed findONTime, a tool aimed at reducing costs and the time between sample reception and diagnosis.
Conclusions: The accessibility, versatility, and functionality of INSaFLU-TELEVIR are expected to supply public and animal health laboratories and researchers with a user-oriented and pan-viral bioinformatics framework that promotes a strengthened and timely viral metagenomic detection and routine genomics surveillance. INSaFLU-TELEVIR is compatible with Illumina, Ion Torrent, and ONT data and is freely available at https://insaflu.insa.pt/ (online tool) and https://github.com/INSaFLU (code).
Original language | English |
---|---|
Article number | 61 |
Pages (from-to) | 61 |
Journal | Genome Medicine |
Volume | 16 |
Issue number | 1 |
DOIs | |
Publication status | Published - 25 Apr 2024 |
Bibliographical note
© 2024. The Author(s).
Funding
We thank the European Society for Clinical Virology (ESCV) Network on NGS Clinical Virology (ENNGS) for releasing benchmark datasets that were used in this study. We thank Dr. Joaquin Prada (University of Surrey), Dr. Guido Cordoni (University of Surrey), Dr. Adriano Di Pasquale (IZSAM), Dr. Nicolas Radomski (IZSAM), Dr. Alessio Lorusso (IZSAM), Dr. Cesare Cammà (IZSAM), Dr. Sabrina Canziani (IZSLER), Miss Doriana Flores (ANSES), Dr. Pilar Aguilera-Sepúlveda (INIA), Dr. Irene Aldea (INIA), Dr. Iwona Kozyra (PIWET), Dr. Anna Fomsgaard (SSI), Prof. Anders Fomsgaard (SSI), and all other TELEVIR participants for the productive discussions throughout TELEVIR development and implementation. We deeply thank the international scientific community for the open and real-time software and data sharing, which allowed us to integrate cutting-edge and state-of-the-art bioinformatics features and resources. Special thanks to the Nextstrain (https://nextstrain.org/) team for their amazing work in developing the open-source tools for phylogenetic and geotemporal tracking of viral pathogens that could be integrated into INSaFLU-TELEVIR. We finally thank the Infraestrutura Nacional de Computação Distribuída (INCD) for providing computational resources for INSaFLU-TELEVIR testing. INCD was funded by FCT and FEDER under project 22153-01/SAICT/2016. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of this publication.

TELEVIR consortium authors:
- Laurent Bigarré, ANSES, laboratory Ploufragan-Plouzané-Niort, 29280 Plouzané, France
- Jovita Fernández-Pinero, Centro de Investigación en Sanidad Animal (CISA-INIA), CSIC, 28130 Valdeolmos, Madrid, Spain
- Ricardo J. Pais, Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Maurilia Marcacci, Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise (IZSAM), Teramo, Italy
- Ana Moreno, Istituto Zooprofilattico Sperimentale della Lombardia ed Emilia Romagna (IZSLER), Via A. Bianchi, 9, 25124, Brescia, Italy
- Tobias Lilja, National Veterinary Institute (NVA), Uppsala, Sweden
- Øivind Øines, Norwegian Veterinary Institute (NVI), Norway
- Artur Rzeżutka, Department of Food and Environmental Virology, National Veterinary Research Institute (PIWET), Al. Partyzantów 57, 24-100, Puławy, Poland
- Elisabeth Mathijs, Infectious Diseases in Animals, Sciensano, Rue Juliette Wytsmanstraat 14, 1050, Brussels, Belgium
- Steven Van Borm, Infectious Diseases in Animals, Sciensano, Rue Juliette Wytsmanstraat 14, 1050, Brussels, Belgium
- Morten Rasmussen, Statens Serum Institut, Copenhagen, Denmark
- Katja Spiess, Statens Serum Institut, Copenhagen, Denmark

Availability and requirements:
- Project name: INSaFLU-TELEVIR
- Project home page: https://insaflu.insa.pt
- Operating system(s): Platform independent
- Programming language: python3.x, django
- Other requirements: web browser, such as Firefox, Chrome or Safari
- License: GNU license, GPL 2.0 (GNU General Public License, version 2) (https://opensource.org/licenses/GPL-2.0)
- Any restrictions to use by non-academics: none

This study was partially supported by the TELEVIR project, the European Union's Horizon 2020 Research and Innovation programme under grant agreement No 773830: One Health European Joint Programme. The improvement of the computational capacity of the online tool and its integration in INSA genomic surveillance workflows was also co-funded by the European Union through the Health Emergency Preparedness and Response (HERA) grant “Grant/2021/PHF/23776” and the project “Sustainable use and integration of enhanced infrastructure into routine genome-based surveillance and outbreak investigation activities in Portugal” (https://www.insa.min-saude.pt/category/projectos/geneo/) on behalf of the EU4H programme (EU4H-2022-DGA-MS-IBA-1). The development of the findONTime tool and a few platform updates performed in 2023 were also co-financed through the DURABLE project. The DURABLE project has been co-funded by the European Union, under the EU4Health Programme (EU4H), Project no. 101102733. Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Health and Digital Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. IZSLER participation was partially funded by the Italian national Research program no. B93C22001210001, CCM-SURVEID: Studio pilota per la sorveglianza di potenziali minacce da malattie infettive emergenti (EIDs) di origine virale mediante una piattaforma diagnostica basata sul sequenziamento metagenomico di nuova generazione (mNGS). CISA-INIA-CSIC participation was partially funded by MCIN/AEI/10.13039/501100011033 and by the EU “NextGenerationEU/PRTR” through the Spanish project no. PLEC2021-007968: Development of New Technologies to Track Emerging Infectious Threats in Wildlife and the Environment (NEXTHREAT). Rafael Mamede was supported by the Fundação para a Ciência e Tecnologia (FCT) (grant 2020.08493.BD).
Funders | Funder number |
---|---|
Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria | |
Statens Serum Institut | |
Statens veterinärmedicinska anstalt | |
University of Surrey | |
Centro de Investigación en Sanidad Animal | |
Infraestrutura Nacional de Computação Distribuída | |
European Society for Clinical Virology | |
Consejo Superior de Investigaciones Científicas | |
Państwowy Instytut Weterynaryjny - Państwowy Instytut Badawczy | |
Department of Food and Environmental Virology | |
National Institute of Health Doutor Ricardo Jorge | |
Veterinærinstituttet | |
EU4H | 101102733 |
PIWET | |
Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise | |
Horizon 2020 Framework Programme | 773830 |
Italian national Research program | B93C22001210001 |
MCIN/AEI | 10.13039/501100011033 |
Fundação para a Ciência e a Tecnologia | 2020.08493.BD |
European Commission | PLEC2021-007968 |
UK Research and Innovation | 104691 |
European Regional Development Fund | 22153-01/SAICT/2016 |
Keywords
- COVID-19/virology
- Computational Biology/methods
- Genome, Viral
- Genomics/methods
- High-Throughput Nucleotide Sequencing/methods
- Humans
- Internet
- Metagenomics/methods
- SARS-CoV-2/genetics
- Software