A major obstacle to using next-generation sequencing (NGS) technology in routine clinical practice is reliable data analysis. Thousands of sequences need to be aligned and validated in order to exclude sequencing artifacts and generate accurate results.
We compared two analysis pipelines for Roche 454 ultra-deep pyrosequencing (UDPS) raw data generated from HIV-1 clinical samples: the commercial, fully automated, web-based NGS HIV-1 Module (SmartGene, Zug, Switzerland) vs the Amplicon Variant Analyzer software (AVA, 454 Life Sciences; Roche). Results were also compared with those obtained with Sanger sequencing.
HIV-1 reverse transcriptase and protease genes from 34 plasma samples were submitted to Sanger sequencing and GS Junior UDPS. Raw UDPS data (sff files) from all samples were analysed both with AVA 2.7 software, followed by manual review of the alignments, and with the fully automated SmartGene NGS HIV-1 Module prototype (SMG).
Results obtained with the two analysis pipelines showed good correlation (85.0%). Divergent results were mainly observed at homopolymer positions, such as K101, where the frame-aware alignment and error correction of the automated approach were more efficient and more accurate in both detecting and quantifying drug resistance mutations.
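To illustrate why homopolymer positions deserve particular attention, the following minimal Python sketch locates runs of identical bases in an in-frame nucleotide fragment and reports which codons they overlap. It is not part of either pipeline evaluated here: the function names, the minimum run length of 4, and the example fragment (assumed to start in frame at codon 100) are hypothetical choices made for illustration only.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for every run of identical bases
    that is at least min_len long. Positions are 0-based."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def codons_touched(start, length, first_codon=1):
    """Return the codon numbers overlapped by a run, assuming the
    sequence begins in frame at codon number first_codon."""
    first = start // 3 + first_codon
    last = (start + length - 1) // 3 + first_codon
    return list(range(first, last + 1))


if __name__ == "__main__":
    # Hypothetical fragment, assumed to start in frame at codon 100.
    fragment = "TTAAAAAAGAAAAAATCA"
    for start, base, length in homopolymer_runs(fragment):
        codons = codons_touched(start, length, first_codon=100)
        print(f"{base} x{length} at nt {start}: codons {codons}")
```

Codons flagged in this way are candidates for closer inspection, because pyrosequencing tends to miscall the length of a homopolymer run, and an inserted or deleted base shifts the reading frame unless the alignment accounts for it.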
Our study shows that NGS data can readily be analysed via a fully automated analysis pipeline, here the SmartGene NGS HIV-1 Module, thus minimizing the need for manual review of alignments by the user, which is otherwise essential to ensure accurate results. Such automated analysis pipelines may facilitate the adoption of NGS platforms in the routine clinical laboratory.