I’m working with a large data set of full-length 16S reads generated with Oxford Nanopore technology. I’m curious whether MEGAN is a suitable tool for analyzing this type of data and, if so, whether anyone has advice on the workflow for identification and potentially for determining relative abundance. I managed to find one article where MEGAN was the tool of choice, but the literature in this regard is quite limited.
Hi Brendon,
we have done quite a bit of work on extending MEGAN to allow processing of Nanopore metagenomic reads (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5910613/ or https://www.biorxiv.org/content/early/2019/01/04/511683), but we have not yet looked into analyzing Nanopore 16S reads in MEGAN.
This would be my first line of attack: use an alignment tool such as MALT (http://ab.inf.uni-tuebingen.de/data/software/malt/download/welcome.html) or minimap2 (https://github.com/lh3/minimap2) to align the reads against either the Silva database (https://www.arb-silva.de/download/) or the NCBI 16S database (ftp://ftp.ncbi.nlm.nih.gov/), then import the alignments into MEGAN using a top percent of 5, weighted LCA and the Use 16S Percent identity Filter option.
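For the minimap2 route, the alignment step could look roughly like the sketch below. It is only an illustration: the file names are hypothetical placeholders, the thread count is arbitrary, and the helper functions are not part of MEGAN or minimap2. It relies on minimap2's `-a` option (SAM output) and the `-x map-ont` preset for Nanopore reads, and writes the SAM output from standard output into a file that can then be imported into MEGAN.

```python
import subprocess
from pathlib import Path


def build_minimap2_command(reference, reads, threads=4):
    """Return the minimap2 argument list for aligning Nanopore reads.

    -a          write alignments in SAM format
    -x map-ont  preset for Oxford Nanopore reads
    -t          number of threads
    """
    return [
        "minimap2", "-a", "-x", "map-ont",
        "-t", str(threads),
        str(reference), str(reads),
    ]


def align_reads(reference, reads, sam_out, threads=4):
    """Run minimap2 and capture its SAM output in sam_out."""
    cmd = build_minimap2_command(reference, reads, threads)
    with open(sam_out, "w") as handle:
        # minimap2 prints SAM records to standard output
        subprocess.run(cmd, stdout=handle, check=True)
    return Path(sam_out)


# Example call (hypothetical file names, requires minimap2 on the PATH):
# align_reads("silva_ssu_ref.fasta", "nanopore_16s_reads.fastq", "reads_vs_silva.sam")
```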
If you align against the Silva references, then you should use the following mapping file: http://ab.inf.uni-tuebingen.de/data/software/megan6/download/SSURef_Nr99_132_tax_silva_to_NCBI_synonyms.map.gz.
If you align against the NCBI 16S references, then use this: http://ab.inf.uni-tuebingen.de/data/software/megan6/download/nucl_acc2tax-Nov2018.abin.zip.
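To make the "top percent of 5" setting more concrete, here is a small sketch of the idea behind it: for each read, only alignments whose score lies within 5% of that read's best score are kept for the LCA step. This is a simplified illustration, not MEGAN's actual code; the function name and the (taxon, score) tuples are made up for the example, and the weighted LCA itself is not shown.

```python
def top_percent_filter(alignments, top_percent=5.0):
    """Keep alignments scoring within top_percent of the best score.

    alignments: list of (taxon, score) tuples for a single read
                (hypothetical input format for this sketch).
    """
    if not alignments:
        return []
    best = max(score for _, score in alignments)
    # Lowest score that is still within top_percent of the best one
    threshold = best * (1.0 - top_percent / 100.0)
    return [(taxon, score) for taxon, score in alignments if score >= threshold]


# One read with three hypothetical reference hits
hits = [("Escherichia coli", 1000.0),
        ("Shigella flexneri", 960.0),
        ("Salmonella enterica", 900.0)]
kept = top_percent_filter(hits, top_percent=5.0)
print([taxon for taxon, _ in kept])
```

With these made-up scores, the first two hits are within 5% of the best score and are kept, while the third is dropped before taxonomic placement.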
If you do try this and run into any problems or questions, then please let me know.
Best wishes
daniel