Hi @megha
I’ve been working on this, and here are the results.
There are two ways to use the SILVA database:
- Using SILVA Taxa IDs or SILVA MEGAN files:
SILVA provides a tree file, a map file, and an accession-to-tax-ID mapping for both LSU and SSU, all of which are compatible with MEGAN, so thanks to them for this support!
To import the tree into MEGAN, go to Edit > Preferences > Use Alternative Taxonomy and select the SILVA tree file. The number of accessions in the accession-to-tax-ID map matches the number of entries in the SILVA FASTA files. For example, for the LSU dataset, the accession count matches the number of entries in the SILVA_138.2_LSURef_NR99_tax_silva.fasta.gz file, which makes the files easy to align and import.
Here are the files used:
To use these files, you need to uncompress the tree and map files. Here’s the command I used on macOS:
gzcat tax_slv_lsu_138.2.tre.gz > tax_slv_lsu_138.2.tre
gzcat tax_slv_lsu_138.2.map.gz > tax_slv_lsu_138.2.map
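If you want to confirm the matching counts on your own download, here is a minimal sketch. The function names are made up for illustration, and it assumes the accession-to-tax-ID map has one accession per line with no header. It uses gzip -dc instead of gzcat so it also runs on Linux.

```shell
# Count sequence records in a gzipped FASTA file (one '>' header line per record).
count_fasta_entries() {
    gzip -dc "$1" | grep -c '^>'
}

# Count non-empty lines in a gzipped accession-to-tax-ID map.
# Assumption: one accession per line and no header line.
count_map_entries() {
    gzip -dc "$1" | grep -c '.'
}

# Print both counts; the return status is 0 only if they are equal.
# $1 = gzipped FASTA file, $2 = gzipped accession-to-tax-ID map
compare_counts() {
    n_fasta=$(count_fasta_entries "$1")
    n_map=$(count_map_entries "$2")
    echo "FASTA entries: $n_fasta, map entries: $n_map"
    [ "$n_fasta" -eq "$n_map" ]
}
```

Call compare_counts with the FASTA file as the first argument and the accession-to-tax-ID map you downloaded as the second.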
- Mapping to NCBI Tree in MEGAN:
SILVA provides alternative names for NCBI taxonomy, which differ from SILVA’s own taxonomy. We could use the taxmap_slv_lsu_ref_nr_138.2.txt.gz file for this purpose, but its last column contains SILVA’s taxa ID. The taxmap_embl-ebi_ena_lsu_ref_nr99_138.2.txt.gz file, however, contains the taxonomic path assigned by the original submitter together with the NCBI taxonomy ID. This ID is extracted from the source feature of the EMBL entry and matches the tree used in MEGAN.
I have generated this file and am awaiting Prof. Huson’s assessment. Once approved, you can use it if you prefer to work with NCBI names. Otherwise, you can follow the approach mentioned above.
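In the meantime, here is a rough sketch of how an accession-to-NCBI-ID table could be pulled out of the EMBL-EBI/ENA taxmap. This is only an illustration, not necessarily how the file mentioned above was built. It assumes the taxmap is tab-separated with a single header line, the accession in the first column, and the NCBI taxonomy ID in the last column, so please check the column layout of your copy first. The function name is hypothetical.

```shell
# Print "accession<TAB>ncbi_taxid" lines from a gzipped EMBL-EBI/ENA taxmap.
# Assumptions: tab-separated, one header line, accession in column 1,
# NCBI taxonomy ID in the last column. Rows whose last field is not
# purely numeric are skipped.
make_acc_to_ncbi() {
    gzip -dc "$1" | awk -F '\t' 'NR > 1 && $NF ~ /^[0-9]+$/ { print $1 "\t" $NF }'
}
```

To use it, pass the taxmap file as the only argument and redirect the output to a file of your choice. Whether MEGAN expects the bare accession or the accession with start and stop positions is not covered here, so treat the output format as an assumption too.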
Release 128 is no longer supported by us, as it’s quite outdated. The updated files will be available on the MEGAN7 download page by this week.
Please feel free to let me know if you need any further help or additional information.
Best regards,
Anupam