With release 6.18.0, we have replaced the mechanism used to map accessions to taxonomic and functional classes.
To process DIAMOND or other alignments against the NCBI-nr database, you now download a single mapping file, either megan-map-Oct2019.db for the Community Edition, or megan-map-Oct2019-ue.db for the Ultimate Edition, and provide this to MEGAN or Meganizer.
This single file is a SQLite database that contains a table called mappings in which rows represent NCBI-nr accessions (the first one on each header line), and columns represent classifications.
The current classifications are Taxonomy, SEED, EGGNOG and INTERPRO2GO. The mapping file for UE also contains KEGG.
Note that EGGNOG, INTERPRO2GO and KEGG are up-to-date, whereas the MEGAN representation of SEED dates back to 2015. We are currently working on updating our version of SEED.
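As a rough illustration of the layout described above, the file can be inspected with any SQLite client. The sketch below uses Python's standard sqlite3 module. Only the table name mappings comes from the description above. The helper names and the key_column parameter are hypothetical, and because the name of the accession column is not stated here, the code reads the column names from the file instead of assuming them.

```python
import sqlite3


def list_classifications(db_path):
    """Return the column names of the `mappings` table, in table order."""
    con = sqlite3.connect(db_path)
    try:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        rows = con.execute("PRAGMA table_info(mappings)").fetchall()
        return [row[1] for row in rows]
    finally:
        con.close()


def lookup_accession(db_path, accession, key_column):
    """Return the row for one accession as a dict, or None if it is absent.

    `key_column` is whichever column holds the accession; its name is an
    assumption the caller must check against list_classifications().
    """
    con = sqlite3.connect(db_path)
    con.row_factory = sqlite3.Row
    try:
        columns = [row[1] for row in con.execute("PRAGMA table_info(mappings)")]
        if key_column not in columns:
            raise ValueError(f"no column named {key_column!r} in mappings")
        # The column name was validated above, so it is safe to quote it here;
        # the accession itself is passed as a bound parameter.
        query = f'SELECT * FROM mappings WHERE "{key_column}" = ?'
        row = con.execute(query, (accession,)).fetchone()
        return dict(row) if row is not None else None
    finally:
        con.close()
```

Calling list_classifications("megan-map-Oct2019.db") first shows which columns the file actually has; one of those names can then be passed as key_column to lookup_accession.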
Please contact me with any questions regarding this new feature.
Hi Daniel,
Could you give us more information about how these new files correspond to the old ones? We have an old taxonomy file based on nt and one based on prot, but the new one is based on nr. Is prot equivalent to nr, and will nt be included later? We are also interested in the old mapping files, for example the one for the free version of KEGG, because that’s not included in the new file. Will you make the old files available for download on another page, or will they not be compatible?
The old files are still on the server. I will restore the links to them in the next update of the download webpage. I will add the old KEGG mapping to the next release of the db mapping file. I will also look into adding the DNA taxonomy mapping to the same db mapping file, or, alternatively, creating a separate db mapping file for use with DNA alignments. I will try to get all of this done by the end of this week (I have been traveling for the last three weeks).
Daniel
Dear @Daniel,
Is there a way to get the actual date of the source data you used for creating the current mapping file? I could find the creation date of the mappings (info_string in info table) but not the date of the underlying data used.
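For anyone else looking for it, the creation date mentioned above can be read along these lines. This is only a minimal sketch with Python's standard sqlite3 module: the table name info is the one referred to above, the helper name read_info is hypothetical, and the code dumps every row instead of assuming a particular column layout.

```python
import sqlite3


def read_info(db_path):
    """Return every row of the `info` table as a list of dicts."""
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute("SELECT * FROM info")
        # cursor.description holds one tuple per result column; item 0 is its name.
        names = [col[0] for col in cur.description]
        return [dict(zip(names, row)) for row in cur.fetchall()]
    finally:
        con.close()
```

Printing the result of read_info("megan-map-Oct2019.db") shows the info_string entries, but nothing I could find there gives the date of the underlying source data.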
Best,
Ralf