Accession mapping now uses a SQLite database

With release 6.18.0, we have replaced the mechanism used to map accessions to taxonomic and functional classes.
To process DIAMOND or other alignments against the NCBI-nr database, you now download a single mapping file, either
megan-map-Oct2019.db for the Community Edition, or megan-map-Oct2019-ue.db for the Ultimate Edition, and provide this to MEGAN or Meganizer.

This single file is a SQLite database that contains a table called mappings in which rows represent NCBI-nr accessions (the first one on each header line), and columns represent classifications.
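Because the file is an ordinary SQLite database, it can also be inspected outside of MEGAN. The following is a minimal sketch using Python's standard `sqlite3` module. Only the table name `mappings` comes from the description above; the helper names and the `key_column` argument are hypothetical, and the actual name of the accession column is not stated in this post, so list the columns first and pass the correct name.

```python
import sqlite3


def list_columns(conn, table="mappings"):
    """Return the column names of `table` without reading any rows."""
    # LIMIT 0 fetches no data, but the cursor still describes the columns.
    cur = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
    return [desc[0] for desc in cur.description]


def lookup_accession(conn, accession, key_column, table="mappings"):
    """Return the row for one accession as a dict, or None if it is absent.

    `key_column` is an assumption: check list_columns() for the real name
    of the accession column in your copy of the mapping file.
    """
    cur = conn.execute(
        f'SELECT * FROM "{table}" WHERE "{key_column}" = ?', (accession,)
    )
    row = cur.fetchone()
    if row is None:
        return None
    names = [desc[0] for desc in cur.description]
    return dict(zip(names, row))
```

With the real file this could be used as `conn = sqlite3.connect("megan-map-Oct2019.db")`, then `list_columns(conn)` to see which classifications are present, and finally `lookup_accession(conn, some_accession, key_column)` with the column name found in the previous step.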

The current classifications are Taxonomy, SEED, EGGNOG and INTERPRO2GO. The mapping file for UE also contains KEGG.

Note that EGGNOG, INTERPRO2GO and KEGG are up-to-date, whereas the MEGAN representation of SEED dates back to 2015. We are currently working on updating our version of SEED.

Please contact me with any questions regarding this new feature.

Hi Daniel,
could you give us more information about how these new files correspond to the old ones? We have one old taxonomy mapping file based on nt and one based on prot, but the new file is based on nr. Is prot equivalent to nr, and will nt be included later? We are also interested in the old mapping files, for example the one for the free version of KEGG, because that is not included in the new file. Will you make the old files available for download on another page, or will they no longer be compatible?

Also, will MALT be updated to match this update?

Thanks,
Irina

Dear Irina,

the old files are still on the server. I will add links to them again in the next update of the download webpage. I will add the old KEGG mapping to the next release of the db mapping file. I will also look into adding the DNA taxonomy mapping to the same db mapping file or, alternatively, creating a separate db mapping file for use with DNA alignments. I will try to get this all done by the end of this week (I have been traveling for the last three weeks).
Daniel

Dear @Daniel,
Is there a way to get the actual date of the source data you used for creating the current mapping file? I could find the creation date of the mappings (info_string in info table) but not the date of the underlying data used.
Best,
Ralf
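For readers who want to look up the creation information mentioned above, here is a minimal sketch with Python's `sqlite3` module. It assumes, based only on the wording of the question, that the database has a table named `info` with a column named `info_string`; the rest of that table's layout is not described in this thread, and the function name is hypothetical.

```python
import sqlite3


def read_info_strings(conn):
    """Return every info_string value stored in the info table.

    Assumes a table `info` with a column `info_string`, as the question
    above suggests; adjust the names if your file differs.
    """
    rows = conn.execute("SELECT info_string FROM info").fetchall()
    return [row[0] for row in rows]
```

This reports when the mapping itself was created, not when the underlying source data was downloaded.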

I assume that you mean: when was this data downloaded from its website?

That is currently not tracked and reported, but I understand that this is important and will look into providing that information soon.

Thank you, that is exactly what I meant.