Accession mapping now uses a SQLite database

With release 6.18.0, we have replaced the mechanism used to map accessions to taxonomic and functional classes.
To process DIAMOND or other alignments against the NCBI-nr database, you now download a single mapping file, either
megan-map-Oct2019.db for the Community Edition, or megan-map-Oct2019-ue.db for the Ultimate Edition, and provide this to MEGAN or Meganizer.

This single file is a SQLite database that contains a table called mappings in which rows represent NCBI-nr accessions (the first one on each header line), and columns represent classifications.

The current classifications are Taxonomy, SEED, EGGNOG and INTERPRO2GO. The mapping file for UE also contains KEGG.
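
For readers who want to look inside the mapping file, here is a minimal sketch using Python's standard sqlite3 module. Only the table name mappings comes from the description above. The name of the accession column is not stated in this thread, so the sketch first lists the columns and expects you to pass the key column in yourself. The file path and the example accession in the usage comment are placeholders.

```python
import sqlite3
from contextlib import closing


def list_columns(db_path, table="mappings"):
    """Return the column names of a table, in declared order."""
    with closing(sqlite3.connect(db_path)) as conn:
        # Each PRAGMA table_info row is (cid, name, type, notnull, default, pk).
        rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [row[1] for row in rows]


def lookup_accession(db_path, accession, key_column, table="mappings"):
    """Return one row as a dict keyed by column name, or None if not found.

    key_column is an assumption supplied by the caller: check the output
    of list_columns() to see what the accession column is actually called.
    """
    columns = list_columns(db_path, table)
    if key_column not in columns:
        raise ValueError(f"{key_column!r} is not a column of {table!r}: {columns}")
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute(
            f'SELECT * FROM "{table}" WHERE "{key_column}" = ?',
            (accession,),
        ).fetchone()
    return dict(zip(columns, row)) if row is not None else None


# Hypothetical usage (path, key column and accession are placeholders):
#   cols = list_columns("megan-map-Oct2019.db")
#   print(cols)
#   print(lookup_accession("megan-map-Oct2019.db", "SOME_ACCESSION", cols[0]))
```

Listing the columns first avoids guessing the schema, and it also shows which classifications a given mapping file actually contains.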

Note that EGGNOG, INTERPRO2GO and KEGG are up-to-date, whereas the MEGAN representation of SEED dates back to 2015. We are currently working on updating our version of SEED.

Please contact me with any questions regarding this new feature.

Hi Daniel,
could you give us more information about how these new files correspond to the old files? We have one old taxonomy file based on nt and another based on prot, but the new one is based on nr. Is prot equivalent to nr, and will nt be included later? We are also interested in the old mapping files, for example the one for the free version of KEGG, because that’s not included in this new file. Will you make the old files available for download on another page, or will they not be compatible?

Also, will MALT be updated to match this update?

Thanks,
Irina

Dear Irina,

the old files are still on the server. I will add links to them back on the next update of the download webpage. I will add the old KEGG mapping to the next release of the db mapping file. Also, I will look into adding the DNA taxonomy mapping to the same db mapping file, or, alternatively, I will create a separate db mapping file for use with DNA alignments. I will try to get this all done by the end of this week (I was traveling for the last three weeks).
Daniel

Dear @Daniel,
Is there a way to get the actual date of the source data you used for creating the current mapping file? I could find the creation date of the mappings (info_string in info table) but not the date of the underlying data used.
Best,
Ralf

I assume you mean: when was this data downloaded from its website?

That is currently not tracked and reported, but I understand that this is important and will look into providing that information soon.
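
In the meantime, the creation information mentioned above can be read directly from the file. This is a minimal sketch that assumes only what the question states, namely that the database has a table called info. It makes no assumption about that table's columns and simply returns every row keyed by column name. The file path in the usage comment is a placeholder.

```python
import sqlite3
from contextlib import closing


def read_info(db_path, table="info"):
    """Return all rows of the info table as a list of dicts."""
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(f'SELECT * FROM "{table}"')
        # cursor.description holds one tuple per column; item 0 is the name.
        names = [column[0] for column in cursor.description]
        return [dict(zip(names, row)) for row in cursor.fetchall()]


# Hypothetical usage (the path is a placeholder):
#   for entry in read_info("megan-map-Oct2019.db"):
#       print(entry)
```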

Thank you, that is exactly what I meant

Dear Daniel,

Is there any chance that the mapping database will be updated in the future? Or is there any other way that I could make an updated taxonomy mapping file using the latest NCBI taxonomy data (in either the Ultimate or the Community Edition)?

Thanks!

There is a new download page, at the same URL as the megan6 one, except that it ends in megan7.
This contains an up-to-date mapping file.
MEGAN7 is still an alpha release, meaning that it hasn’t yet been thoroughly debugged and tested. It works reasonably well.
However, I will upload a new release next week which will be in pretty good shape.

The new mapping file is not really backward compatible with MEGAN6, because the taxonomic and functional classifications have changed significantly since the last release and MEGAN6 still contains the old classifications.

I am currently writing a completely new user manual for MEGAN7 that will give an up-to-date description of all the features of the program.

Great, thank you for all the information!

Hi Daniel,

thanks for the update. When I try to unzip the megan-nr90-ue-r1.zip, I get:

unzip megan-nr90-ue-r1.zip
Archive: megan-nr90-ue-r1.zip
inflating: megan-nr90-ue-r1.mdb
error: invalid compressed data to inflate

I’ve tried downloading it three times now with two different methods but get the same error message, so I’m guessing it is not a network error. Is it possible that the file is corrupted? The mapping files for the complete nr database and nr50 seem to be fine.

Cheers,
Nicole
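
For anyone hitting a similar error, one way to tell a damaged archive from a problem with the local unzip tool is to check the CRCs with Python's standard zipfile module. This is only a sketch: the function name is made up for illustration, and the file name in the usage comment is the one from the post above.

```python
import zipfile


def first_corrupt_member(zip_path):
    """Return the name of the first member that fails its CRC check.

    Returns None if every member reads back cleanly. Raises
    zipfile.BadZipFile if the file is not a readable zip archive at all.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() reads every member and verifies its CRC and header.
        return archive.testzip()


# Hypothetical usage:
#   bad = first_corrupt_member("megan-nr90-ue-r1.zip")
#   print("archive looks intact" if bad is None else f"corrupt member: {bad}")
```

If the same member fails after a fresh download, the uploaded file itself is the likely culprit rather than the network.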

Sorry about that, I have just uploaded a new file. I was able to download and unzip the new file, so it should work now.