NCBI nr mapping file

When are you planning to update the protein NCBI nr mapping file?

I’ve been rechecking species matches manually using NCBI protein BLAST and finding species-level matches that are more up to date than those reported with the current mapping file.

Do I need to purchase the “Ultimate Edition”?

I will do this before the end of the month.

The UE uses the same mapping files; however, it is bundled with the tools that are used to create the mapping file:

The program MEGAN6UE/tools/ncbi/make-acc2ncbi creates an initial mapping file from data downloaded from NCBI.

The program MEGAN6UE/tools/utils/extend-mapping is then used to extend the mapping file by assuming that all accessions that appear on the same header line belong to the same taxon: they are all assigned to the LCA of all taxa that can be identified for the line.
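To illustrate the extension step, here is a rough Python sketch of the idea. It is not the actual extend-mapping code. The function names, the parent-map representation of the taxonomy and the toy taxon ids are all assumptions made up for this example.

```python
from typing import Dict, Iterable, List, Optional


def path_to_root(taxon: int, parent: Dict[int, int]) -> List[int]:
    """List the taxa from `taxon` up to the root, inclusive.

    Assumes the root either maps to itself or has no entry in `parent`.
    """
    path = [taxon]
    while taxon in parent and parent[taxon] != taxon:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lca(taxa: Iterable[int], parent: Dict[int, int]) -> Optional[int]:
    """Lowest common ancestor of the given taxa, or None if there are none."""
    taxa = list(taxa)
    if not taxa:
        return None
    # Ancestors of the first taxon, ordered from the taxon towards the root.
    common = path_to_root(taxa[0], parent)
    for t in taxa[1:]:
        ancestors = set(path_to_root(t, parent))
        common = [a for a in common if a in ancestors]
    # The first surviving entry is the deepest ancestor shared by all taxa.
    return common[0] if common else None


def extend_mapping(
    header_accessions: Iterable[List[str]],
    mapping: Dict[str, int],
    parent: Dict[int, int],
) -> Dict[str, int]:
    """Assign every accession on a header line to the LCA of the taxa
    already known for that line (hypothetical helper, not MEGAN code)."""
    extended = dict(mapping)
    for accessions in header_accessions:
        known = {mapping[a] for a in accessions if a in mapping}
        taxon = lca(known, parent)
        if taxon is None:
            continue  # nothing on this line could be identified
        for a in accessions:
            extended[a] = taxon
    return extended


# Toy taxonomy (made-up ids): 1 is the root, 2 and 20 are its children,
# 10 and 11 are children of 2.
parent = {1: 1, 2: 1, 10: 2, 11: 2, 20: 1}
mapping = {"ACC_A": 10, "ACC_B": 11}
headers = [["ACC_A", "ACC_B", "ACC_C"], ["ACC_D"]]
extended = extend_mapping(headers, mapping, parent)
```

In this toy run the unmapped accession ACC_C inherits taxon 2, the LCA of 10 and 11, while ACC_D stays unmapped because nothing on its line could be identified. Following the description above, the accessions that were already mapped on that line are also moved to the LCA.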

To run these programs you need to use a server with 100+GB.

I have just uploaded the mapping files and updated the taxonomy to the version as of May 2nd, 2017.


I really appreciate your quick response. Analyzing metagenomic samples sometimes feels like hitting a moving target. Having the updated taxonomy mapping file is a big help in keeping up with GenBank updates.



Likewise, the maintenance of microbiome analysis software is a moving target…

