Lots of matches don't have a taxonomy


I used DIAMOND blastp to align a series of proteins against the nr database, which produced a .daa file.

I then meganized my .daa file and explored it with MEGAN, which works very smoothly.
However, when I inspect a taxon and its protein hits, there are a lot of matches that don’t have a taxonomy assigned (lines with the following format: “?; score=250.0”). I attach an example below.

This worries me because the matches with no taxonomy usually have a higher score than the match that has been retained (the highest-scoring match that has a taxonomy). It looks as if lots of sequences in the nr database don’t have a taxonomy assigned in the mapping file (I am using megan-map-Jan2021.db).
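
To put a number on this, here is a minimal sketch. It assumes the match lines have been copied out of MEGAN into a plain list of strings, and that matches with a taxonomy use the same "label; score=..." layout as the "?" lines quoted above. Both are assumptions, and `summarize_matches` is a hypothetical helper, not part of MEGAN or DIAMOND.

```python
import re

# Assumed line layout, modelled on the "?; score=250.0" lines quoted above:
#   "<taxon label or ?>; score=<number>"
MATCH_LINE = re.compile(
    r"^\s*(?P<label>.*?);\s*score=(?P<score>[0-9]+(?:\.[0-9]+)?)\s*$"
)


def summarize_matches(lines):
    """Count matches with and without a taxonomy label.

    Also reports how many unassigned matches score higher than the best
    match that does carry a taxonomy label.
    """
    unassigned = []  # scores of lines whose label is "?"
    assigned = []    # scores of lines with any other label
    for line in lines:
        m = MATCH_LINE.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed layout
        score = float(m.group("score"))
        if m.group("label").strip() == "?":
            unassigned.append(score)
        else:
            assigned.append(score)

    best_assigned = max(assigned, default=None)
    if best_assigned is None:
        # No labelled match at all, so every unassigned match outranks it.
        outranking = len(unassigned)
    else:
        outranking = sum(1 for s in unassigned if s > best_assigned)

    return {
        "unassigned": len(unassigned),
        "assigned": len(assigned),
        "best_assigned_score": best_assigned,
        "unassigned_above_best_assigned": outranking,
    }
```

A large `unassigned_above_best_assigned` count for a given query would be consistent with the top hits missing from the mapping file rather than with poor alignments.
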
Is there a way to fix this issue? How do you make the mapping file?
Thank you very much.


Dear Thomas, this does not look good. Are you using the latest NCBI-nr?
I assume that these are new entries. I will work on providing a new mapping database by the end of September. That should fix the problem.

I think I might have a similar problem. I used MALT in BlastN mode with the newest mapping file for MEGAN. It is not quite to the same extent, but I guess it might be the same problem?

It also results in a lot of unassigned reads: they still map to one (or more) of the references, but they just don’t get classified.

I think my problem described in Strange "coincidences" might be related to this, although I get no taxonomy assignments at all and am using the newest megan-map file.