NCBI taxonomy mapping file 2023

Dear Daniel,

Approximately when will a recent (2023) protein/nucleotide accession to NCBI taxonomy mapping file be available?

Thank you in advance for any news.

Best regards, Balázs

Hi Balázs,

If you only need the taxonomy classification, I regularly update the megan-map.db with the prot.accession2taxid file from NCBI using sqlite3 UPDATE for existing accessions and INSERT for new accessions.
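The UPDATE-then-INSERT approach described above could look roughly like the sketch below. It is only an illustration under assumptions: the table name `mappings` and the columns `Accession` and `Taxonomy` are my guess at the layout of megan-map.db (check the real schema first, for example with `.schema` in the sqlite3 shell), and the function names are made up for this example. The column order of prot.accession2taxid (accession, accession.version, taxid, gi) is the one NCBI documents.

```python
import sqlite3


def parse_accession2taxid(lines):
    """Yield (accession, taxid) pairs from prot.accession2taxid lines.

    The file is tab-separated with a header line:
    accession, accession.version, taxid, gi.
    """
    rows = iter(lines)
    next(rows, None)  # skip the header line
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            # Column 0 is the accession without version; use column 1
            # instead if the database stores versioned accessions.
            yield fields[0], int(fields[2])


def upsert_taxids(conn, pairs):
    """UPDATE the taxon ID of existing accessions, INSERT new ones.

    Assumes a table `mappings(Accession, Taxonomy, ...)`; this schema
    is an assumption and must be checked against the real database.
    Returns the tuple (updated, inserted).
    """
    cur = conn.cursor()
    updated = inserted = 0
    for accession, taxid in pairs:
        cur.execute(
            "UPDATE mappings SET Taxonomy = ? WHERE Accession = ?",
            (taxid, accession),
        )
        if cur.rowcount == 0:
            # No row matched, so the accession is new.
            cur.execute(
                "INSERT INTO mappings (Accession, Taxonomy) VALUES (?, ?)",
                (accession, taxid),
            )
            inserted += 1
        else:
            updated += 1
    conn.commit()  # one commit at the end keeps large imports fast
    return updated, inserted
```

With the real files, the lines could come from `gzip.open("prot.accession2taxid.gz", "rt")` and the connection from `sqlite3.connect("megan-map.db")`. Working on a copy of the database is safer, since the update rewrites it in place.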

Best,
Greg

Hi Greg,

I would really appreciate it if you could provide the updated megan-map.db file. The alternative, using DIAMOND output format 102 with custom MEGAN ncbi.map / ncbi.tre files (after CSV import), is very cumbersome, and it might not work with the huge tabular files at read level (I’m trying to do it at contig level). I read Daniel’s post on this, but unfortunately I’m not yet familiar with database programming, so the file would save me a lot of time. Oh, and yes, I’m only interested in the taxonomy classification part :)

Thank you very much, Balázs

I’ve spent the whole week on this and wasn’t able to update everything… So now I’m aiming to at least make a new mapping database available for the NCBI taxonomy.

Dear Daniel,

Many thanks in advance!

B

Many thanks! I am looking forward to it as well. I have seen cases where reads were assigned to an unexpected species, only to find (via the “Show Alignment” tool) that the matching proteins listed under the assigned species had accession numbers that actually belong to the expected species according to the NCBI database. I was using megan-map-Feb2022.db and a copy of the NR database downloaded just a couple of weeks ago.
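One way to check whether such cases come from outdated entries in the mapping database is to compare the stored taxon IDs with current ones from NCBI. The sketch below is only an illustration: the table `mappings` with columns `Accession` and `Taxonomy` is an assumed layout for the MEGAN mapping database, and `find_stale_mappings` is a hypothetical helper name. The `current` argument is any iterable of (accession, taxid) pairs, for example parsed from NCBI's prot.accession2taxid file.

```python
import sqlite3


def find_stale_mappings(conn, current):
    """Yield (accession, stored_taxid, current_taxid) where they differ.

    Assumes a table `mappings(Accession, Taxonomy, ...)`; this schema is
    an assumption, not a documented layout. Accessions that are missing
    from the database are skipped.
    """
    cur = conn.cursor()
    for accession, taxid in current:
        row = cur.execute(
            "SELECT Taxonomy FROM mappings WHERE Accession = ?",
            (accession,),
        ).fetchone()
        if row is not None and row[0] != taxid:
            yield accession, row[0], taxid
```

Running this only on the accessions seen in the suspicious alignments keeps the check quick.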

Appreciate the effort!

Dear @Daniel,

I was wondering if there was any update to the taxonomy database? The only one I can find in the downloads is the Feb-2022 one. I’m struggling with a highly diverse rumen community of cows, for which the updated db would be extremely valuable.

I appreciate the help.

  • HM

Hi, I am analysing a large number of viral metagenomics datasets and would also really appreciate an updated MEGAN mapping file for 2024.

Cheers, Anne-Lie

Hi, just adding on here that an update to megan-nucl-Feb2022.db.zip would be very helpful!

There is a new download page for MEGAN7, accessible through the same URL as MEGAN6 but ending in megan7. This page contains an up-to-date mapping file. MEGAN7 is still in alpha release, so it hasn’t been thoroughly debugged and tested yet, but it works reasonably well. A new release will be available next week, expected to be in better shape.

The new mapping file isn’t backward compatible with MEGAN6 due to significant changes in taxonomic and functional classifications. We are also writing a new user manual for MEGAN7 to cover all program features comprehensively.

Looking forward to this! I am currently using the NCBI nt database with MEGAN6. Do you know if the nt database file will be uploaded for MEGAN7?

Or, in the meantime, is there a more recent version of the taxonomy db file for MEGAN6? I’m getting many unassigned taxonomies due to the NCBI taxonomy not lining up with the MEGAN db for later reads. Thanks!

@clalgudi, if you are aligning against a DNA database that uses NCBI accessions, you can use this MEGAN mapping database from the download page:

megan-DNA-r1.zip

@Anupam Thank you for the quick response! I tried clicking the link to download, but it says:

Not Found
The requested URL was not found on this server.
Apache/2.4.52 (Ubuntu) Server at software-ab.cs.uni-tuebingen.de Port 443

Does the link seem to work for you? Thanks!

@clalgudi, please try again; it should work now.

Curious to try MEGAN7, but I can't install it on my Ubuntu machine. Running the Unix .sh script gives me "gzip: sfx_archive.tar.gz: not in gzip format".

Was I supposed to download something else along with the script?

@OmarKR, it works fine for me. Can you try re-downloading and installing it once again? There might have been an error during the download.

Just to make sure I understand: I should save the script on my machine, use chmod to make it executable, and then run it with bash?

@OmarKR,

In terminal:

wget https://software-ab.cs.uni-tuebingen.de/download/megan7/MEGAN_Community_unix_7_0_9-beta.sh
chmod +x MEGAN_Community_unix_7_0_9-beta.sh
./MEGAN_Community_unix_7_0_9-beta.sh

Then follow the instructions provided.

Thank you. I've gotten it running and will test it out.

Is the mapping file really that big, 32 GB?
The previous one was just 9 GB or so.

@OmarKR, yes, the full NCBI-nr mapping is big.