Finally, we successfully tested the create-accession-db command (version 6.25.9, built 16 Jan 2024) with a partial NR database file containing only virus sequences, but unfortunately we noticed another issue. Even after we load the newest, alternative taxonomy files (generated with the taxdmp2tree tool) into MEGAN, create-accession-db still uses the older default taxonomy data, as the ncbi.map and ncbi.tre counts in the log below show. In the MEGAN GUI, the newer versions are shown every time after the update, and their locations are specified in the .MEGAN.def file.
Is there a way to make create-accession-db use the updated taxonomy files?
Thanks in advance!
./create-accession-db -c Taxonomy -i accessionmap202402.map -o accessionmap202409-virus.db --nr nr_virus.faa -v --threads 32
Parsing input files: accessionmap202402.map
Processing file: accessionmap202402.map
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (695.3s)
Taxonomy:1,253,086,245 from file: accessionmap202402.map
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Using LCA for Taxonomy
Processing file: nr_virus.faa
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (61.8s)
Taxonomy: 7,651,841
Creating mappings table: init
Merging all:
Database vacuum
100% (143.0s)
Created 2024-02-19 12:01:07
Taxonomy: Source: accessionmap202402.map
Total time: 912.2s
Peak memory: 173.1 of 193.4G
For comparison, the latest NCBI taxonomy files (which contain about 10,000 more virus “species” than the default ones) have these entry counts:
ncbi.map: 2,550,755
ncbi.tre: 2,550,755
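
One way to check which taxonomy a run picked up is to compare the count on the "Loading ncbi.map" log line with the number of entries in the ncbi.map file you expect it to use. Below is a minimal Python sketch of that check. It assumes ncbi.map holds one taxon per non-blank line, which is an assumption about the file layout, and the function names are hypothetical helpers, not part of MEGAN.

```python
import re


def parse_logged_count(log_text, file_name):
    """Return the count from a 'Loading <file_name>: N' log line, or None if absent."""
    pattern = r"Loading " + re.escape(file_name) + r":\s*([\d,]+)"
    match = re.search(pattern, log_text)
    if match is None:
        return None
    # The log prints thousands separators (e.g. 2,396,736), so strip them.
    return int(match.group(1).replace(",", ""))


def count_map_entries(path):
    """Count non-blank lines in a mapping file.

    Assumption: the file stores one taxon per line.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip())


def log_matches_map(log_text, map_path):
    """True if the logged ncbi.map count equals the entry count of map_path."""
    logged = parse_logged_count(log_text, "ncbi.map")
    return logged is not None and logged == count_map_entries(map_path)
```

Applied to the log above, parse_logged_count(log_text, "ncbi.map") returns 2396736, which does not match the 2,550,755 entries of the newer file. That is consistent with the tool having loaded the default taxonomy.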