No taxonomic assignments

HelenH · July 25, 2017, 10:52am

I am an old hand at MEGAN version 5.9, but I am having a bit of trouble using DIAMOND 0.9.9 and MEGAN 6.8.9.

I have been testing DIAMOND for BLAST searches against NCBI nr, downloaded and formatted for DIAMOND. I get nice BLAST outputs in the default tab format from FASTA or FASTQ input, and I have also produced the .daa output for the same files.

I use the GUI for importing the BLAST file or meganising the .daa file. The problem is that I keep getting no taxonomic assignments and very few functional ones, even though I am specifying the mapping files from the download page for the community version. I also made sure I used the correct KEGG map file (old version). This happens whether I use the .daa file or a BLAST file. I am blasting 23 million RNA sequences with DIAMOND, and the search returns 72,670,123 hits. When I import, I get even fewer hits: 5,449,250.

Any idea why this is happening, and how I can fix it to get proper assignments like I used to in MEGAN 5?

Here is the info from the MEGAN log:

Executing: import blastFile=’/group/soil/helen/diamond/AV160testfastq.m8’ fastaFile=’/group/soil/helen/diamond/AV160_L007_MGFiltered_2P.fastq’ meganFile=’/group/soil/helen/diamond/AV160testfastq.rma6’ useCompression=true format=BlastTab mode=BlastX maxMatches=5 minScore=50.0 maxExpected=0.01 minPercentIdentity=0.0 topPercent=10.0 minSupportPercent=0.01 minSupport=1 lcaAlgorithm=naive minPercentReadToCover=0.0 minComplexity=0.0 useIdentityFilter=false readAssignmentMode=readCount fNames=EGGNOG INTERPRO2GO KEGG SEED;
Executing: ‘import’‘blastFile’’=’’/group/soil/helen/diamond/AV160testfastq.m8’‘fastaFile’’=’’/group/soil/helen/diamond/AV160_L007_MGFiltered_2P.fastq’‘meganFile’’=’’/group/soil/helen/diamond/AV160testfastq.rma6’‘useCompression’’=’‘true’‘format’’=’‘BlastTab’‘mode’’=’‘BlastX’‘maxMatches’’=’‘5’‘minScore’’=’‘50.0’‘maxExpected’’=’‘0.01’‘minPercentIdentity’’=’‘0.0’‘topPercent’’=’‘10.0’‘minSupportPercent’’=’‘0.01’‘minSupport’’=’‘1’‘lcaAlgorithm’’=’‘naive’‘minPercentReadToCover’’=’‘0.0’‘minComplexity’’=’‘0.0’‘useIdentityFilter’’=’‘false’‘readAssignmentMode’’=’‘readCount’‘fNames’’=’‘EGGNOG’‘INTERPRO2GO’‘KEGG’‘SEED’;
Classifications: Taxonomy,SEED,EGGNOG,KEGG,INTERPRO2GO
Parsing file: /group/soil/helen/diamond/AV160testfastq.m8
Total reads: 5,449,250
Alignments: 21,222,213
Binning reads…
Using Naive LCA algorithm for binning: Taxonomy
Using Best-Hit algorithm for binning: SEED
Using Best-Hit algorithm for binning: EGGNOG
Using Best-Hit algorithm for binning: KEGG
Using Best-Hit algorithm for binning: INTERPRO2GO
Total reads: 5,449,250
With hits: 5,449,250
Alignments: 21,222,213
Assig. Taxonomy: 0
Assig. SEED: 0
Assig. EGGNOG: 0
Assig. KEGG: 0
Assig. INTERPRO2GO: 0
MinSupport set to: 544
Min-supp. changes: 0
Numb. Tax. classes: 1
Numb. SEED classes: 1
Numb. EGG. classes: 1
Numb. KEGG classes: 1
Numb. INT. classes: 1
Class. Taxonomy: 1
Class. SEED: 1
Class. EGGNOG: 1
Class. KEGG: 1
Class. INTERPRO2GO: 1
Info: Command completed (1427s): ‘import’‘blastFile’’=’’/group/soil/helen/diamond/AV160testfastq.m8’‘fastaFile’’=’’/group/soil/helen/diamond/AV160_L007_MGFi…
Induced tree has 2 of 1,601,131 nodes
Induced tree has 2 of 1,601,131 nodes
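
One way to cross-check the read and alignment totals in the log above is to tally them directly from the tabular file. The following is a minimal sketch, assuming the default 12-column BLAST tabular layout with the query ID in the first column; the function name `tally_blast_tab` and the sample rows are made up for illustration and are not part of DIAMOND or MEGAN.

```python
import io

def tally_blast_tab(lines):
    """Return (number of distinct query IDs, number of alignment rows)."""
    queries = set()
    alignments = 0
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines, if any are present.
        if not line or line.startswith("#"):
            continue
        # The first tab-separated field is assumed to be the query (read) ID.
        query_id = line.split("\t", 1)[0]
        queries.add(query_id)
        alignments += 1
    return len(queries), alignments

# Tiny made-up sample (not real data) in the 12-column tabular layout.
sample = io.StringIO(
    "read1\tWP_000000001.1\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-20\t95.0\n"
    "read1\tWP_000000002.1\t85.0\t50\t7\t0\t1\t150\t1\t50\t1e-18\t88.0\n"
    "read2\tWP_000000003.1\t99.0\t40\t0\t0\t1\t120\t1\t40\t1e-22\t99.0\n"
)
n_reads, n_alignments = tally_blast_tab(sample)
print(n_reads, n_alignments)
```

For a real file, pass an open file handle instead of the sample. Holding every query ID in a set costs memory on files with millions of reads, so this is meant as a quick sanity check rather than a tuned tool.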

internet_nobody · August 1, 2017, 6:05pm

Are you using the correct taxonomy file - acc vs gi? I was given some old files and couldn’t work out why they wouldn’t assign taxonomy but “my” files worked fine and then realised they’d been produced with an old version of NR that had GI numbers not accession.
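
A quick way to see which kind of identifier an alignment file carries is to look at the subject ID column. The sketch below is a rough heuristic, assuming BLAST tabular output with the subject ID in the second column; the two regular expressions, the function names, and the sample IDs are illustrative assumptions, not an official NCBI grammar or real records.

```python
import re
from collections import Counter

# Heuristic patterns (assumptions for illustration only).
GI_STYLE = re.compile(r"^gi\|\d+\|")                # e.g. gi|123456|ref|...|
ACCESSION_STYLE = re.compile(r"^[A-Z0-9_]+\.\d+$")  # e.g. WP_000000001.1

def subject_id_style(subject_id):
    """Classify one subject ID as 'gi', 'accession', or 'other'."""
    if GI_STYLE.match(subject_id):
        return "gi"
    if ACCESSION_STYLE.match(subject_id):
        return "accession"
    return "other"

def summarize_subject_ids(lines, limit=100000):
    """Count ID styles in the second column of the first `limit` lines."""
    counts = Counter()
    for i, line in enumerate(lines):
        if i >= limit:
            break
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # not a tabular alignment row
        counts[subject_id_style(fields[1])] += 1
    return counts

# Made-up rows, truncated to three columns for brevity.
rows = [
    "read1\tgi|123456|ref|WP_000000001.1|\t90.0\n",
    "read2\tWP_000000001.1\t85.0\n",
]
print(summarize_subject_ids(rows))
```

If nearly every row lands in the "gi" bucket, an accession-based mapping file is unlikely to match, and the reverse holds as well.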

HelenH · August 3, 2017, 9:16am

I downloaded the file nucl_acc2tax-May2017.abin.zip (nucleotide accession to NCBI-taxonomy mapping file) from the MEGAN 6 community download page to assign the taxonomy. I read the manual, and it said that it assigns by default, so theoretically I didn’t need to tell it to use the unzipped file above. Either way, the assignment isn’t happening.

I might try the old file, nucl_acc2tax-May2017.abin.zip (nucleotide accession to NCBI-taxonomy mapping file), and see if that works?

Daniel · August 3, 2017, 1:43pm

If you are using DIAMOND then you are aligning against proteins, in which case please use

prot_acc2tax-May2017.abin

Did you manually select all the relevant acc2XXX files before attempting to meganize the DAA file? (Your log doesn’t show what happened when you did that). Please make sure that you unzip the accession mapping files before attempting to use them. They are indexed files that won’t work while compressed.
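
The unzip step can be checked programmatically before pointing MEGAN at a mapping file. This is a minimal sketch using Python's standard `zipfile` module; the helper name `ensure_unzipped` is hypothetical, and it assumes the archive contains exactly one member (the mapping file itself).

```python
import os
import zipfile

def ensure_unzipped(path, out_dir=None):
    """Return the path of an uncompressed file, extracting `path` if it is a zip archive."""
    if not zipfile.is_zipfile(path):
        # Not a zip archive: assume it is already the uncompressed file.
        return path
    out_dir = out_dir or os.path.dirname(os.path.abspath(path))
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        # Assumption: a mapping archive holds a single file.
        if len(names) != 1:
            raise ValueError(f"expected one member in {path}, found {len(names)}")
        archive.extractall(out_dir)
    return os.path.join(out_dir, names[0])
```

Calling `ensure_unzipped` on a downloaded `.abin.zip` file would return the path of the extracted `.abin` file, which is the one to select in the import or meganize dialog.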