Java error when using daa-meganizer

Hi,

I’m new to MEGAN and I’m interested in the taxonomic analysis at the moment. I used DIAMOND to align my reads but I have a problem with daa-meganizer. Here is the error message I get:

java.lang.NullPointerException
at java.lang.System.arraycopy(Native Method)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:128)
at megan.daa.io.DAAModifier.appendBlocks(DAAModifier.java:137)
at megan.daa.io.DAAModifier.appendBlocks(DAAModifier.java:155)
at megan.daa.DAAReferencesAnnotator.apply(DAAReferencesAnnotator.java:150)
at megan.daa.Meganize.apply(Meganize.java:66)
at megan.tools.DAAMeganizer.run(DAAMeganizer.java:216)
at megan.tools.DAAMeganizer.main(DAAMeganizer.java:54)

Before using daa-meganizer, I used DIAMOND to align my reads against the NR database. Here is the command I used:

diamond blastx --threads 8 --db nr.fasta --out output.daa --outfmt 100 --verbose --query input.fastq

And here is the daa-meganizer command I tried to launch next:

input="-i 1M_R1.daa"
output="-o 1M_R1.output"
options="-v --parseTaxonNames false"
tax="-a2t /home/ec2-user/tools/softwares/megan-6.7.15/mapping_files/prot_acc2tax-Nov2016.abin"

daa-meganizer $input $options $tax

Did I do something wrong?

And one last question: will the output files generated by daa-meganizer be readable?

Thank you for your help!

Best,
Thibaut

Hi,
you need to build the diamond index first:

diamond makedb --in nr.faa -d nr

that you later provide to the diamond run:

diamond blastx -d nr -q reads.fna -o matches.m8

hope it helps,

Ania

Hi Ania,

Unfortunately no, because the problem does not come from DIAMOND but from the daa-meganizer command.

input="-i reads.daa"
output="-o reads.output"
options="-v --parseTaxonNames false"
tax="-a2t prot_acc2tax-Nov2016.abin"

daa-meganizer $input $options $tax

Here is the complete verbose output:

DAAMeganizer - Prepares ('meganizes') a DIAMOND .daa file for use with MEGAN
Options:
Files
--in: reads.daa
Reads
--paired: false
--pairedSuffixLength: 0
Parameters
--longReads: false
--classify: true
--minScore: 50.0
--maxExpected: 0.01
--minPercentIdentity: 0.0
--topPercent: 10.0
--minSupportPercent: 0.01
--minSupport: 0
--lcaAlgorithm: Naive
Functional classification:
Classification support:
--parseTaxonNames: false
--acc2taxa: prot_acc2tax-Nov2016.abin
Other:
--firstWordIsAccession: true
--accessionTags: gb| ref|
--verbose: true
Version MEGAN Community Edition (version 6.7.15, built 14 Apr 2017)
Copyright © 2017 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 1.8.0_121
Loading ncbi.map: 1,562,782
Loading ncbi.tre: 1,562,785
Opening file: prot_acc2tax-Nov2016.abin
Annotating references
10% 20% 30% 40% 50% 60% 70% 80% 90% Caught:
java.lang.NullPointerException
at java.lang.System.arraycopy(Native Method)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:128)
at megan.daa.io.DAAModifier.appendBlocks(DAAModifier.java:137)
at megan.daa.io.DAAModifier.appendBlocks(DAAModifier.java:155)
at megan.daa.DAAReferencesAnnotator.apply(DAAReferencesAnnotator.java:150)
at megan.daa.Meganize.apply(Meganize.java:66)
at megan.tools.DAAMeganizer.run(DAAMeganizer.java:216)
at megan.tools.DAAMeganizer.main(DAAMeganizer.java:54)

Thank you for your time!

Best,
Thibaut

Dear Thibaut,

I’ve taken a look at the part of the code where the meganizer throws the exception, but it is not clear to me what the problem is. Could you tell me the size of the DAA file? (Is it very big, i.e. more than 50 GB? That shouldn’t be a problem, though.) Is it possible that you ran out of disk space? That could cause the null pointer exception.
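Both questions can be answered from the command line. The following is only a minimal sketch, assuming a POSIX shell with the standard `wc`, `df` and `awk` utilities; `report_space` is a hypothetical helper name and is not part of MEGAN or DIAMOND.

```shell
#!/bin/sh
# report_space (hypothetical helper): print the size of a file and the free
# space on the filesystem that holds it.
report_space() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    # File size in bytes; tr strips the padding some wc implementations add.
    size=$(wc -c < "$file" | tr -d ' ')
    echo "size_bytes=$size"
    # Free space in KiB on the filesystem containing the file.
    # -P forces the one-line-per-filesystem POSIX output format.
    df -P -k "$file" | awk 'NR==2 {print "free_kb=" $4}'
}

# Example call (the file name is a placeholder):
# report_space 1M_R1.daa
```

If `free_kb` is small compared with `size_bytes`, running out of disk space during the run becomes a plausible explanation.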
When did you download the file prot_acc2tax-Nov2016.abin? I recently uploaded a new version of the file because it was causing problems for some users. Unless you downloaded it within the last few days, please try re-downloading the file and then rerunning the program.
If all that fails, then please give me access to a file that exhibits the problem and I will run the program in my debugger.
D

Hi @Daniel,

I will download and index the NR database with DIAMOND again, and re-download the prot_acc2tax-Nov2016.abin file too.

Then, I will try to use daa-meganizer after mapping reads with DIAMOND and let you know if there is still a problem.

Thibaut

PS: Is it possible to give MEGAN a SAM file from another aligner, such as bowtie2?

I don’t think that you need to download the NR database and DIAMOND again. My guess is that it might be the mapping file prot_acc2tax-Nov2016.abin (because the error occurs in a part of the code that deals with the mapping file).

In theory MEGAN should be able to process SAM as generated by Bowtie, but this hasn’t been tested much (because it usually isn’t a good idea to run Bowtie on microbiome data unless the organisms in the sample are very closely related to sequenced reference organisms). If you run into any problems, please let me know.

Hi @Daniel,

I re-downloaded the prot_acc2tax-Nov2016.abin file and reran my analysis, but I still have the problem with the daa-meganizer command.

Here are the commands I launched:

  1. diamond makedb --in nr.fasta --db nr.fasta
  2. diamond blastx --threads 8 --db nr.fasta.dmnd --out 1M_R1.daa --outfmt 100 --verbose --query 1M_R1.fastq
  3. input="-i 1M_R1.daa"
     output="-o 1M_R1.daa.meganized"
     options="-v --parseTaxonNames false"
     tax="-a2t prot_acc2tax-Nov2016.abin"
     daa-meganizer $input $options $tax

Please find the .daa file produced by DIAMOND, the DIAMOND log file and the daa-meganizer log file HERE

Thibaut

Hi @Daniel,

I’m still having a problem when using daa-meganizer. Do you think I’m doing something wrong, or could it be a bug?

The .daa file produced by DIAMOND is attached to my previous message.

Let me know if you need something else!

Thank you for your time!

Best,
Thibaut

Dear Thibaut,

Please note that daa-meganizer does not produce a new file, so -o is not a valid option for the program.
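Since the program works on the input file itself, one way to keep the original DIAMOND output untouched is to meganize a copy. The following is only a sketch: `meganize_copy` is a hypothetical wrapper that is not part of MEGAN, the file names are placeholders, and only options that already appear in this thread (`-i`, `-v`, `--parseTaxonNames`, `-a2t`) are used.

```shell
#!/bin/sh
# meganize_copy (hypothetical helper): copy the DAA file, then run
# daa-meganizer on the copy so the original stays as DIAMOND wrote it.
meganize_copy() {
    src="$1"      # original DIAMOND output, e.g. 1M_R1.daa
    mapping="$2"  # accession-to-taxonomy mapping file (.abin)
    copy="${src%.daa}.meganized.daa"

    cp "$src" "$copy" || return 1
    # Note that no -o option is passed: the tool annotates its input file.
    daa-meganizer -i "$copy" -v --parseTaxonNames false -a2t "$mapping" || return 1
    # The last line printed is the path of the meganized copy.
    echo "$copy"
}

# Example call (placeholder file names):
# meganize_copy 1M_R1.daa prot_acc2tax-Nov2016.abin
```

The copy costs extra disk space, so this only makes sense when the volume has room for a second file of the same size.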

Unfortunately, I can’t reproduce the error that you reported; for me, daa-meganizer runs without problems on your dataset. The log file that you sent me indicates that the problem lies with parsing of the mapping file prot_acc2tax-Nov2016.abin.

Can you please double-check that your downloaded copy of prot_acc2tax-Nov2016.abin is not corrupted? To do so, please run:

md5sum prot_acc2tax-Nov2016.abin

and verify that you get this checksum: af338c5056e1a450ebc9d089f330ca40
If not, please re-download the file and check again. Let me know if you still get a discrepancy.
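
The comparison can also be scripted rather than done by eye. This is a minimal sketch, assuming GNU coreutils `md5sum` is available (as on most Linux systems); `check_md5` is a hypothetical helper name.

```shell
#!/bin/sh
# check_md5 (hypothetical helper): compare a file's MD5 checksum with an
# expected value and report the result.
check_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK"
    else
        echo "MISMATCH: got $actual"
        return 1
    fi
}

# Example call, using the checksum quoted above:
# check_md5 prot_acc2tax-Nov2016.abin af338c5056e1a450ebc9d089f330ca40
```

A non-zero exit status signals a mismatch, so the check can gate a re-download in a larger script.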