Meganize Diamond DAA to RMA files

Hello there,

I am having difficulty meganizing my DIAMOND DAA files into an RMA file for MEGAN 6. When I open the RMA file in MEGAN, everything remains unassigned.

This is the script:

/home/tanshiming/tools/megan6/tools/daa2rma -i ${INPUT}-prodigal.daa -o ${INPUT}-prodigal.rma -fun EGGNOG INTERPRO2GO SEED -g2t /home/tanshiming/tools/megan6/mapping-files/prot_acc2tax-Nov2016.abin -g2eggnog /home/tanshiming/tools/megan6/mapping-files/acc2eggnogg-Oct2016X.abin -g2interpro2go /home/tanshiming/tools/megan6/mapping-files/acc2interpro-Oct2016X.abin -g2seed /home/tanshiming/tools/megan6/mapping-files/acc2interpro-Oct2016X.abin

Can someone guide me on this please?


The command looks correct…

  • What database did you diamond-blast against? This should work against ncbi-NR

Also, please consider running daa-meganizer rather than daa2rma. daa-meganizer adds some blocks to a daa file so that it can be opened in MEGAN. You can re-meganize such a file whenever you like. This is much faster than running daa2rma and doesn’t create yet another file…

@Daniel, thanks for the reply.

It was diamond-blasted against the NCBI-NR database.

Could you please run with -v (verbose) and check that you see this:

--firstWordIsAccession: true

If not, then please explicitly set it to true like this:

--firstWordIsAccession true

This needs to be set to “true” for NR releases from September 2017 or later.

At present this is set to “true” by default in daa-meganizer, but set to “false” for daa2rma and blast2rma.
Sorry about that, I will fix this in the next update (then the default will be “true”).
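
For anyone who wants to script this check, here is a minimal sketch. It assumes the verbose output has been saved to a log file and contains a line of the form quoted above; the exact layout of MEGAN's verbose output may differ. The helper name fwa_is_true and the log file name are made up for illustration.

```shell
# Sketch only: check a saved verbose (-v) log for the firstWordIsAccession setting.
# Assumption: the log contains a line like "--firstWordIsAccession: true".

fwa_is_true() {
    # $1: path to a log file captured from a run with -v
    # Succeeds (exit status 0) only if the option is echoed as true.
    grep -Eq -- '-+firstWordIsAccession:?[[:space:]]+true' "$1"
}

# Hypothetical usage (file names are placeholders):
#   daa2rma <your usual options> -v > run.log 2>&1
#   if fwa_is_true run.log; then
#       echo "firstWordIsAccession is true"
#   else
#       echo "re-run with --firstWordIsAccession true"
#   fi
```

If the check fails, adding --firstWordIsAccession true to the daa2rma call, as described above, is the thing to try.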

I need help meganizing a file on Linux from the command line. I am running my data on a remote server and cannot access the manual or the help in the terminal. May I ask for the command-line options, especially those for meganizing files on Linux?

If you run megan/tools/daa-meganizer with option -h then you will get this help output:

Meganizer [options]
Prepares (‘meganizes’) a DIAMOND .daa file for use with MEGAN
-i, --in [string(s)] Input DAA file. Mandatory option.
-mdf, --metaDataFile [string(s)] Files containing metadata to be included in files.
-pr, --paired Reads are paired. Default value: false.
-ps, --pairedSuffixLength [number] Length of name suffix used to distinguish between name of read and its mate. Default value: 0.
-lg, --longReads Parse and analyse as long reads. Default value: false.
-class, --classify Run classification algorithm. Default value: true.
-ms, --minScore [number] Min score. Default value: 50.0.
-me, --maxExpected [number] Max expected. Default value: 0.01.
-mpi, --minPercentIdentity [number] Min percent identity. Default value: 0.0.
-top, --topPercent [number] Top percent. Default value: 10.0.
-supp, --minSupportPercent [number] Min support as percent of assigned reads (0==off). Default value: 0.05.
-sup, --minSupport [number] Min support (0==off). Default value: 0.
-mrc, --minPercentReadCover [number] Min percent of read length to be covered by alignments. Default value: 0.0.
-mrefc, --minPercentReferenceCover [number] Min percent of reference length to be covered by alignments. Default value: 0.0.
-alg, --lcaAlgorithm [string] Set the LCA algorithm to use for taxonomic assignment. Default value: naive. Legal values: naive, weighted, longReads
-lcp, --lcaCoveragePercent [number] Set the percent for the LCA to cover. Default value: 100.0.
-ram, --readAssignmentMode [string] Set the read assignment mode. Default value: readCount. Legal values: readCount, readLength, alignedBases, readMagnitude
-cf, --conFile [string] File of contaminant taxa (one Id or name per line).
Classification support:
-tn, --parseTaxonNames Parse taxon names. Default value: true.
-g2t, --gi2taxa [string] GI-to-Taxonomy mapping file.
-a2t, --acc2taxa [string] Accession-to-Taxonomy mapping file.
-s2t, --syn2taxa [string] Synonyms-to-Taxonomy mapping file.
-g2eggnog, --gi2eggnog [string] GI-to-EGGNOG mapping file.
-a2eggnog, --acc2eggnog [string] Accession-to-EGGNOG mapping file.
-s2eggnog, --syn2eggnog [string] Synonyms-to-EGGNOG mapping file.
-t4eggnog, --tags4eggnog [string] Tags for EGGNOG id parsing (must set to activate id parsing).
-g2interpro2go, --gi2interpro2go [string] GI-to-INTERPRO2GO mapping file.
-a2interpro2go, --acc2interpro2go [string] Accession-to-INTERPRO2GO mapping file.
-s2interpro2go, --syn2interpro2go [string] Synonyms-to-INTERPRO2GO mapping file.
-t4interpro2go, --tags4interpro2go [string] Tags for INTERPRO2GO id parsing (must set to activate id parsing).
-g2kegg, --gi2kegg [string] GI-to-KEGG mapping file.
-a2kegg, --acc2kegg [string] Accession-to-KEGG mapping file.
-s2kegg, --syn2kegg [string] Synonyms-to-KEGG mapping file.
-t4kegg, --tags4kegg [string] Tags for KEGG id parsing (must set to activate id parsing).
-g2seed, --gi2seed [string] GI-to-SEED mapping file.
-a2seed, --acc2seed [string] Accession-to-SEED mapping file.
-s2seed, --syn2seed [string] Synonyms-to-SEED mapping file.
-t4seed, --tags4seed [string] Tags for SEED id parsing (must set to activate id parsing).
-fwa, --firstWordIsAccession First word in reference header is accession number (set to ‘true’ for NCBI-nr downloaded Sep 2016 or later). Default value: true.
-atags, --accessionTags [string(s)] List of accession tags. Default value(s): gb| ref|.
-v, --verbose Echo commandline options and be verbose. Default value: false.
-h, --help Show program usage and quit.

There are three crucial things to specify:

  1. the input file
  2. the mapping files used to map NCBI accessions to taxa or functional classes, and
  3. whether you want long read mode (for long reads or contigs) or short read mode.
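
As a rough sketch, the three items above could be combined like this. The file names are placeholders and the function name build_meganizer_cmd is made up for illustration; only the options -i, -a2t and -lg are taken from the help output above. The script only prints the command so you can inspect it before running it yourself.

```shell
# Sketch: assemble a daa-meganizer call from the three inputs listed above.
# All paths are placeholders (assumptions); adjust them to your installation.
MEGANIZER="megan/tools/daa-meganizer"               # tool path as written in the help text
INPUT_DAA="sample.daa"                              # 1. input file (hypothetical name)
ACC2TAX="mapping-files/prot_acc2tax-Nov2016.abin"   # 2. accession-to-taxonomy mapping file
LONG_READS="false"                                  # 3. "true" for long reads or contigs

build_meganizer_cmd() {
    # Note: simple string building, so paths containing spaces are not handled.
    cmd="$MEGANIZER -i $INPUT_DAA -a2t $ACC2TAX"
    if [ "$LONG_READS" = "true" ]; then
        # -lg is listed above as a switch with default false
        cmd="$cmd -lg"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for inspection.
build_meganizer_cmd
```

Functional mapping files (for example -a2eggnog or -a2seed) can be appended in the same way if you need functional classification.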