Sam2rma error when using long contig nucleotide alignments

Hi Daniel,
I recently mapped a few thousand long contigs from an assembly to NCBI nuc using minimap2. I then tried to convert the resulting SAM file to RMA, and encountered the following error:

SAM2RMA6 - Computes a MEGAN RMA (.rma) file from a SAM (.sam) file that was created by DIAMOND or MALT
--in: 5-merged/tick-contigs-rescreen.merged.sam
--reads: 1-fasta-sort/tick-contigs-rescreen.sorted.fasta
--out: 6-rma/tick-contigs-rescreen.nucleotide.readCount.rma
--useCompression: true
--paired: false
--pairedSuffixLength: 0
--longReads: true
--maxMatchesPerRead: 100
--classify: true
--minScore: 50.0
--maxExpected: 0.01
--topPercent: 10.0
--minSupportPercent: 0.05
--minSupport: 0
--minPercentReadCover: 0.0
--minPercentReferenceCover: 0.0
--lcaAlgorithm: longReads
--lcaCoveragePercent: 100.0
--readAssignmentMode: readCount
Classification support:
--mapDB: /home/dportik/programs/megan/db/megan-nucl-map-Jul2020.db
Deprecated classification support:
--parseTaxonNames: true
--firstWordIsAccession: true
--accessionTags: gb| ref|
--threads: 24
--verbose: true
Version MEGAN Community Edition (version 6.19.4, built 16 Jul 2020)
Author(s) Daniel H. Huson
Copyright (C) 2020 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Loading 2,259,889
Loading ncbi.tre: 2,259,893
Current SAM file: 5-merged/tick-contigs-rescreen.merged.sam
Reads file: 1-fasta-sort/tick-contigs-rescreen.sorted.fasta
Output file: 6-rma/tick-contigs-rescreen.nucleotide.readCount.rma
Classifications: Taxonomy
Generating RMA6 file Parsing matches
Annotating RMA6 file using FAST mode (accession database and first accession per line)
Parsing file tick-contigs-rescreen.merged.sam
Parsing file: 5-merged/tick-contigs-rescreen.merged.sam
Input domination filter: MinPercentCoverToStronglyDominate=90.0 and TopPercentScoreToStronglyDominate=90.0
10% 20% 30% 40% 50% 60% Caught:
java.lang.NegativeArraySizeException: -1725067332
at megan/megan.parsers.blast.PostProcessMatches.apply(
at megan/
at megan/megan.rma6.RMA6FromBlastCreator.parseFiles(
at megan/
at megan/
at megan/

Do you think this is due to using very large contigs? I do not need the alignments and am considering replacing the CIGAR, SEQ, and QUAL fields with a *. Would this be a quick fix?


This happens because MEGAN writes a read and all of its associated matches into a single byte array. I use a doubling strategy when growing the array, which I will fix so that the program can allocate an array of the maximum size (but anything larger than that will require a redesign…)
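
To make the arithmetic concrete, here is a minimal Python sketch (not MEGAN's actual code) of how doubling a capacity can yield a negative size under Java's 32-bit signed int arithmetic, together with a clamped alternative along the lines described above. The starting capacity of 1,284,949,982 bytes is an assumption, chosen only because doubling it wraps to exactly the value in the exception; all function names are hypothetical.

```python
INT32_MAX = 2**31 - 1  # largest value a Java int can hold


def to_int32(n):
    """Wrap an integer the way Java's 32-bit signed int arithmetic does."""
    return (n + 2**31) % 2**32 - 2**31


def doubled_capacity(capacity):
    """Naive doubling: the result silently wraps past INT32_MAX."""
    return to_int32(capacity * 2)


def clamped_capacity(capacity, needed):
    """Double until 'needed' fits, but never exceed INT32_MAX.

    Raises MemoryError if 'needed' cannot fit in a single array at all,
    which is the case that would require a redesign.
    """
    if needed > INT32_MAX:
        raise MemoryError("record does not fit in one int-indexed array")
    new_capacity = max(capacity, 1)
    while new_capacity < needed:
        new_capacity = min(new_capacity * 2, INT32_MAX)
    return new_capacity


# Assumed previous capacity (inferred from the error value, not from the log).
old = 1_284_949_982
print(doubled_capacity(old))                 # -1725067332
print(clamped_capacity(old, 1_500_000_000))  # 2147483647
```

On a real JVM the largest allocatable array can be a few elements smaller than INT32_MAX, so the clamp above is an upper bound rather than a guaranteed allocation size.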

Replacing the CIGARs etc. with “*” should work: the code should notice that the information needed to compute the alignments is missing and will not try to report them (thus using fewer bytes per query).
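
A minimal sketch of that workaround, assuming a standard tab-separated SAM file in which CIGAR, SEQ, and QUAL are columns 6, 10, and 11. The function names and the output path are placeholders, and this has not been tested against sam2rma itself.

```python
def strip_alignment_fields(line):
    """Replace CIGAR, SEQ, and QUAL with '*' in one SAM record.

    Header lines (starting with '@') and lines with fewer than the
    11 mandatory SAM columns are returned unchanged.
    """
    if line.startswith("@"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        return line
    for index in (5, 9, 10):  # zero-based: CIGAR, SEQ, QUAL
        fields[index] = "*"
    return "\t".join(fields) + "\n"


def strip_sam(lines):
    """Lazily apply strip_alignment_fields to an iterable of SAM lines."""
    return (strip_alignment_fields(line) for line in lines)


# Example use (output path is a placeholder):
# with open("5-merged/tick-contigs-rescreen.merged.sam") as src, \
#         open("stripped.sam", "w") as dst:
#     dst.writelines(strip_sam(src))
```

Processing the file line by line keeps memory use flat, which matters here because individual records for long contigs can be very large.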