I am meganizing a daa file and I’m encountering this error: tools/daa-meganizer: line 44: 17077 Killed.
Version MEGAN Community Edition (version 6.25.1, built 31 Aug 2023)
Author(s) Daniel H. Huson
Copyright (C) 2023 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 20.0.1; max memory: 117.2G
Functional classifications to use: EC, EGGNOG, GTDB, INTERPRO2GO, SEED
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Loading ec.map: 8,200
Loading ec.tre: 8,204
Loading eggnog.map: 30,875
Loading eggnog.tre: 30,986
Loading gtdb.map: 240,103
Loading gtdb.tre: 240,107
Loading interpro2go.map: 14,242
Loading interpro2go.tre: 28,907
Loading seed.map: 961
Loading seed.tre: 962
Meganizing: …/Mayur_Compost_R1_pair.daa
Meganizing init
Annotating DAA file using FAST mode (accession database and first accession per line)
Annotating references
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (218,968.3s)
Writing
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (8,767.2s)
Binning reads Initializing…
Initializing binning…
Using ‘Naive LCA’ algorithm for binning: Taxonomy
Using Best-Hit algorithm for binning: SEED
Using Best-Hit algorithm for binning: EGGNOG
Using ‘Naive LCA’ algorithm for binning: GTDB
Using Best-Hit algorithm for binning: EC
Using Best-Hit algorithm for binning: INTERPRO2GO
Binning reads…
Binning reads Analyzing alignments
10% 20% 30% 40% 50% 60% 70% 80% 90% /home/stud/tgangar/megan/tools/daa-meganizer: line 44: 17077 Killed $java $java_flags --module-path=$modulepath --add-modules=megan megan.tools.DAAMeganizer $options
During installation I gave it 8000 MB of memory; I then raised it to 120 GB by editing the vmoptions file as follows:
#Enter one VM parameter per line
#For example, to adjust the maximum memory usage to 512 MB, uncomment the following line:
#-Xmx512m
#To include another file, uncomment the following line:
#-include-options [path to other .vmoption file]
-Xmx120000M
I used this script instead of the BASH script described here.
This time the binning completed, but the program is still getting killed at certain steps. The error code is 12956. This is my first time using MEGAN; can anyone give me a hand with this?
At what step will the files be fully meganized and ready to use in MEGAN?
MEGAN is having a hard time with this dataset…
Since the program does not appear to report an “Out of memory” error, I don’t think that giving it 120 GB is necessary, or even a good idea: the program uses additional memory beyond the Java heap (the SQLite database runs outside of Java), so it may be that the program is using too much memory in total. So, please try running with
-Xmx48G
say, and see whether that runs to completion.
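For example, one way to make that change in the vmoptions file (a minimal sketch; the path to the vmoptions file varies by installation, so the one below is an assumption):

# Cap the Java heap at 48 GB by rewriting the existing -Xmx line in place.
# The vmoptions path is a placeholder; adjust it to your own install.
sed -i 's/^-Xmx.*/-Xmx48G/' ~/megan/MEGAN.vmoptions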
BTW, the number 12956 is not an error code; it is the process ID.
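As an aside, a bare “Killed” followed by a PID usually means the operating system sent SIGKILL to the process; if the Linux OOM killer was responsible, the kernel log records it. A quick way to check (a sketch; on some systems dmesg requires root):

# Search the kernel log for OOM-killer entries naming the killed process.
dmesg -T | grep -iE 'killed process|out of memory'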
I’m trying to meganize a .daa file (1.8 GB) and seem to get a java.lang.OutOfMemoryError: Requested array size exceeds VM limit
Version MEGAN Community Edition (version 6.25.3, built 15 Sep 2023)
Author(s) Daniel H. Huson
Copyright (C) 2023 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 20.0.2; max memory: 245.1G
Functional classifications to use: EC, EGGNOG, GTDB, INTERPRO2GO, SEED
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Loading ec.map: 8,200
Loading ec.tre: 8,204
Loading eggnog.map: 30,875
Loading eggnog.tre: 30,986
Loading gtdb.map: 240,103
Loading gtdb.tre: 240,107
Loading interpro2go.map: 14,242
Loading interpro2go.tre: 28,907
Loading seed.map: 961
Loading seed.tre: 962
Meganizing: barcode05_pass.daa
Meganizing init
Annotating DAA file using FAST mode (accession database and first accession per line)
Annotating references
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (74.3s)
Writing
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (1.0s)
Binning reads Initializing...
Initializing binning...
Using 'Interval-Union-LCA' algorithm (51.0 %) for binning: Taxonomy
Using Multi-Gene Best-Hit algorithm for binning: SEED
Using Multi-Gene Best-Hit algorithm for binning: EGGNOG
Using 'Interval-Union-LCA' algorithm (51.0 %) for binning: GTDB
Using Multi-Gene Best-Hit algorithm for binning: EC
Using Multi-Gene Best-Hit algorithm for binning: INTERPRO2GO
Binning reads...
Binning reads Analyzing alignments
Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.base/java.util.Arrays.copyOf(Arrays.java:3512)
at java.base/java.util.Arrays.copyOf(Arrays.java:3481)
at java.base/java.util.ArrayList.grow(ArrayList.java:237)
at java.base/java.util.ArrayList.grow(ArrayList.java:244)
at java.base/java.util.ArrayList.add(ArrayList.java:454)
at java.base/java.util.ArrayList.add(ArrayList.java:467)
at megan/megan.algorithms.AssignmentUsingMultiGeneBestHit.computeAcceptedMatches(AssignmentUsingMultiGeneBestHit.java:144)
at megan/megan.algorithms.AssignmentUsingMultiGeneBestHit.computeId(AssignmentUsingMultiGeneBestHit.java:71)
at megan/megan.algorithms.DataProcessor.apply(DataProcessor.java:301)
at megan/megan.core.Document.processReadHits(Document.java:548)
at megan/megan.daa.Meganize.apply(Meganize.java:97)
at megan/megan.tools.DAAMeganizer.run(DAAMeganizer.java:251)
at megan/megan.tools.DAAMeganizer.main(DAAMeganizer.java:58)
I followed the DIAMOND+MEGAN long-reads pipeline and had a similar problem to yours, @Krithika.
I re-ran diamond blastx without the parameter --top 10; after that, daa-meganizer --longReads worked for me. A sketch of the rerun is below.
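For reference, here is a sketch of that rerun; the input/output file names and the mapping-database file are placeholders, and the flags follow the DIAMOND and MEGAN manuals:

# Long-read DIAMOND run, this time without --top 10.
# --frameshift 15 and --range-culling are the usual long-read settings;
# --outfmt 100 writes DAA output.
diamond blastx --db nr.dmnd --query reads.fastq \
    --frameshift 15 --range-culling \
    --outfmt 100 --out barcode05_pass.daa

# Meganize the resulting DAA file in long-read mode.
# The mapping-database file name is a placeholder for your downloaded copy.
daa-meganizer --in barcode05_pass.daa --longReads \
    --mapDB megan-map-Feb2022.db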
I see what is wrong: this is caused by an integer overflow, not by insufficient memory. I have implemented a fix; please try version 6.25.4 once I have uploaded it.