Daa-meganizer memory issues

I have MEGAN 6 Community Edition and I'm running into memory problems with daa-meganizer. Here is the error output:

Version MEGAN Community Edition (version 6.24.23, built 9 May 2023)
Author(s) Daniel H. Huson
Copyright (C) 2023 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 20.0.1; max memory: 16G
Functional classifications to use: EC, EGGNOG, GTDB, INTERPRO2GO, SEED
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Loading ec.map: 8,200
Loading ec.tre: 8,204
Loading eggnog.map: 30,875
Loading eggnog.tre: 30,986
Loading gtdb.map: 240,103
Loading gtdb.tre: 240,107
Loading interpro2go.map: 14,242
Loading interpro2go.tre: 28,907
Loading seed.map: 961
Loading seed.tre: 962
Meganizing: …/Compost_R1_pair.daa
Meganizing init
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at megan/megan.daa.DAAReferencesAnnotator.apply(DAAReferencesAnnotator.java:66)
at megan/megan.daa.Meganize.apply(Meganize.java:60)
at megan/megan.tools.DAAMeganizer.run(DAAMeganizer.java:251)
at megan/megan.tools.DAAMeganizer.main(DAAMeganizer.java:58)

In the MEGAN.vmoptions file I have set the maximum memory for the Java Virtual Machine to 16G:

# Enter one VM parameter per line
# For example, to adjust the maximum memory usage to 512 MB, uncomment the following line:
# -Xmx512m
# To include another file, uncomment the following line:
# -include-options [path to other .vmoption file]
-Xmx16G
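
As a sanity check (just a sketch using standard JVM flags, nothing MEGAN-specific), the heap limit that a given -Xmx value actually resolves to can be printed with:

# Print the resolved max heap for -Xmx16G; MaxHeapSize is reported in bytes
# (17179869184 bytes = 16 GiB)
java -Xmx16G -XX:+PrintFlagsFinal -version | grep -i 'maxheapsize'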

This is my job submission script for SLURM:

#!/bin/bash
#SBATCH --job-name=meganizer
#SBATCH --account="biotechnology"
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=10
#SBATCH --time=5-00:00:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --mem=60G
module load meganizer/24_23
xvfb-run --auto-servernum --server-num=1 /apps/megan_6_24_23/tools/daa-meganizer -mdb …/megan-map-Feb2022.db -i …/Compost_R1_pair.daa
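
(Side note, in case the layout matters: as far as I understand, daa-meganizer runs as a single multithreaded Java process, so the 4-node/40-task request above mostly goes unused. A single-node version where --mem stays above -Xmx would look roughly like this sketch; the CPU count is a placeholder:)

#!/bin/bash
#SBATCH --job-name=meganizer
#SBATCH --account="biotechnology"
#SBATCH --nodes=1                  # one Java process, so one node
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10         # threads for that one process (placeholder)
#SBATCH --time=5-00:00:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --mem=60G                  # keep comfortably above the JVM -Xmx
module load meganizer/24_23
xvfb-run --auto-servernum --server-num=1 /apps/megan_6_24_23/tools/daa-meganizer -mdb …/megan-map-Feb2022.db -i …/Compost_R1_pair.daa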

These are the specifications of the HPC cluster:

The HPC cluster has 126 CPU-based compute nodes for serial and parallel workloads, providing a total of 3024 CPU cores. Each node has:
2 x Intel Xeon E5-2680v3 12-core processors at 2.5GHz
64GB RAM, 56Gbps FDR InfiniBand connectivity,
1Gbps Ethernet connectivity for management and monitoring.
The software stack is CentOS 7.7, Linux kernel 3.10.0-1062.el7.x86_64, the SLURM scheduler, and Intel Parallel Studio XE 2019 Cluster Edition.

I'm allowed to use at most 120 cores across 15 nodes.

Here is what I did:

  1. Converted the NCBI nr database file nr.gz (146 GB) to nr04062023.dmnd (273 GB) with DIAMOND:

diamond makedb --in nr.gz -d /scratch/user1/nr04062023 --taxonmap prot.accession2taxid.gz --taxonnodes nodes.dmp --taxonnames names.dmp

  2. Converted a FASTQ file (22.7 million reads, 7.41 GB) to a .daa file (24.6 GB) with DIAMOND blastx:

diamond blastx -d nr04062023 -q Compost_R1_pair.fastq -o Compost_R1_pair.daa -f 100

  3. Used daa-meganizer from the tools directory to meganize the .daa file with megan-map-Feb2022.db (8.6 GB):

xvfb-run --auto-servernum --server-num=1 /apps/megan_6_24_23/tools/daa-meganizer -mdb …/megan-map-Feb2022.db -i …/Compost_R1_pair.daa
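
For reference, here are the three steps chained as one script (same commands as above; the … paths are abbreviated as in my setup):

#!/bin/bash
# Step 1: build the DIAMOND database from NCBI nr (146 GB .gz -> 273 GB .dmnd)
diamond makedb --in nr.gz -d /scratch/user1/nr04062023 \
    --taxonmap prot.accession2taxid.gz --taxonnodes nodes.dmp --taxonnames names.dmp
# Step 2: align the reads against nr, writing DAA output (-f 100)
diamond blastx -d nr04062023 -q Compost_R1_pair.fastq -o Compost_R1_pair.daa -f 100
# Step 3: meganize the DAA file with the MEGAN mapping database
xvfb-run --auto-servernum --server-num=1 /apps/megan_6_24_23/tools/daa-meganizer \
    -mdb …/megan-map-Feb2022.db -i …/Compost_R1_pair.daa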

Do I need to increase the memory value in the vmoptions file, should I run this on another server with more memory, or am I doing something wrong?

Version MEGAN Community Edition (version 6.24.23, built 9 May 2023)
Author(s) Daniel H. Huson
Copyright (C) 2023 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 20.0.1; max memory: 50G
Functional classifications to use: EC, EGGNOG, GTDB, INTERPRO2GO, SEED
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Loading ec.map: 8,200
Loading ec.tre: 8,204
Loading eggnog.map: 30,875
Loading eggnog.tre: 30,986
Loading gtdb.map: 240,103
Loading gtdb.tre: 240,107
Loading interpro2go.map: 14,242
Loading interpro2go.tre: 28,907
Loading seed.map: 961
Loading seed.tre: 962
Meganizing: …/Compost_R1_pair.daa
Meganizing init
Annotating DAA file using FAST mode (accession database and first accession per line)
Annotating references
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (13,559.1s)
Writing

It is working now. I increased the memory to 50G.
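
The only change was the heap line in MEGAN.vmoptions:

# MEGAN.vmoptions, heap line after the change (rest of the file as above)
-Xmx50G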

Update: It got killed. The run produced the same output as above up to the Writing step, which then ended with:

10% 20% 30% 40% 50% 60% 70% 80% /apps/megan_6_24_23/tools/daa-meganizer: line 44: 9038 Killed $java $java_flags --module-path=$modulepath --add-modules=megan megan.tools.DAAMeganizer $options
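
My current guess (unconfirmed): there is no Java OutOfMemoryError this time, so the process appears to have been killed from outside the JVM, presumably by the SLURM --mem=60G limit, since a 50G heap plus the JVM's own native/off-heap memory can exceed 60G during the Writing step. If that's right, the job accounting should show it; a quick check with SLURM's sacct (the job ID is a placeholder):

# Peak resident memory and final state of the job; replace 123456 with the real job ID
sacct -j 123456 --format=JobID,JobName,State,MaxRSS,ReqMem,Elapsed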