Meganizing stuck at Writing 80%

I am trying to meganize my .daa file, and it always gets stuck at “Writing 80%”.
The .daa file was produced by running DIAMOND on Prodigal-predicted genes; Prodigal was run on a metagenome assembly of ca. 60 environmental metagenome samples. The .daa file is 120 GB in size. I run this with the daa-meganizer tool, dedicating 256 GB RAM to the job. It stays stuck at “Writing 80%” indefinitely until I cancel the job or it times out on the computer cluster (after 10 days).
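For context, the upstream steps were roughly the following (a sketch from memory; the flags, file names, and database path below are placeholders rather than my exact invocation):

# predict genes on the assembly in metagenome mode
prodigal -i assembly.fasta -a predicted_genes.faa -p meta

# align the predicted proteins and write DIAMOND's binary DAA format
diamond blastp -q predicted_genes.faa -d nr.dmnd --outfmt 100 -o prodigal.daa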

Here’s the command I use:

daa-meganizer -i prodigal.daa -mdb megan-map-Feb2022-ue.db -t 16 --only KEGG

I have tried -cs -256000 and -cs -100000, and it still gets stuck at 80% (I am not sure whether that option is related to the issue here).
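In case the Java heap is the limiting factor: if I understand correctly, MEGAN's command-line tools take their heap limit from the MEGAN.vmoptions file written at installation time, so I checked mine like this (the path is just where my copy happens to be installed):

# show the configured Java heap limit
grep Xmx /path/to/megan/MEGAN.vmoptions
# raising it means editing that line, e.g.:
# -Xmx200G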

Here’s the log:

Version MEGAN Ultimate Edition (version 6.24.5, built 13 Nov 2022)
Author(s) Daniel H. Huson
Copyright (C) 2022 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 18.0.2.1
Functional classifications to use: KEGG
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Loading kegg.map: 25,480
Loading kegg.tre: 68,226
Meganizing: prodigal.daa
Meganizing init
Annotating DAA file using FAST mode (accession database and first accession per line)
Annotating references
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (442.6s)
Writing
10% 20% 30% 40% 50% 60% 70% 80%

Here are a few example sequences from the Prodigal FASTA file that was used with DIAMOND (each header, starting with “>”, is a single line in the file). I wonder if the issue might be related to the long header names? If so, I could shorten them as sketched after the sequences below.

>k141_0_1 # 1 # 207 # -1 # ID=1_1;partial=11;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.420
AATCATAACGACGGAAGCGCCGGTGATAGTCTTTTTGCCAATTCCTGGACGGTGCCGCCTGTTGAAGAAAGCAACTACTATATCGACCTTCAGATTACACGTGTAGATTCGGATACCGTCGTTAATCATTTGAATAATATGGCTCTCTTTACAACAATCGGCCCGGTCGTGCTGGATAGCATTTCCTGTATAAAAACATTTACATAT
>k141_13676352_1 # 3 # 416 # 1 # ID=3_1;partial=11;start_type=Edge;rbs_motif=None;rbs_spacer=None;gc_cont=0.657
GGCGGGACCTTCAGCTTCGACGAAGTGAGAGATCAGATTCGCGAAACGCTGGCAGGCGAGAAGCGGCGAGAAGCGGCCCTCGAAGCGGCCCGAAGCCAGTGGGCCACGCTGGATACAGGCATCTCCCTCGAGGACGCGGCCGAGCGGCTCGGCTGGTCGATCGGCACGGCCGGTCCGTTCAACCGCCGACAGTTTGCAGCCGGACTCGGCCGCAACACCGAAGCCATCGGAGCGGCATTTGCAGCCCCCGTGGGGCAAGCCGTCGGTCCCCTGAACGCGGACGACGCGGTCGTATTTCTGCGGGTGGACGACCGTACACAGGCGAATCCCGAGTTGTTCGTGGCCGTCCGGGAGCAGCTCAGATCGCAGATGCAGATGCAGGCGTCGCAGGCGAACGTCAATAACTGGATCGAG
>k141_3419088_1 # 1 # 129 # -1 # ID=4_1;partial=10;start_type=ATG;rbs_motif=GGAG/GAGG;rbs_spacer=5-10bp;gc_cont=0.674
ATGCTGATTCGTGCTCTTGGAGTCGTCGGCGTCGTGAGTCTGGTCACAATGGCCGCAGTCGCCACAGGGCGCGATGGCCTGACAGGACAGGCCCAGCAGGGCCCGGCGTACGACTCCGCTCGCGCCTGG
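If the long headers do turn out to be the problem, one simple workaround would be to truncate each header at the first whitespace before running DIAMOND, for example (prodigal.faa and prodigal_short.faa are placeholder file names):

# keep only the first whitespace-delimited token of each FASTA header
awk '/^>/{print $1; next} {print}' prodigal.faa > prodigal_short.faa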

Do you know why it might get stuck at Writing 80%? Is there something specific happening at this step?

One guess is that you might be out of memory. The Java part of MEGAN uses up to as much memory as was specified during installation, but the SQLite database access that reads the mapping DB uses additional memory on top of that. I am still working on this, trying to figure out how best to control how much memory is used there. Did you, or could you, check how much memory the program is using in total when it gets stuck? Thank you
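For example, on Linux something like the following should show the resident memory of the meganizer's Java process on the compute node (the PID is a placeholder):

# list Java processes with their resident set size (RSS, in KB)
ps -eo pid,rss,comm | grep -i java
# convert the RSS of a specific PID to GB
ps -o rss= -p <PID> | awk '{printf "%.1f GB\n", $1/1024/1024}'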

Hi Daniel,

The jobs to which I dedicated 256 GB RAM used around 110 GB. I have also tried to meganize my Prodigal file in a job with 1 TB RAM, but it still gets stuck at “Writing 80%” and eventually times out. That was a few months ago, and I am unable to retrieve information on how much memory that job used.
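For future runs I will record the peak usage while the job hangs; assuming SLURM is the scheduler here, something like this should report it (the job ID is a placeholder):

# peak resident memory and elapsed time of a job, per step
sacct -j <JOBID> --format=JobID,MaxRSS,Elapsed,State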