How to estimate running time for daa-meganizer for long reads

Hi @Anupam,

I am not sure what information you need about the cluster. It is described as “a heterogeneous cluster suitable for a variety of workloads” with “a low-latency high-performance fabric connecting all nodes and temporary storage.”

Here are the resources I requested to run the job:
#!/bin/bash
#SBATCH --time=23:59:00
#SBATCH --account=def-mcristes
#SBATCH --mem=502G
#SBATCH --cpus-per-task=32
#SBATCH --job-name=DiamondMegan_part1_3

My daa file is ~12.7 GB. MEGAN (v6.25.9) can use up to 500 GB of memory.
I have submitted a job to re-run daa-meganizer with a time limit of 72 hours, but after a week of waiting it has still not started. I am also worried that I may hit the same problem even if daa-meganizer runs for 72 hours.

In November 2023, I had a problem running daa-meganizer on the daa file generated by diamond with the “--top 10” option, but no problem running it on the daa file generated by diamond without “--top 10” (i.e., using the default value for -k).

I just submitted another job to re-run diamond following your suggestion in this post.

Is the only difference between “--top 10” and “-k 25” in diamond the number of alignments reported in the daa file?

Thanks a lot for your help!