Hi @XHe
I ran daa-meganizer on your file with default settings (apart from --longReads), and it took 16 hours 59 minutes. MEGAN was allocated 200 GB of memory, and the maximum resident set size was about 89.2 GB. The file contains 12,123,405 alignments. The duration seems reasonable for an input of this size, but we can investigate further to understand why it took so long. In any case, based on this run you don’t need to allocate an entire day of computation for daa-meganizer.
Version MEGAN Community Edition (version 6.25.9, built 16 Jan 2024)
Author(s) Daniel H. Huson
Copyright (C) 2023. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 20.0.2; max memory: 195.3G
Functional classifications to use: EC, EGGNOG, GTDB, INTERPRO2GO, SEED
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Loading ec.map: 8,200
Loading ec.tre: 8,204
Loading eggnog.map: 30,875
Loading eggnog.tre: 30,986
Loading gtdb.map: 240,103
Loading gtdb.tre: 240,107
Loading interpro2go.map: 14,242
Loading interpro2go.tre: 28,907
Loading seed.map: 961
Loading seed.tre: 962
Meganizing: final.contigs_2024.part_001.daa
Meganizing init
Annotating DAA file using FAST mode (accession database and first accession per line)
Annotating references
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (140.8s)
Writing
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (1.7s)
Binning reads Initializing...
Initializing binning...
Using 'Interval-Union-LCA' algorithm (51.0 %) for binning: Taxonomy
Using Multi-Gene Best-Hit algorithm for binning: SEED
Using Multi-Gene Best-Hit algorithm for binning: EGGNOG
Using 'Interval-Union-LCA' algorithm (51.0 %) for binning: GTDB
Using Multi-Gene Best-Hit algorithm for binning: EC
Using Multi-Gene Best-Hit algorithm for binning: INTERPRO2GO
Binning reads...
Binning reads Analyzing alignments
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (60,999.8s)
Total reads: 281,440
Total weight: 158,719,797
With hits: 237,223
Alignments: 12,123,405
Assig. Taxonomy: 211,607
Assig. SEED: 6,304
Assig. EGGNOG: 2,591
Assig. GTDB: 12,030
Assig. EC: 25,421
Assig. INTERPRO2GO: 50,645
MinSupport set to: 15871
Binning reads Applying min-support & disabled filter to Taxonomy...
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (0.3s)
Min-supp. changes: 4,944
Binning reads Applying min-support & disabled filter to GTDB...
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (0.4s)
Min-supp. changes: 3,421
Binning reads Writing classification tables
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (0.5s)
Binning reads Syncing
100% (0.0s)
Class. Taxonomy: 925
Class. SEED: 365
Class. EGGNOG: 690
Class. GTDB: 158
Class. EC: 1,849
Class. INTERPRO2GO: 5,842
Total time: 61,150.5s
Peak memory: 64.3 of 195.3G
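To see where the time went, the phase timings printed in the MEGAN log above can be tallied. This is only a rough sketch: the numbers are copied by hand from the log, and the phase labels in the dictionary are my own shorthand, not names used by MEGAN.

```python
# Phase timings in seconds, copied by hand from the MEGAN log above.
# The dictionary keys are informal labels chosen for this sketch.
phases = {
    "Annotating references": 140.8,
    "Writing": 1.7,
    "Analyzing alignments": 60_999.8,
    "Min-support filter (Taxonomy)": 0.3,
    "Min-support filter (GTDB)": 0.4,
    "Writing classification tables": 0.5,
    "Syncing": 0.0,
}
total_time = 61_150.5    # "Total time" reported by MEGAN
alignments = 12_123_405  # "Alignments" reported by MEGAN

# Find the phase with the largest timing and its share of the total.
slowest = max(phases, key=phases.get)
share = phases[slowest] / total_time
rate = alignments / phases[slowest]

# Print the phases from slowest to fastest with their share of the total.
for name, secs in sorted(phases.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} {secs:10.1f} s  {100 * secs / total_time:6.2f} %")
print(f"Slowest phase: {slowest} ({100 * share:.1f} % of total)")
print(f"Throughput in that phase: {rate:.0f} alignments/s")
```

By this tally the "Analyzing alignments" step accounts for nearly the whole run, so that binning step is where any further investigation should start.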
Command being timed: "./megan/tools/daa-meganizer -i final.contigs_2024.part_001.daa -mdb megan-map-Feb2022.db --longReads"
User time (seconds): 87751.33
System time (seconds): 1879.05
Percent of CPU this job got: 146%
Elapsed (wall clock) time (h:mm:ss or m:ss): 16:59:12
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 89203468
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 3671
Minor (reclaiming a frame) page faults: 795668269
Voluntary context switches: 10753796
Involuntary context switches: 409467
Swaps: 0
File system inputs: 16384
File system outputs: 2500368
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
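As a quick cross-check of the figures reported by `time` above, here is a small sketch that recomputes the CPU percentage and the memory figure. The values are copied by hand from the output; the helper `hms_to_seconds` is a hypothetical name of mine, and reading "kbytes" as 1000-byte units (which gives the 89.2 GB figure quoted at the top) is an assumption, so the 1024-byte reading is shown as well.

```python
def hms_to_seconds(hms: str) -> float:
    """Convert an 'h:mm:ss' or 'm:ss' string to seconds."""
    seconds = 0.0
    for part in hms.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

# Values copied by hand from the `time` output above.
user_s = 87_751.33
system_s = 1_879.05
elapsed_s = hms_to_seconds("16:59:12")
max_rss_kb = 89_203_468

# CPU time divided by wall-clock time: average number of busy cores.
cpu_fraction = (user_s + system_s) / elapsed_s

# Assumption: which unit "kbytes" denotes; both readings are computed.
rss_gb = max_rss_kb / 1e6         # 1 kbyte read as 1000 bytes
rss_gib = max_rss_kb / 1024 ** 2  # 1 kbyte read as 1024 bytes

print(f"Elapsed: {elapsed_s:.0f} s ({elapsed_s / 3600:.2f} h)")
print(f"Average busy cores: {cpu_fraction:.2f}")
print(f"Max RSS: {rss_gb:.1f} GB or {rss_gib:.1f} GiB, depending on the unit")
```

The ratio comes out at roughly 1.47, consistent with the 146% reported, which suggests that on average only about one and a half cores were busy during the run.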
You should have received an email with a link to download the MEGANized DAA file. Please let me know if it hasn’t arrived.
Best regards,
Anupam