Hello,
I ran MALT with a custom database that was created with the latest megan-nucl-Jan2021.db.zip file. Almost all of the reads aligned, yet when I ran rma2info on the resulting RMA file, the output was empty. However, if I create a smaller database from a subset of the reference sequences and run the same sample, taxonomies are output. For reference, the full database contains all bacterial species and Hg19. What could be causing this difference? Any guidance is appreciated.
The reference headers use the ‘|’ character as the separator, with the fields gi | gi# | ref | accession | name.
The headers for the human reference sequences look like this (they have no name field):
gi|89030144|ref|NT_113911.1
gi|224514656|ref|NT_167229.1
gi|224514661|ref|NT_167233.1
gi|224514635|ref|NT_113950.2
gi|224514652|ref|NT_167225.1
gi|224514650|ref|NT_167223.1
gi|251831106|ref|NC_012920.1
The bacterial reference headers are similar, but also include names:
gi|379725073|ref|NC_016937.1| Francisella tularensis subsp. tularensis TI0902
gi|379716390|ref|NC_016933.1| Francisella tularensis subsp. tularensis TIGB03
gi|384162394|ref|NC_017189.1| Bacillus amyloliquefaciens LL3
gi|384162404|ref|NC_017190.1| Bacillus amyloliquefaciens LL3
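To make the header layout concrete, here is a minimal Python sketch of how I read these fields. The names `RefHeader` and `parse_header` are hypothetical and only for illustration; they are not part of MALT or MEGAN, and this is not meant to describe how MALT itself parses headers.

```python
from typing import NamedTuple, Optional


class RefHeader(NamedTuple):
    """Fields of a 'gi|<gi#>|ref|<accession>|<name>' reference header."""
    gi: str
    accession: str
    name: Optional[str]  # None when the header has no name field


def parse_header(header: str) -> RefHeader:
    # Drop a leading '>' if the line comes straight from a FASTA file,
    # then split on '|' into at most five fields.
    parts = header.lstrip(">").split("|", 4)
    if len(parts) < 4 or parts[0] != "gi" or parts[2] != "ref":
        raise ValueError(f"unexpected header layout: {header!r}")
    # The name field is optional (absent in the human headers above).
    name = parts[4].strip() if len(parts) > 4 else ""
    return RefHeader(gi=parts[1], accession=parts[3].strip(), name=name or None)


human = parse_header("gi|89030144|ref|NT_113911.1")
bacterial = parse_header(
    "gi|379725073|ref|NC_016937.1| Francisella tularensis subsp. tularensis TI0902"
)
print(human)
print(bacterial)
```

The only structural difference between the two sets is that the human headers stop after the accession, while the bacterial ones carry a trailing name.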
Malt output:
Starting file: a/test1.rma
10% 20% 30% 40% 50% 60% 100% (6.2s)
Finishing file: a/test1.rma
Binning reads: Initializing...
Initializing binning...
Using Best-Hit algorithm for binning: Taxonomy
Binning reads...
Binning reads: Analyzing alignments
Total reads: 4,668
With hits: 4,668
Alignments: 37,194
Assig. Taxonomy: 0
MinSupport set to: 1
Binning reads: Writing classification tables
Numb. Tax. classes: 1
Binning reads: Syncing
Class. Taxonomy: 1
Analysis written to file: a/test1.rma
Num. of queries: 5000
Aligned queries: 4668
Num. alignments: 37194
Total time: 585s
Peak memory: 178.6 of 380G