I ran a bunch of ancient metagenome samples through MALT-X as follows:
malt-build -i /. -s Protein -t 20 -d index -c Taxonomy
malt-run -i -d index -m BlastX -o . -a . -f Tab -oa . -t 20
and then decided to look at the SEED classifications in the output. To do this, I imported the BLAST tab-formatted output file from MALT into MEGAN v6.1.8 with the acc2SEED-May2015.abin mapping file. When I compare the taxonomy profiles between the MALT-generated .rma6 files and the .rma6 files that MEGAN generated from the MALT tab-formatted alignments, I see some big differences.
First, 4 samples have very few assigned reads in the MEGAN-generated profiles, and as a result very few SEED assignments. The MALT-generated profiles for the same samples don’t have this problem.
Both the MEGAN- and MALT-generated .rma6 files contain the same number of reads, but in the MEGAN-generated files only 2-51% of reads are assigned, and 86% of the assigned reads are placed only at the Kingdom level. In the MALT-generated files for those same 4 samples, 99% of reads are assigned, and 19-46% of the assigned reads are placed only at the Kingdom level.
Second, in the remaining samples I see ~2000 species/sample in the MALT-generated profiles, vs ~300 species/sample in the MEGAN-generated files.
Is anyone able to explain why I see these discrepancies? Is there a way to make the MEGAN-generated profile more similar to the MALT-generated profile, so that the SEED profile is more similar to what it would be if it were part of the MALT-generated file?
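In case it helps anyone narrow down the second discrepancy, here is a minimal Python sketch for listing which species show up in only one of the two profiles. It assumes each profile has been exported to a two-column, tab-separated text file (taxon name, read count). That layout, the function names read_profile and compare_profiles, and the toy data are assumptions made for illustration, not something MALT or MEGAN guarantees.

```python
import csv
import io


def read_profile(handle):
    """Read a tab-separated 'taxon<TAB>count' table into a dict.

    Assumed layout (hypothetical): one taxon per line, name in column 1,
    read count in column 2. Comment, header, and malformed lines are skipped.
    """
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue
        try:
            counts[row[0].strip()] = float(row[1])
        except ValueError:
            # Non-numeric second column, e.g. a header line.
            continue
    return counts


def compare_profiles(malt, megan):
    """Split taxa with a non-zero count into shared and profile-specific sets."""
    malt_taxa = {taxon for taxon, count in malt.items() if count > 0}
    megan_taxa = {taxon for taxon, count in megan.items() if count > 0}
    return {
        "shared": sorted(malt_taxa & megan_taxa),
        "malt_only": sorted(malt_taxa - megan_taxa),
        "megan_only": sorted(megan_taxa - malt_taxa),
    }


# Toy data standing in for two exported profiles of the same sample.
malt_text = "#taxon\tcount\nSpecies A\t120\nSpecies B\t45\nSpecies C\t30\n"
megan_text = "#taxon\tcount\nSpecies A\t98\nSpecies B\t0\n"

malt = read_profile(io.StringIO(malt_text))
megan = read_profile(io.StringIO(megan_text))
result = compare_profiles(malt, megan)

for label, taxa in result.items():
    print(label, len(taxa), taxa)
```

With real exports, the io.StringIO objects would be replaced by open file handles. The malt_only list would then show which species are lost when the tab-formatted alignments are re-analyzed in MEGAN.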