Unable to change the default coverage parameter for naive LCA assignment with malt v. 0.4.1

akocher · June 24, 2022, 8:53am

Dear developers,

Following the issue I encountered with malt v. 0.5.* (which is described here: LCA placement failure with Malt v. 0.5.2 and 0.5.3), I tried to switch back to v. 0.4.1.

I encountered a different issue with this version, which is the following:

By default, the LCA placement appears to be made with a “naive algorithm” and 80% “coverage”, as stated by the malt-run log:

Using 'Naive LCA' algorithm (80.0 %) for binning: Taxonomy

If I understand correctly (also from some testing that I have done), this means that if at least 80% of the references that are hit by a read belong to the same taxon, the read will be assigned to that taxon, ignoring the other references that might have been hit. For example, if there are 8 references for Yersinia pestis and only 2 references for Yersinia pseudotuberculosis in the index, a read hitting all of these will be assigned to Y. pestis (even if the hit to Y. pseudotuberculosis is the best hit).
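
To make the behaviour described above concrete, here is a minimal Python sketch. It is not MALT's actual implementation: the function names, the `parent_of` map and the `coverage_percent` parameter are all hypothetical, and it assumes the threshold is inclusive (so that the 8-versus-2 example sits exactly on the 80% boundary) and that a strict LCA is used whenever no single taxon reaches the threshold.

```python
from collections import Counter


def lineage(taxon, parent_of):
    """Return the path from a taxon up to the root, starting at the taxon itself."""
    path = [taxon]
    while taxon in parent_of:
        taxon = parent_of[taxon]
        path.append(taxon)
    return path


def lowest_common_ancestor(taxa, parent_of):
    """Strict LCA: the deepest node shared by the lineages of all taxa."""
    taxa = list(taxa)
    common = lineage(taxa[0], parent_of)
    for taxon in taxa[1:]:
        other = set(lineage(taxon, parent_of))
        # Keep only the nodes that also appear in this taxon's lineage.
        common = [node for node in common if node in other]
    return common[0]


def assign_read(hit_taxa, parent_of, coverage_percent=80.0):
    """Assign a read from the taxa of the references it hit.

    Assumption (hypothetical model of the behaviour in the post): if one taxon
    accounts for at least `coverage_percent` of the hits, it wins outright;
    otherwise the read goes to the strict LCA of all hit taxa.
    """
    counts = Counter(hit_taxa)
    taxon, n_hits = counts.most_common(1)[0]
    if n_hits * 100 >= coverage_percent * len(hit_taxa):
        return taxon
    return lowest_common_ancestor(set(hit_taxa), parent_of)


# Toy taxonomy and index matching the example in the post.
parent_of = {
    "Yersinia pestis": "Yersinia",
    "Yersinia pseudotuberculosis": "Yersinia",
}
hits = ["Yersinia pestis"] * 8 + ["Yersinia pseudotuberculosis"] * 2

print(assign_read(hits, parent_of))                        # 80% coverage
print(assign_read(hits, parent_of, coverage_percent=100))  # strict LCA
```

Under these assumptions, the first call assigns the read to Yersinia pestis, while requiring 100% coverage sends it to the genus Yersinia. The result depends only on how many references each taxon has in the index, not on alignment quality, which is why uneven reference sets are affected.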

As with the previous issue, I could not find a way to change this behaviour using the available command-line options.

I think that this can be problematic for people using customized reference datasets that are uneven. Typically, for datasets comprising many references for target taxa and just a few “outgroups”, unspecific matches might be reported as specific matches to target taxa.

Daniel · October 25, 2022, 12:50pm

I just tested this, and it appears to work correctly in the new release. Please let me know if there are still problems.

Daniel · October 25, 2022, 3:03pm

Actually, please update to version 0.6.1, where the bug has finally been squashed.

akocher · November 2, 2022, 3:40pm

Everything seems to work fine with v. 0.6.1, thanks Daniel!