I have converted minimap alignments (alignment to the nt database) to rma files using the sam2rma tool. When I view the rma files in the MEGAN GUI, I expect the LCA (default sam2rma settings) to be based on a top percent value of 10 and a min score of 50. However, I am encountering taxonomic classifications that appear to be inconsistent with these LCA settings. You can see an example of this below.
I would expect the classification to be the LCA of the top-scoring taxon (score 1552) and any taxon with a score within 10% of it (> 1397). Instead, the read has been assigned to a taxon whose lineage does not include the taxon of the top hit. I can’t work out on what basis the LCA seemingly ignored the top hit, since the top hit appears to pass all of the sam2rma settings. So far I have only encountered this issue when the top hit taxon is TaxonID 11086.
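To make the expectation concrete, here is a minimal sketch of a naive LCA with a top-percent filter. It is not MEGAN's actual implementation: the toy taxonomy, the taxon names, the extra hit scores, and the function names are all hypothetical, and whether the cutoff is inclusive (`>=`) is an assumption. Only the top score of 1552 and the settings (top percent 10, min score 50) come from the case described above.

```python
def lineage(taxon, parent):
    """Return the path from the root down to `taxon`."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path[::-1]


def naive_lca(hits, parent, top_percent=10.0, min_score=50.0):
    """hits: list of (taxon, bit_score). Returns the LCA taxon, or None."""
    # Drop hits below the minimum score.
    kept = [(t, s) for t, s in hits if s >= min_score]
    if not kept:
        return None
    # Keep only hits scoring within top_percent of the best hit
    # (inclusive cutoff is an assumption).
    best = max(s for _, s in kept)
    threshold = best * (1 - top_percent / 100.0)  # 1552 -> 1396.8
    lineages = [lineage(t, parent) for t, s in kept if s >= threshold]
    # Walk down from the root while all lineages agree.
    lca = None
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


# Hypothetical toy taxonomy: child -> parent.
parent = {
    "family": "root",
    "genusA": "family",
    "genusB": "family",
    "speciesA1": "genusA",
    "speciesA2": "genusA",
    "speciesB1": "genusB",
}

# Top hit scores 1552; the other two scores are made up for illustration.
hits = [("speciesA1", 1552), ("speciesA2", 1450), ("speciesB1", 1300)]
print(naive_lca(hits, parent))  # genusA
```

Under this scheme the result always lies on the lineage of the top hit, because the top hit itself always passes the top-percent filter. An assignment outside that lineage would therefore suggest that a different algorithm or an additional filter is in play.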
Can you please help with this issue?
You are right, this is not what the LCA should do (unless you are using the weighted LCA?). If you can give me access to a small file that exhibits this problem, I will debug it.
Hi Daniel,
I am getting further examples of this misclassification in other microbial taxa, and I can provide these if you are interested. Again, setting percent cover to 51% seems to correct this, but I’m not sure why.