I would like to make sure that I understand correctly how the tool blast2lca works.
Here are my questions:
It uses the naive LCA algorithm, right?
The parameters:
–minScore, maxExpected, minPercentIdentity: do I understand correctly that these parameters define which hits are considered for classification? For example, with the default values, only hits with score > 50, e-value < 0.01 and pident > 0% would be considered.
–topPercent: does this also limit the number of hits considered (i.e. if we have 10 hits, will the read automatically be assigned to the best one), or does it mean something else?
- In the output, what are the numbers following the taxon levels? I suppose they are some kind of confidence value, but I’m not sure and I would like to know exactly what they mean.
Thanks a lot for creating all these cool tools and thanks in advance for the answers.
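For anyone following along, here is a rough Python sketch of the two steps the questions above are about: filtering hits by thresholds and then taking a naive LCA of what is left. It is not blast2lca's code. The `Hit` class, the toy `PARENT` taxonomy, and the function names are made up for illustration; whether the real thresholds are inclusive is an assumption; and the top-percent rule used here (keep hits whose bit score is within the given percentage of the best score) is my assumption about what that option means, not something confirmed in this thread.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment of a read (hypothetical structure, for illustration only)."""
    taxon: str
    bit_score: float
    evalue: float
    pident: float


# Toy taxonomy as child -> parent (made up for this example).
PARENT = {
    "Escherichia coli": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "Bacteria",
    "Bacteria": "root",
}


def lineage(taxon):
    """Return the path from the root down to the given taxon."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def filter_hits(hits, top_percent, min_score=50.0, max_expected=0.01, min_pident=0.0):
    """Apply the absolute thresholds, then the top-percent rule (assumed semantics)."""
    kept = [
        h for h in hits
        if h.bit_score >= min_score
        and h.evalue <= max_expected
        and h.pident >= min_pident
    ]
    if not kept:
        return []
    # Assumption: keep hits scoring within top_percent of the best remaining hit.
    best = max(h.bit_score for h in kept)
    cutoff = best * (1 - top_percent / 100.0)
    return [h for h in kept if h.bit_score >= cutoff]


def naive_lca(taxa):
    """Deepest taxon shared by the lineages of all given taxa (None if there are none)."""
    paths = [lineage(t) for t in taxa]
    lca = None
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


hits = [
    Hit("Escherichia coli", 200.0, 1e-50, 99.0),
    Hit("Escherichia coli", 195.0, 1e-48, 98.5),
    Hit("Salmonella enterica", 185.0, 1e-45, 96.0),
    Hit("Salmonella enterica", 120.0, 1e-20, 85.0),
    Hit("Bacteria", 40.0, 1.0, 60.0),  # fails the score and e-value thresholds
]

wide = filter_hits(hits, top_percent=10.0)
narrow = filter_hits(hits, top_percent=5.0)
print(naive_lca([h.taxon for h in wide]))
print(naive_lca([h.taxon for h in narrow]))
```

Under these assumptions, a wider top-percent window lets the Salmonella hit in, so the read is placed at the family level, while a narrower window keeps only the E. coli hits and the read is placed at species level.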
I have a similar question, especially with regard to the output numbers. I first thought that the numbers were the scores, but they don’t seem to change when I set --minScore to a higher value. Is this documented somewhere?
I expected the values after the taxonomic ranks to be the score values, but they do not seem to be filtered out when I set the minimum score to 100 with -ms.
Can anyone explain to me what these values represent? Or how I can get access to the scores (via command line)?
For a given read R and a given taxon T,
the score is the percentage of all alignments for R
that are to a taxon S that is either T or a descendant of T (that is, S lies at or below T in the taxonomy).
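A small sketch of this calculation in Python may help. It assumes that an alignment counts toward T when its taxon S is T itself or lies beneath T in the taxonomy, so that T is on the path from S up to the root. That reading, the toy `PARENT` map, and the function names are assumptions for illustration only, not MEGAN's actual code.

```python
# Toy taxonomy as child -> parent (made up for this example).
PARENT = {
    "Escherichia coli": "Escherichia",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella enterica": "Salmonella",
    "Salmonella": "Enterobacteriaceae",
    "Enterobacteriaceae": "Bacteria",
}


def self_and_ancestors(taxon):
    """Return the taxon followed by each of its ancestors, up to the top of the toy tree."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def support_percentages(alignment_taxa):
    """For each taxon T, the percentage of alignments whose taxon is T or lies below T."""
    total = len(alignment_taxa)
    counts = {}
    for s in alignment_taxa:
        # Each alignment supports its own taxon and every taxon above it.
        for t in self_and_ancestors(s):
            counts[t] = counts.get(t, 0) + 1
    return {t: 100.0 * c / total for t, c in counts.items()}


# Five alignments for one read: four to E. coli, one to Salmonella enterica.
taxa_of_alignments = ["Escherichia coli"] * 4 + ["Salmonella enterica"]
support = support_percentages(taxa_of_alignments)
for taxon in reversed(self_and_ancestors("Escherichia coli")):
    print(taxon, support[taxon])
```

With this toy input the value is 100 for Bacteria and Enterobacteriaceae and 80 for Escherichia and Escherichia coli. These numbers depend only on which taxa the alignments hit, not on the alignment scores.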