Computing taxonomic profile algorithm details

yesimon · August 12, 2015, 5:52am

In MEGAN we have the option to compute a taxonomic profile at a given taxonomic rank, say the species level. We are given two options to compute this: a Match-based method and a Projection method. How do these methods work? Also, what does the Min Support Percent option do in this case?

Daniel · August 20, 2015, 11:22am

These are unpublished algorithms, developed together with Vincent Moulton at UEA in Norwich. Both are straightforward:
The Projection method projects all counts onto a selected taxonomic rank:
counts below the given rank are pushed up the tree onto the selected rank.
E.g., if you selected the Family level and some reads are assigned to some Genus level node, then that count is added to the Family level node above the Genus level node.
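
To make the "pushing up" step concrete, here is a minimal Python sketch. It is not MEGAN's code: the function name `project_up`, the dictionary-based tree and the example counts are all made up for illustration. It only covers what is described above (counts at or below the selected rank); taxa that sit above the selected rank are simply returned separately, since how they are handled is not covered here.

```python
from collections import defaultdict


def project_up(counts, parent, rank, target_rank):
    """Push each taxon's count up to its ancestor at target_rank.

    counts: taxon -> number of assigned reads
    parent: taxon -> parent taxon (the root maps to None)
    rank:   taxon -> rank name, e.g. "Family" or "Genus"
    Returns (projected, unprojected); unprojected holds taxa that have
    no node of target_rank on their path to the root.
    """
    projected = defaultdict(int)
    unprojected = {}
    for taxon, count in counts.items():
        node = taxon
        # Walk towards the root until a node of the target rank is found.
        while node is not None and rank.get(node) != target_rank:
            node = parent.get(node)
        if node is None:
            # The taxon lies above the target rank (assumption: left as-is).
            unprojected[taxon] = count
        else:
            projected[node] += count
    return dict(projected), unprojected


# Hypothetical toy tree: one order, one family, two genera.
parent = {
    "Enterobacterales": None,
    "Enterobacteriaceae": "Enterobacterales",
    "Escherichia": "Enterobacteriaceae",
    "Salmonella": "Enterobacteriaceae",
}
rank = {
    "Enterobacterales": "Order",
    "Enterobacteriaceae": "Family",
    "Escherichia": "Genus",
    "Salmonella": "Genus",
}
counts = {"Escherichia": 5, "Salmonella": 3,
          "Enterobacteriaceae": 2, "Enterobacterales": 4}

projected, unprojected = project_up(counts, parent, rank, "Family")
print(projected)    # {'Enterobacteriaceae': 10}
print(unprojected)  # {'Enterobacterales': 4}
```

In this toy example the two Genus level counts (5 and 3) are added to the 2 reads already assigned to the Family level node, giving 10.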

The Match-based method operates as follows:
for each read r, let M be the set of taxa to which it has a significant alignment and let
n be the size of M.
Then, for each taxon in M, we add 1/n to its count.
In other words, reads are distributed onto the taxa to which they align.
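
The distribution step can be sketched in a few lines of Python. This is an illustration only, not MEGAN's implementation: the function name `distribute_reads`, the input format (read id mapped to the taxa it aligns to) and the example data are hypothetical, and what counts as a "significant" alignment is assumed to have been decided beforehand. Here n is taken to be the number of taxa the read aligns to, so every read contributes a total of 1.

```python
from collections import defaultdict
from fractions import Fraction


def distribute_reads(read_matches):
    """Split each read evenly over the taxa it aligns to.

    read_matches: read id -> iterable of taxa with a significant alignment
    Returns taxon -> (possibly fractional) count.
    """
    counts = defaultdict(Fraction)
    for taxa in read_matches.values():
        matched = set(taxa)          # each taxon counted once per read
        if not matched:
            continue                 # reads without matches add nothing
        share = Fraction(1, len(matched))
        for taxon in matched:
            counts[taxon] += share
    return dict(counts)


# Hypothetical example: three reads, three taxa.
read_matches = {
    "r1": ["A", "B"],        # 1/2 each to A and B
    "r2": ["A"],             # 1 to A
    "r3": ["A", "B", "C"],   # 1/3 each to A, B and C
}
profile = distribute_reads(read_matches)
for taxon in sorted(profile):
    print(taxon, profile[taxon])
# A 11/6
# B 5/6
# C 1/3
```

Exact fractions are used only so that the totals are easy to check by hand: the three counts sum to 3, the number of reads.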