Weighted LCA parameters % to cover and top %

steff1088 · August 25, 2019, 8:52pm

Hi Everyone,

I was curious about your commonly used weighted LCA parameters, specifically the % to cover (default is 80%) and top % (default is 10%).

Do you usually use the defaults, or by which criteria do you adjust them?

How much biological sense do the default values make, and how were they derived?
I am seeing considerable differences when changing these parameters, especially in the depth of taxonomic classification of a read. But which depth makes the most sense, and how would I know whether my read can really be classified to the species level? I am also working with functional genes, which have very different taxonomic similarity cutoffs at each taxonomic level…

cheers,
steffen

steff1088 · August 27, 2019, 3:55pm

@Daniel any comments or feedback from the developer team would be extremely interesting!

Thank you!
-steffen

Daniel · September 9, 2019, 12:30pm

Sorry, I was on holiday…
In our recent paper on Nanopore reads, we used 50% for "% to cover" and 10% for "top %": https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-019-0665-y

We did a mini-study on which parameters to use in this paper: https://biologydirect.biomedcentral.com/articles/10.1186/s13062-018-0208-7

The choice of parameters takes careful consideration and depends on the type of data and the type of study. There will always be false positives or false negatives; which do you want to avoid?
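
To make the trade-off concrete, here is a minimal Python sketch of the general idea behind the two parameters. It is a simplified illustration, not MEGAN's actual implementation: the function name `weighted_lca`, the toy lineages, the bit scores, and the reference weights are all made up for this example. "Top %" keeps only alignments whose bit score is within that percentage of the best score for the read, and "% to cover" places the read on the deepest taxon whose subtree holds at least that share of the total weight of the kept references.

```python
from collections import defaultdict

# Hypothetical toy data: each taxon maps to its path from the root.
LINEAGE = {
    "Escherichia coli": ["root", "Bacteria", "Enterobacteriaceae",
                         "Escherichia", "Escherichia coli"],
    "Escherichia fergusonii": ["root", "Bacteria", "Enterobacteriaceae",
                               "Escherichia", "Escherichia fergusonii"],
    "Salmonella enterica": ["root", "Bacteria", "Enterobacteriaceae",
                            "Salmonella", "Salmonella enterica"],
    "Vibrio cholerae": ["root", "Bacteria", "Vibrionaceae",
                        "Vibrio", "Vibrio cholerae"],
}

# Hypothetical reference weights (invented numbers, for illustration only).
WEIGHTS = {
    "Escherichia coli": 85,
    "Escherichia fergusonii": 10,
    "Salmonella enterica": 5,
    "Vibrio cholerae": 40,
}

# Hypothetical alignments for one read: (taxon, bit score).
ALIGNMENTS = [
    ("Escherichia coli", 200),
    ("Escherichia fergusonii", 195),
    ("Salmonella enterica", 185),
    ("Vibrio cholerae", 150),
]


def weighted_lca(alignments, lineage, weights,
                 top_percent=10.0, percent_to_cover=80.0):
    """Return the deepest taxon covering percent_to_cover of the kept weight."""
    if not alignments:
        return None

    # "Top %": keep alignments scoring within top_percent of the best score.
    best = max(score for _, score in alignments)
    threshold = best * (1.0 - top_percent / 100.0)
    kept = {taxon for taxon, score in alignments if score >= threshold}

    # Add each kept reference's weight to every node on its path to the root.
    covered = defaultdict(float)
    depth = {}
    for taxon in kept:
        for level, node in enumerate(lineage[taxon]):
            covered[node] += weights[taxon]
            depth[node] = level

    # "% to cover": the deepest node holding the required share of the weight.
    # Ties (possible at 50% or below) are broken by the larger covered weight.
    needed = sum(weights[t] for t in kept) * percent_to_cover / 100.0
    candidates = [node for node, w in covered.items() if w >= needed]
    return max(candidates, key=lambda node: (depth[node], covered[node]))


for cover in (80.0, 90.0, 100.0):
    print(cover, weighted_lca(ALIGNMENTS, LINEAGE, WEIGHTS,
                              percent_to_cover=cover))
```

With this toy data, raising "% to cover" pushes the same read from species to genus to family level, which is the kind of depth change described in the question. A lower value gives more specific assignments at the risk of more false positives; a higher value is more conservative.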