Help with the LCA parameters

Can somebody explain the LCA parameters? Please explain in your own words rather than copying from the manual, which I have already read.
I’m working with a DNA control, so I know which species I have. With version 5 of the program I get all the species if I select an LCA percent of 50, but in this version I don’t know how to filter the reads according to the BLAST similarity (more than 98%).
I have tried the new version 6, where I have the % identity filter, but the LCA percent option doesn’t appear in this version, so I’m not able to get down to the species level. I have tried changing all the parameters, but nothing works…

Any help please

Version 5 implemented the “LCA of X%” algorithm. Using that algorithm, a read was placed on the lowest node (“LCA node”) that lies above X% of all significant matches for that read.
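
To make the LCA of X% rule concrete, here is a minimal Python sketch. It is not MEGAN's code: the function name `lca_percent`, the toy lineages, and the tie-breaking (always follow the best-supported child) are assumptions made for illustration only.

```python
from collections import Counter

def lca_percent(lineages, percent):
    """Return the lowest taxon that lies above at least `percent` % of the matches.

    `lineages` holds one root-to-leaf list of taxon names per significant match.
    Hypothetical helper for illustration, not part of MEGAN.
    """
    needed = percent / 100.0 * len(lineages)
    current = list(lineages)
    node, depth = None, 0
    while True:
        # Count how many of the remaining matches pass through each child.
        children = Counter(l[depth] for l in current if len(l) > depth)
        if not children:
            break
        best, count = children.most_common(1)[0]
        if count < needed:
            break  # no child covers enough matches: stay on the current node
        node = best
        current = [l for l in current if len(l) > depth and l[depth] == best]
        depth += 1
    return node

# Toy data: three matches to "species A", one match to "species B".
matches = [["root", "Genus", "species A"]] * 3 + [["root", "Genus", "species B"]]

print(lca_percent(matches, 100))  # Genus
print(lca_percent(matches, 50))   # species A
```

With X = 100 this behaves like the plain LCA and stops at the genus; with X = 50 the single disagreeing match no longer prevents a species-level assignment.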

Version 6 does not contain that algorithm.
Instead, version 6 contains an algorithm that we believe produces more accurate
results, the “weighted LCA”, wLCA.
The wLCA first assigns a weight to each reference sequence, which is given by the number of reads that align only to that reference sequence, or to reference sequences that belong to the same taxonomic species.
Then, the wLCA places each read above 75% of the total weight of all the references to which it has a significant alignment.
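
The two steps above can be sketched in Python as follows. This is only an illustration of the description in this post, not the actual MEGAN implementation: the names `reference_weights` and `weighted_lca`, the toy data, and the fallback to equal weights when no matched reference has any weight are all assumptions.

```python
from collections import Counter

def reference_weights(read_hits, species_of):
    """Step 1: weight of a reference = number of reads whose hits all
    belong to one species and include that reference."""
    weights = Counter()
    for refs in read_hits.values():
        if len({species_of[r] for r in refs}) == 1:
            for r in refs:
                weights[r] += 1
    return weights

def weighted_lca(refs, lineage_of, weights, percent=75.0):
    """Step 2: lowest taxon lying above `percent` % of the total weight
    of the references matched by one read."""
    refs = list(refs)
    w = {r: weights.get(r, 0) for r in refs}
    if sum(w.values()) == 0:
        # Assumption: with no weight at all, treat every reference equally.
        w = {r: 1 for r in refs}
    needed = percent / 100.0 * sum(w.values())
    current, node, depth = refs, None, 0
    while True:
        children = Counter()
        for r in current:
            if len(lineage_of[r]) > depth:
                children[lineage_of[r][depth]] += w[r]
        if not children:
            break
        best, weight = children.most_common(1)[0]
        if weight < needed:
            break  # no child carries enough weight: stay on the current node
        node = best
        current = [r for r in current
                   if len(lineage_of[r]) > depth and lineage_of[r][depth] == best]
        depth += 1
    return node

# Toy data (hypothetical reference and read names).
species_of = {"refA1": "species A", "refA2": "species A", "refB": "species B"}
lineage_of = {r: ["root", "Genus", s] for r, s in species_of.items()}
read_hits = {
    "r1": {"refA1"}, "r2": {"refA1", "refA2"}, "r3": {"refA1"},
    "r4": {"refB"}, "r5": {"refA1", "refB"},  # r5 is ambiguous
}

weights = reference_weights(read_hits, species_of)
print(dict(weights))                                            # refA1: 3, refA2: 1, refB: 1
print(weighted_lca(read_hits["r5"], lineage_of, weights, 75))   # species A
print(weighted_lca(read_hits["r5"], lineage_of, weights, 100))  # Genus
```

In this toy example the ambiguous read r5 is resolved to species A at 75% because refA1 carries three quarters of the weight of its two matched references, whereas at 100% it stays at the genus.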

Could you please try using the weighted LCA algorithm to see whether this improves over the results obtained by MEGAN 5?

I could look into also supporting the old LCA of X% in MEGAN 6, but I am worried that providing multiple similar variations of the LCA will be confusing for most users.

Dear Daniel,

Thank you very much for your quick reply.
I tried the weighted LCA algorithm; it was one of the first things that I did, and of the 9 species that are in my control DNA mix I only get one.
So far the best results are with version 5 of the program and, as I said, a 50% LCA. That way I get all the species; then, to filter by % similarity, I used the 16S filter (99% for species). I know that it is meant for bacteria (I’m working on fungal metagenomics, with ITS instead of 16S), but at least it gives me an identity filter.

Best,
Carmen

Dear Carmen,

Could you please try using the wLCA with a threshold of 50%, to see whether you get the same results as with the LCA of 50%?

The percent threshold used by the wLCA is currently not exposed directly in the user interface.
However, you can access it in the Ultimate Edition of MEGAN:
use the Window -> Command Input menu item
to enter the following command:
setprop WeightedLCAPercent=50;

If changing the percent does prove useful, then I will add it to the user interface.

I have tried the command, but I’m not sure that the LCA percent has changed. I typed the command into the Command Input window and clicked Apply. The MEGAN message window showed:

“Executing: setprop WeightedLCAPercent=50;”

but nothing changed, and the percentage does not appear to have been updated.

Sorry, I will fix this. For now, you have to restart MEGAN for the change to take effect. When you run the wLCA, you will see that MEGAN reports the percentage used in the Message Window.

Hi Daniel,

In some cases I have this problem: this OTU is P. palmivora, but the program takes the first hit, P. himalayensis. Both have 100% identity, the same score, the same E-value… How can I solve this? Which LCA parameter should I change?

Thank you!

This doesn’t look right… If you can give me access to the data then I will look into this.
(Or is the topPercent option set to 0?)

Hi Daniel

The topPercent option is set to 3.0. Is it OK if I share the data with you through Dropbox?

Send me the Dropbox link by email and I will look at the data.
I have identified and fixed a subtle bug in the wLCA algorithm; hopefully this fixes the problem that you described.

Please download version 6.4.1; this should fix the problem you described.

I downloaded and installed version 6.4.1, but I’m not able to change the LCA percent, so I cannot see all the species and cannot say whether the problem is solved. Sorry.

I put the data in a Dropbox folder if you want to have a look.

Best,

I will add the percent parameter to the LCA parameters dialog in the next release. Send me the Dropbox link by email.

Thank you! I sent you an invitation to the “DATA” folder via Dropbox.

I have uploaded a new version 6.4.2 that allows you to set the percent threshold.
I have looked at your data and using a threshold of 75% looks promising.

Thank you very much!!

I’ll look more closely in the morning and during the weekend. For now I have installed the new version. I tried it with 75%, but I get better results with lower percentages… Can you tell me which values you used for the other parameters?

Thank you again

I discovered another subtle bug in the assignment algorithm, which I have now fixed. The analysis of your data now looks absolutely sensible. Please update to 6.4.3. Here are the parameters that I used:

minSupportPercent=0.3 minSupport=1 minScore=170.0 maxExpected=0.0 minPercentIdentity=0.0 topPercent=3.0 weightedLCA=true weightedLCAPercent=50.0 minComplexity=-1.0 pairedReads=false useIdentityFilter=true;
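
For readers wondering how score and identity thresholds like these narrow down the hits before any LCA step, here is a rough Python sketch. It rests on assumptions about the semantics (minScore as a minimum bit score, minPercentIdentity as a minimum identity, topPercent as "keep hits within that percent of the best remaining bit score") and on the order in which the filters are applied; the function `filter_hits` and the hit records are hypothetical and are not MEGAN code.

```python
def filter_hits(hits, min_score=0.0, top_percent=100.0, min_percent_identity=0.0):
    """Keep the hits of one read that pass the three thresholds.

    `hits` is a list of dicts with 'ref', 'bitscore' and 'identity' keys.
    Assumed semantics, for illustration only.
    """
    kept = [h for h in hits
            if h["bitscore"] >= min_score and h["identity"] >= min_percent_identity]
    if not kept:
        return []
    best = max(h["bitscore"] for h in kept)
    cutoff = best * (1.0 - top_percent / 100.0)
    return [h for h in kept if h["bitscore"] >= cutoff]

# Toy hits for a single read.
hits = [
    {"ref": "ref1", "bitscore": 200.0, "identity": 100.0},
    {"ref": "ref2", "bitscore": 196.0, "identity": 99.0},
    {"ref": "ref3", "bitscore": 180.0, "identity": 95.0},
    {"ref": "ref4", "bitscore": 150.0, "identity": 99.0},
]

kept = filter_hits(hits, min_score=170.0, top_percent=3.0)
print([h["ref"] for h in kept])  # ['ref1', 'ref2']
```

Under these assumptions, ref4 is dropped by the minimum score and ref3 by the top-percent window, so only the two near-identical best hits would be passed on to the LCA.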

I have checked with the new version 6.4.3 and I get 34 Phytophthora sp. I still have an old version of the program, MEGAN 5.11.3, in a VirtualBox machine. If I run the same data in that version I get most of the species; only 9 of the 108 OTUs are classified as Phytophthora sp. So I get better results with the old version.

OK, I’ll look into adding the LCA of X% feature to MEGAN 6…

Is there a recommended set of LCA parameters for short reads from shotgun sequencing (50-100 bp)?

I’ve been using the following parameters, hoping that this way I can avoid false positives: MinScore=100.0 MaxExpected=0.01 TopPercent=5.0 MinSupport=5 disabledTaxa=12 LCA=naïve