I’m using MEGAN Community Edition version 6.7.1 (built 8 Mar 2017) on OS X 10.12.3. I noticed that MEGAN was assigning reads to OTUs even when their BLAST hits were below my LCA Min Percent Identity threshold (I imported a BLAST XML file into the program). When I re-run the LCA analysis on my data, varying only the Min Percent Identity value, I get the same taxonomy assignments. Changing other values, such as Min Score, does work: hits with a bit score below that value are excluded. I’m wondering whether I’m selecting another parameter that overrides the Min Percent Identity setting, or whether it’s a bug in the version I’m currently using.
I’ve included an example of my LCA parameters below.
Min Score: 160.0
Max Expected: 1.0E-25
Min Percent Identity: 97
Top Percent: 2.0
Min Support Percent: 0.0(off)
Min Support: 1
Min Complexity: 0.0
LCA Algorithm: Naive
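To check independently of MEGAN which hits in a BLAST XML file fall below a given identity threshold, something like the following Python sketch could be used. It assumes the standard BLAST XML fields `Hsp_identity`, `Hsp_align-len` and `Hsp_bit-score`, and it computes percent identity as identical positions divided by alignment length; whether MEGAN computes the value in exactly the same way is an assumption. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def hsp_percent_identities(xml_text):
    """Yield (hit_def, percent_identity, bit_score) for each HSP in BLAST XML text."""
    root = ET.fromstring(xml_text)
    for hit in root.iter("Hit"):
        hit_def = hit.findtext("Hit_def")
        for hsp in hit.iter("Hsp"):
            identical = int(hsp.findtext("Hsp_identity"))
            align_len = int(hsp.findtext("Hsp_align-len"))
            bit_score = float(hsp.findtext("Hsp_bit-score"))
            # Assumption: percent identity = identical positions / alignment length.
            yield hit_def, 100.0 * identical / align_len, bit_score


def below_identity(xml_text, min_percent_identity=97.0):
    """Return the HSPs whose percent identity is below the threshold."""
    return [row for row in hsp_percent_identities(xml_text)
            if row[1] < min_percent_identity]
```

Calling `below_identity(open("hits.xml").read(), 97.0)` on the imported file (the file name here is a placeholder) would list the hits that one would expect to be excluded at Min Percent Identity: 97.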
The option should be effective.
I just looked into this using data that I imported from a BLAST XML file.
To demonstrate what is supposed to happen, in this example I set the percent identity threshold to 95.
Here is a screen shot of the inspector window.
What you see is that the header nodes of matches that have less than 95% identity are grayed out, and these matches are not used during LCA assignment, whereas the header nodes of matches with at least 95% identity are shown in black (they are active during LCA assignment):
Could you please check in the inspector window whether you see the same kind of black vs gray coloring?
If not, please provide a screen shot so that I can try to figure out what is going on.
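The active (black) versus inactive (gray) distinction described above can be illustrated with a small Python sketch. This is not MEGAN’s code: the `Match` class, the `active_matches` function, the order in which the filters are applied, and the reading of Top Percent as "within that percentage of the best remaining bit score" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Match:
    taxon: str
    bit_score: float
    expected: float
    percent_identity: float


def active_matches(matches, min_score=160.0, max_expected=1e-25,
                   min_percent_identity=95.0, top_percent=2.0):
    """Return the matches that would stay active for LCA assignment."""
    # Step 1 (assumed order): apply the absolute thresholds.
    passing = [m for m in matches
               if m.bit_score >= min_score
               and m.expected <= max_expected
               and m.percent_identity >= min_percent_identity]
    if not passing:
        return []
    # Step 2 (assumption): keep matches whose bit score lies within
    # top_percent of the best bit score among the remaining matches.
    best = max(m.bit_score for m in passing)
    cutoff = best * (1.0 - top_percent / 100.0)
    return [m for m in passing if m.bit_score >= cutoff]
```

Under these assumptions, raising `min_percent_identity` from 95 to 97 should shrink the set of active matches whenever some hits lie between the two values, so identical assignments at both settings would suggest the filter is not being applied.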
Thanks for your help. I think the problem may have been that I was re-analyzing a MEGAN .rma file generated with the Ultimate Edition of MEGAN5 using the Community Edition of MEGAN6. When I reloaded the original BLAST XML file using only MEGAN6, I got the expected behavior. The MEGAN5 file never threw any errors, and the only aspect I could see an issue with is its treatment of the %ID score. Since I believe MEGAN5 doesn’t take %ID into account when making its assignments, I suppose this isn’t surprising. Sorry to waste your time!
In case you want to see the problem that arises when analyzing the MEGAN5 file with MEGAN6, here is a screen shot (the hits have <97% ID but are not greyed out and are still assigned to the taxa after LCA analysis).