Why is MEGAN not assigning species by top hit?

rmaqsood · February 13, 2019, 6:30pm

Hello,

I need some help understanding how to set the LCA parameters so that the alignment with the highest bitscore is used to assign the taxonomy. I extracted the reads that I think are not assigned properly and looked through the corresponding blastx output, and it does not look like the alignment with the best score is being used for the assignment. Any insight into how I could change the parameters to make the assignment more accurate would be appreciated! I did try the weighted LCA but got results similar to the naive LCA. I will attach the blastx output.

I1159PanineExtractedBlast.out (3.5 MB)

Thanks!
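For anyone wanting to do the same check outside MEGAN, here is a minimal sketch that lists the highest-bitscore alignment per read. It assumes the blastx output is in the standard 12-column tabular format (`-outfmt 6`), which this thread does not confirm; the read and accession names in the sample are made up for illustration.

```python
import csv
import io


def top_hits(handle):
    """Return {query_id: (subject_id, bitscore)}, keeping the highest bitscore per query.

    Assumes BLAST tabular output (-outfmt 6): query is column 1,
    subject is column 2, and bitscore is column 12.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (present with -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical two-line sample; real input would be an open file handle.
sample = io.StringIO(
    "read1\tACC_A\t95.0\t50\t2\t0\t1\t150\t1\t50\t1e-20\t98.2\n"
    "read1\tACC_B\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-18\t91.7\n"
)
print(top_hits(sample))
```

Comparing this per-read top hit against the taxon shown in the Inspector window makes it easier to spot reads where the best alignment was not the one used.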

rmaqsood · February 13, 2019, 11:18pm

I think this is due to the accession number in the blast output not being recognized by MEGAN. When I inspect the reads in MEGAN, the top hit, which I believe is the true species, is labelled “?;” instead of the species name. I’m not sure why that is happening, though, or how to fix it.

Daniel · February 19, 2019, 2:46am

Looking at this in the Inspector window, I don’t see the problem: reads appear to be assigned according to the best alignment.

For example, the first read assigned to Human betaherpesvirus 5 has two alignments, the strongest being to Human betaherpesvirus 5. Given this clear state of affairs, where else would you expect the read to be assigned?

rmaqsood · February 19, 2019, 5:51pm

Hello,

Thanks for getting back to me. Human betaherpesvirus 5 is the correct species. My problem is that the reads assigned to Panine betaherpesvirus 2 are not assigned to Human betaherpesvirus 5. If you look at the alignments under the reads assigned to Panine betaherpesvirus 2, the first one is labelled “?” and is therefore skipped in favor of the next best one, which is Panine betaherpesvirus 2. If you take a closer look under the “?”, though, you see that it is actually Human betaherpesvirus 5. I think this was most likely because the mapping file did not contain the accession; I was able to fix the issue by loading the most recent mapping file.

Rabia
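To illustrate the behavior described above, here is a toy sketch of how a missing accession can change the assignment. This is not MEGAN's code and it simplifies the LCA to "take the best alignment whose accession is in the mapping"; the function name, accession IDs, and bitscores are all hypothetical.

```python
def assign_by_best_mapped_hit(alignments, accession_to_taxon):
    """Assign a read to the taxon of its best alignment that has a known accession.

    alignments: list of (accession, bitscore) tuples for one read.
    accession_to_taxon: dict standing in for the accession mapping file.
    Returns (taxon or None, list of accessions skipped because they were unmapped).
    """
    skipped = []
    # Walk the alignments from highest to lowest bitscore.
    for accession, _bitscore in sorted(alignments, key=lambda a: a[1], reverse=True):
        taxon = accession_to_taxon.get(accession)
        if taxon is None:
            # Unknown accession: this is the "?" case, so the hit is passed over.
            skipped.append(accession)
            continue
        return taxon, skipped
    return None, skipped


# Hypothetical read with two alignments; the stronger one is to the human virus.
alignments = [("ACC_PANINE", 310.0), ("ACC_HUMAN", 325.0)]

# Outdated mapping: the top hit's accession is missing.
old_mapping = {"ACC_PANINE": "Panine betaherpesvirus 2"}
# Updated mapping: the accession is now present.
new_mapping = dict(old_mapping, ACC_HUMAN="Human betaherpesvirus 5")

print(assign_by_best_mapped_hit(alignments, old_mapping))
print(assign_by_best_mapped_hit(alignments, new_mapping))
```

With the outdated mapping the read falls through to the second-best alignment, and with the updated mapping it goes to the top hit, which matches what loading the most recent mapping file fixed here.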