I am new to MEGAN and have a few basic questions.
I obtained .daa files by aligning my sequences against the CAZy database with DIAMOND. I then meganized the .daa files using the mapping file megan-map-Jan2021.db.zip. The results display correctly in MEGAN, but I have two questions:
How can I see the initial BLAST results?
I understand that MEGAN transforms the initial BLAST results, but some information from the initial results seems to be lost. For example, from the CAZy database I could get the total number of hits to each GH and GT family, but in MEGAN I can only see the COG, SEED, EC, and taxonomy information. I tried to explore this using the alignment options: I clicked on a node and chose to show the alignment, and there I saw the GH information. However, I don't understand what the number shown there means. Is it the count? It does not seem to match the total count. Please see the attached picture for this problem.
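To show the kind of per-family totals I mean, here is a minimal Python sketch of how I would tally them from the raw DIAMOND hits. It assumes the hits are available as BLAST tabular text (query in column 1, subject in column 2, bit score in column 12) and that the subject ID carries the family labels after the accession, separated by `|`, as in `ABC123.1|GH5|CBM50`. That header layout and the function name `count_families` are my assumptions for illustration, not something MEGAN or CAZy guarantees.

```python
import re
from collections import Counter

# CAZy class prefix followed by a family number, e.g. GH5, GT2, CBM50.
FAMILY_RE = re.compile(r"^(GH|GT|PL|CE|AA|CBM)\d+")


def count_families(lines):
    """Count CAZy families, using only the highest-scoring hit per query."""
    best = {}  # query ID -> (bit score, subject ID)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)

    counts = Counter()
    for _, subject in best.values():
        # Assumption: the first token is the accession, the rest are families.
        for token in subject.split("|")[1:]:
            match = FAMILY_RE.match(token)
            if match:
                counts[match.group(0)] += 1
    return counts


# Hypothetical hits: read1 has two hits, read2 has one hit with two labels.
example_hits = [
    "read1\tABC123.1|GH5\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-30\t200",
    "read1\tDEF456.1|GT2\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-20\t150",
    "read2\tGHI789.1|GH5|CBM50\t85.0\t90\t13\t0\t1\t270\t1\t90\t1e-25\t120",
]
for family, n in sorted(count_families(example_hits).items()):
    print(family, n)
```

With this kind of tally, each read contributes only through its best hit, which is the number I was expecting to be able to compare against what MEGAN shows.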
How does MEGAN calculate the counts?
Does it only count the presence or absence of a match between a sequence and the reference database? What if the reference database lists the same sequence under different categories? For example, one enzyme could participate in the utilization of several different substrates. When the sequence coding for this enzyme is present, does each of those substrate-utilization categories get 1 added to its count?
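To make the second question concrete, here is a small Python sketch of the counting scheme I am asking about, where a read adds 1 to every category it maps to. It does not describe how MEGAN actually counts; the read names, category names, and the function `count_every_category` are made up for illustration.

```python
from collections import Counter


def count_every_category(read_to_categories):
    """Each read adds 1 to every distinct category it maps to."""
    counts = Counter()
    for categories in read_to_categories.values():
        for category in set(categories):
            counts[category] += 1
    return counts


# Hypothetical mapping: read1 encodes an enzyme listed under two substrates.
reads = {
    "read1": ["cellulose", "xylan"],
    "read2": ["cellulose"],
}
counts = count_every_category(reads)
print(dict(counts))
print("reads:", len(reads), "sum of counts:", sum(counts.values()))
```

Under this scheme the counts summed over all categories can be larger than the number of reads, which may be why the numbers I see do not add up to the total.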
Please let me know if my questions are not clear. I sincerely look forward to your suggestions on these problems!