Hi Daniel
In the case of multiple hits to a single read, does the RPK option consider only the gene length of the top hit or is there a way to average the gene length, based on all the hits?
Best,
Aditya
The top hit that has an assignment to the given class is used.
Hi Daniel
Thanks for the clarification. There seems to be something off with how MEGAN currently does this computation. I dug a bit deeper, and it seems the normalised value isn't being multiplied by the number of reads.
Case 1: I inspected a protein family (SEED) that had 1 read assigned to it. The RPK value is exported correctly in this case.
Case 2: I inspected another protein family (SEED) that had multiple reads assigned to it. In this case, MEGAN takes only 1 read, the one with the best hit, and does the RPK calculation on it. It should additionally have multiplied the result by the total number of reads.
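To make the two cases concrete, here is a minimal Python sketch of the calculation I would expect. It is not MEGAN's code: the function names are hypothetical, and it assumes RPK means reads per kilobase of the gene hit by each read's top hit, with lengths given in base pairs.

```python
def rpk_single_read(gene_length_bp):
    """Contribution of one read: 1 read divided by the gene length in kb."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    return 1000.0 / gene_length_bp


def rpk_for_class(top_hit_gene_lengths_bp):
    """Expected RPK for a class: sum the per-read contributions.

    The argument holds one gene length per assigned read, taken from that
    read's top hit to the class (hypothetical input format).
    """
    return sum(rpk_single_read(length) for length in top_hit_gene_lengths_bp)


# Case 1: one read on a 1000 bp gene. Both approaches agree.
case1 = rpk_for_class([1000])            # 1.0

# Case 2: three reads on the same 1000 bp gene.
observed_style = rpk_single_read(1000)   # 1.0, only one read is counted
expected_style = rpk_for_class([1000, 1000, 1000])  # 3.0
```

If all reads in a class hit genes of the same length, the sum is simply the single-read value multiplied by the number of reads, which is the factor that appears to be missing in Case 2.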
Regards,
Aditya