Reads assigned to multiple "leaves" in SEED analysis

keima · November 16, 2016, 11:02am

I’d like to ask whether a query read can be assigned to more than one “leaf” (the lowest category) in SEED analysis.

I’ve obtained taxonomic and functional annotations from the tab-delimited output file of a blastp search against nr, using the mapping files prot_acc2tax-Aug2016.abin (taxonomy), acc2eggnog-June2016X.abin (eggNOG), gi2kegg-Feb2015X.bin (KEGG), and gi2seed-May2015X.bin (SEED).
Then I exported CSV files listing the read names and their functional assignments at the lowest level of the hierarchy. These were obtained by selecting “Tree->Uncollapse All”, “Select->All Leaves”, “File->Export->CSV Format”, and then, for example, “readName_to_seedName”.
Only in the CSV output of the SEED analysis did some read names appear in multiple rows.

For example, I found these lines:
gene300865 "Glutamate formiminotransferase (EC 2.1.2.5)"
gene300865 "5-FCL-like protein"
gene303946 "COG2363"
gene303946 "ThiJ/PfpI family protein"

I wonder whether this is normal and how I should deal with those reads.
I’m sorry, I don’t understand the SEED classification well. I have heard that some “leaves” belong to more than one higher-level category, but can a read also be assigned to more than one “leaf”?

Also, could you tell me where I can find a table of the SEED hierarchical categories that MEGAN refers to?

(I appreciate the new CSV output format readName_to_taxonPathPercent. It really helped me!)

Sina · November 16, 2016, 11:51am

Unfortunately, the SEED hierarchy is not provided as a full table.
The leaves in the MEGAN representation of SEED correspond to SEED functional roles. Functional roles are not separate proteins, so a single read can quite plausibly have multiple functional roles.
It is hard to explain what a functional role means, so I will quote the SEED wiki:

Functional role
The concept of functional role is both basic and primitive in the sense that we will not pretend to offer a precise definition. It corresponds roughly to a single logical role that a gene or gene product may play in the operation of a cell.
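Since one read can carry several functional roles, one way to handle the readName_to_seedName export is to group its rows by read name before counting anything. The following is only a minimal Python sketch, not something MEGAN provides: it assumes the export has two columns (read name, SEED role) separated by commas with quoted role names, so adjust the delimiter argument if your file differs. The function names and the read gene999999 are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_roles_by_read(lines, delimiter=","):
    """Map each read name to the list of SEED roles assigned to it.

    Assumes two columns per row: read name, then role name.
    """
    roles = defaultdict(list)
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        read, role = row[0].strip(), row[1].strip()
        if role not in roles[read]:  # ignore exact duplicate rows
            roles[read].append(role)
    return dict(roles)


def multi_role_reads(roles):
    """Keep only the reads that were assigned more than one role."""
    return {read: names for read, names in roles.items() if len(names) > 1}


# Stand-in for the exported file; gene999999 is a hypothetical single-role read.
sample = io.StringIO(
    'gene300865,"Glutamate formiminotransferase (EC 2.1.2.5)"\n'
    'gene300865,"5-FCL-like protein"\n'
    'gene303946,"COG2363"\n'
    'gene303946,"ThiJ/PfpI family protein"\n'
    'gene999999,"Hypothetical single-role example"\n'
)

roles = group_roles_by_read(sample)
multi = multi_role_reads(roles)
for read, names in multi.items():
    print(read, "; ".join(names))
```

For a real export, pass an opened file instead of the sample, for example open("export.csv", newline=""), where the file name is a placeholder. Whether a multi-role read should count once per role or be split fractionally across its roles depends on the downstream analysis.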

keima · November 21, 2016, 1:02pm

Dear Sina

Thank you for replying!
I understand now that multiple assignments are possible.

The absence of a SEED hierarchy table is truly unfortunate for me…