Is there an easy way to extract combined taxonomic and functional information from MEGAN? In other words, is it possible to export, for example, the KEGG KOs identified in a sample together with the taxa (e.g. NCBI taxonomy) they are assigned to?
Right now, I can obtain relative abundances at the community level, but I wonder whether it is also possible to obtain the contributions of known and unknown species to each KEGG KO.
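To make the request concrete, here is a minimal sketch of the table I am after. It assumes that per-read KO and taxon assignments are available as two simple read-to-label mappings. This is a hypothetical input; I do not know whether MEGAN can export it in this form. The function name, read IDs, and toy data are invented for illustration.

```python
from collections import Counter

def stratify_ko_by_taxon(read_to_ko, read_to_taxon):
    """Count reads per (KO, taxon) pair.

    Reads that have a KO but no taxon assignment are counted
    under the label 'unclassified'.
    """
    counts = Counter()
    for read, ko in read_to_ko.items():
        taxon = read_to_taxon.get(read, "unclassified")
        counts[(ko, taxon)] += 1
    return counts

# Hypothetical toy assignments (not real MEGAN output).
read_to_ko = {"r1": "K00001", "r2": "K00001", "r3": "K00002", "r4": "K00001"}
read_to_taxon = {
    "r1": "Bacteroides fragilis",
    "r2": "Bacteroides fragilis",
    "r3": "Bacteroides vulgatus",
}

table = stratify_ko_by_taxon(read_to_ko, read_to_taxon)
for (ko, taxon), n in sorted(table.items()):
    print(f"{ko}|{taxon}\t{n}")
```

The printed rows use a `KO|taxon` label followed by a read count, so each KO is split into its per-taxon contributions plus an unclassified remainder.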
For example, HUMAnN reports gene family abundances and, for each functional unit, also lists the taxa it can be assigned to. Here is an example of the HUMAnN gene families output file:
    # Gene Family	$SAMPLENAME_Abundance-RPKs
    UNMAPPED	187.0
    UniRef50_unknown	150.0
    UniRef50_unknown|g__Bacteroides.s__Bacteroides_fragilis	150.0
    UniRef50_A6L0N6: Conserved protein found in conjugate transposon	67.0
    UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_fragilis	57.0
    UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_finegoldii	5.0
    UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_stercoris	4.0
    UniRef50_A6L0N6: Conserved protein found in conjugate transposon|unclassified	1.0
    UniRef50_O83668: Fructose-bisphosphate aldolase	60.0
    UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_vulgatus	31.0
    UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_thetaiotaomicron	22.0
    UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_stercoris	7.0
- This file details the abundance of each gene family in the community. Gene families are groups of evolutionarily-related protein-coding sequences that often perform similar functions.
- Gene family abundance at the community level is stratified to show the contributions from known and unknown species. Individual species’ abundance contributions sum to the community total abundance.
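The stratification described above can be checked on the example rows with a short script. This is only a sketch: it assumes the file is tab-separated with a `|` between gene family and taxon, as in the example, and the function name `parse_stratified` is my own, not part of HUMAnN.

```python
from collections import defaultdict

# Example rows copied from the HUMAnN output shown above (tab-separated).
EXAMPLE = """\
# Gene Family\t$SAMPLENAME_Abundance-RPKs
UNMAPPED\t187.0
UniRef50_unknown\t150.0
UniRef50_unknown|g__Bacteroides.s__Bacteroides_fragilis\t150.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon\t67.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_fragilis\t57.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_finegoldii\t5.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|g__Bacteroides.s__Bacteroides_stercoris\t4.0
UniRef50_A6L0N6: Conserved protein found in conjugate transposon|unclassified\t1.0
UniRef50_O83668: Fructose-bisphosphate aldolase\t60.0
UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_vulgatus\t31.0
UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_thetaiotaomicron\t22.0
UniRef50_O83668: Fructose-bisphosphate aldolase|g__Bacteroides.s__Bacteroides_stercoris\t7.0
"""

def parse_stratified(text):
    """Split rows into community totals and per-taxon contributions."""
    totals = {}                 # gene family -> community abundance
    strata = defaultdict(dict)  # gene family -> {taxon: abundance}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header
        feature, value = line.rsplit("\t", 1)
        abundance = float(value)
        if "|" in feature:
            family, taxon = feature.split("|", 1)
            strata[family][taxon] = abundance
        else:
            totals[feature] = abundance
    return totals, strata

totals, strata = parse_stratified(EXAMPLE)

# Per-taxon contributions should add up to the community total.
for family, by_taxon in strata.items():
    assert abs(sum(by_taxon.values()) - totals[family]) < 1e-9
    print(family, totals[family], by_taxon)
```

A table in this shape, with KEGG KOs in place of UniRef50 families, is what I would like to get out of MEGAN.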
Thank you! Best regards, Bernhard