I would like to understand the co-occurrence analysis in more depth.
Does the tool take into account the number of reads mapping to each taxon when deciding whether two taxa co-exist or oppose each other’s presence? Can the analysis be decoded?
Also, since we are starting the analysis from RNA-Seq data, the reads are mRNAs from the microbiome. Does the number of reads mapping to a particular species mean that the species is over-represented, or just that it is metabolically active? Kindly clarify these two doubts.
A graph is set up in which each class is represented by a node.
The edge between two nodes a and b is assigned a number, which depends on the “method” chosen.
For example, Jaccard assigns the Jaccard index, which ranges from 0, if classes a and b do not co-occur in any sample, to 1, if they always appear together. Other methods are Pearson's r and Kendall's tau.
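To make the Jaccard case concrete, here is a minimal sketch of the standard Jaccard index computed on the sets of samples in which two classes occur. It is not taken from the tool's source code; the function name, the class names and the sample names are made up for illustration, and treating two classes that occur nowhere as 0 is an assumption.

```python
def jaccard_index(samples_a, samples_b):
    """Standard Jaccard index of two sets of sample names: |A & B| / |A | B|."""
    a, b = set(samples_a), set(samples_b)
    union = a | b
    if not union:
        # Assumption: if neither class occurs in any sample, report 0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical presence data: the samples in which each class "occurs".
class_a = {"s1", "s2", "s3"}
class_b = {"s2", "s3", "s4"}

# Two shared samples out of four samples in total.
score = jaccard_index(class_a, class_b)
print(score)
```

Note that only presence or absence per sample enters this number, not how many reads each class received.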
The calculation has several parameters:
minThreshold: minimum percentage of a sample's reads that a class must attain to be considered “occurring in the sample”.
minProbability: minimum co-occurrence probability in percent. This is used as a threshold on the edge value to decide whether the edge is shown.
minPrevalence: minimum number of samples that a class must occur in before it is shown (as a node) in the graph.
maxPrevalence: maximum number of samples that a class may occur in and still be included in the graph.
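The way these four parameters interact can be sketched roughly as follows. This is only an illustration of the description above using the Jaccard method, not the tool's actual implementation; the function name, the input layout (read counts per sample and class), the default values and the example data are all assumptions.

```python
from itertools import combinations


def cooccurrence_edges(counts, min_threshold=1.0, min_probability=50.0,
                       min_prevalence=2, max_prevalence=None):
    """counts: {sample: {class_name: read_count}} (hypothetical layout).

    Returns {(class_a, class_b): jaccard_index} for the edges that are shown.
    """
    # minThreshold: a class "occurs" in a sample if it attains at least
    # this percentage of the sample's reads.
    occurs = {}
    for sample, class_counts in counts.items():
        total = sum(class_counts.values())
        if total == 0:
            continue
        for cls, n in class_counts.items():
            if 100.0 * n / total >= min_threshold:
                occurs.setdefault(cls, set()).add(sample)

    # minPrevalence / maxPrevalence: keep a class as a node only if the
    # number of samples it occurs in lies within these bounds.
    if max_prevalence is None:
        max_prevalence = len(counts)
    nodes = sorted(c for c, s in occurs.items()
                   if min_prevalence <= len(s) <= max_prevalence)

    # minProbability: show an edge only if its value, in percent,
    # reaches this threshold.
    edges = {}
    for a, b in combinations(nodes, 2):
        score = len(occurs[a] & occurs[b]) / len(occurs[a] | occurs[b])
        if 100.0 * score >= min_probability:
            edges[(a, b)] = score
    return edges


# Hypothetical read counts for three samples and three classes.
counts = {
    "s1": {"A": 50, "B": 45, "C": 5},
    "s2": {"A": 60, "B": 40, "C": 0},
    "s3": {"A": 0, "B": 10, "C": 90},
}
edges = cooccurrence_edges(counts)
print(edges)
```

In this sketch the read counts are used only to decide, via minThreshold, whether a class counts as present in a sample; beyond that, the edge value depends on presence and absence alone. Whether the same holds for the Pearson and Kendall methods is not covered here.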
So this really is a visualization technique, and its interpretation depends on the method chosen and the parameters set.