Hello MEGAN team and community,
I am using MEGAN Community Edition 6.18.4 (January 29, 2020) on Mac OS 10.15.
I am comparing 77 individual rma6 files using the “compare” function. This is a case/control study, so, following advice from a previous post on this community, I created an attribute in the ‘sample viewer’ that codes 0 or 1 for ‘control’ or ‘case’, respectively. I then do two things:
1) ‘Color by attribute’, so that I have only two colors, one for cases and one for controls.
2) ‘Sort by attribute’, so that in the ‘bar chart’ view all of the cases appear next to one another and all of the controls appear next to one another.
I have noticed that this appears to change my underlying data: the read counts end up assigned to different samples.
Specifically, if I look at the bar chart of read counts at a specific node (say, “Guillardia”) in a MEGAN file where I have done none of the above attribute work, I see a certain pattern (e.g. sample “01.h38au.” has 77 reads assigned to Guillardia, sample “02.h38au.” has 0 reads assigned to Guillardia, etc.). I assume these are accurate.
After doing steps 1 and 2 above and viewing the bar chart again, I now see that the counts have been permuted across the 77 samples: e.g. sample “05.h38au” now has 77 reads, and sample “01.h38au” no longer has 77 reads.
I have attached a screenshot of the two bar charts.
The top chart was produced after doing steps 1 and 2 and then re-sorting the rows into the order in which they were originally loaded, using the “sort rows” function in the sample viewer. The bottom chart is from a new MEGAN session in which the exact same data was read in (in the same order) and the bar chart was produced immediately, with no attributes added.
I have highlighted that the unique maximum read count (77) is associated with two different samples, depending on which plot you look at. I expected the two plots to look exactly the same apart from the coloring, but they do not.
Thank you to anyone who can provide some insight into this problem,
PS: A second example using another node