Extracting Bacteria (taxonomy ID=2)

hughesd · September 22, 2022, 11:07am

We have been using the commands below to extract Bacteria for further functional analysis in Megan, but we find that small numbers of eukaryote reads are also extracted. Is this expected?
We worked around this by running further extractions on the “extracted” files. Is there a better way?

select id=2;
select nodes=subTree;
extract what=document file=<filename.rma> data=Taxonomy ids=selected includeCollapsed=true;
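
For anyone scripting this, here is a minimal sketch that only assembles the three commands above as text for a given taxon ID. The function name `build_extract_commands` and the output file name `bacteria-only.rma` are hypothetical, chosen just for illustration. It does not run Megan itself.

```python
def build_extract_commands(taxon_id: int, out_file: str) -> list[str]:
    """Return the three Megan commands from the post, filled in for one taxon.

    taxon_id: NCBI taxonomy ID to select (2 = Bacteria in the post).
    out_file: hypothetical output file name; substitute your own.
    """
    return [
        f"select id={taxon_id};",
        "select nodes=subTree;",
        f"extract what=document file='{out_file}' "
        "data=Taxonomy ids=selected includeCollapsed=true;",
    ]


if __name__ == "__main__":
    # Print the commands so they can be pasted into a command file.
    print("\n".join(build_extract_commands(2, "bacteria-only.rma")))
```

The same helper can be reused with id=33090 for the Viridiplantae case discussed later in the thread.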

eliasb · October 7, 2022, 7:25am

For this purpose, I have used the daa-meganizer tool. You can find it in megan/tools/daa-meganizer.
Use the -cf option and provide a .txt file listing the groups to exclude from your analyses (one group per line). In your case, that would be a text file containing two lines: Eukaryota and Viruses.
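
As a rough sketch, the exclusion file described above could be written like this. The file name `exclude_taxa.txt` and the helper `write_exclusion_file` are assumptions for illustration; only the one-group-per-line layout comes from the post.

```python
from pathlib import Path


def write_exclusion_file(path: str, groups: list[str]) -> None:
    """Write one group name per line, as described for the -cf file."""
    Path(path).write_text("\n".join(groups) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical file name; the two groups are the ones named in the post.
    write_exclusion_file("exclude_taxa.txt", ["Eukaryota", "Viruses"])
```

The resulting file has exactly two lines, Eukaryota and Viruses, and is the file you would pass with -cf.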

Daniel · October 23, 2022, 4:13pm

This shouldn’t happen. Could you let me know which eukaryotic taxa are showing up? I’m wondering whether these are eukaryotic taxa that share a name with prokaryotic taxa…

Daniel · October 24, 2022, 5:03am

I recently fixed a bug related to this problem and the fix will be available in the next release (later today).

hughesd · December 14, 2022, 10:18am

We are now using Megan v6.24.11, but extraction still does not work as expected.

Our latest example is a meganised dataset of some 211 million reads, of which 8458992 are “summed” [assigned] to Viridiplantae [taxon=33090].

We then run these commands:

select id=33090;
select nodes=subTree;
extract what=document file='…' data=Taxonomy ids=selected includeCollapsed=true;

The root of the output tree has 8458992 “summed” reads, but only 8455712 are Viridiplantae. The others now appear elsewhere, mainly as Bacteria or Opisthokonta.
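
To put a number on the discrepancy, here is a small sketch using only the two read counts quoted above. The variable names are just for illustration.

```python
# Read counts quoted in the post.
root_summed = 8_458_992      # "summed" reads at the root of the output tree
viridiplantae = 8_455_712    # reads that are still assigned to Viridiplantae

# Reads that ended up outside Viridiplantae after extraction.
misplaced = root_summed - viridiplantae
fraction = misplaced / root_summed

print(f"{misplaced} reads ({fraction:.4%} of the extracted set) fall outside Viridiplantae")
```

So 3280 reads, well under 0.1% of the extracted set, end up under other taxa.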

Grateful for your help and advice.

hughesd · March 3, 2023, 10:25am

We would be very grateful for any suggestions about this problem.

Daniel · March 16, 2023, 5:55pm

To help me debug this, could you please open the inspector window on one of the unexpected taxonomic nodes, copy out a couple of reads together with all of their alignments, and send them to me?
