Filtering meganized DAA files when running comparison

We have a large project with 75 samples run through DIAMOND that we want to use in a comparative analysis. One issue is that the DAA files are very large: they come from deeply sequenced, host-filtered reads aligned with the ‘--top 5’ flag and no compression. The files have also been meganized.

We would like to filter these data further before combining them and running the analysis in MEGAN. Would you have any recommendations? As a note, ‘diamond view’ looks like it could be used to filter at least the original (pre-meganized) DIAMOND runs, since it supports ‘--max-target-seqs’ and ‘--top’, but it doesn’t currently appear to support DAA output.
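
To make the kind of filtering I mean concrete, here is a minimal Python sketch that applies a ‘top’-style percentage cutoff to tabular alignments after the fact. It assumes the hits have already been exported as 12-column BLAST tabular text with the query ID in column 1 and the bit score in column 12, and that "top 5" means "keep hits scoring within 5% of the best hit for that read". The function name `filter_top_percent` and the demo rows are hypothetical, and this is not how DIAMOND itself implements the option. The result is tabular text, not a DAA file.

```python
import csv
import io
from collections import defaultdict


def filter_top_percent(rows, top_percent=5.0):
    """Keep hits whose bit score is within top_percent of the best hit per query.

    rows: iterable of 12-column BLAST tabular records (lists of strings).
    Assumes column 1 is the query ID and column 12 is the bit score.
    """
    rows = list(rows)

    # First pass: find the best bit score seen for each query.
    best = defaultdict(float)
    for row in rows:
        best[row[0]] = max(best[row[0]], float(row[11]))

    # Second pass: keep only hits at or above the per-query cutoff.
    fraction = 1.0 - top_percent / 100.0
    return [row for row in rows if float(row[11]) >= best[row[0]] * fraction]


# Hypothetical demo data: two reads, tab-separated, 12 columns each.
demo = (
    "read1\tprotA\t90.0\t50\t5\t0\t1\t150\t1\t50\t1e-20\t100\n"
    "read1\tprotB\t88.0\t50\t6\t0\t1\t150\t1\t50\t1e-19\t96\n"
    "read1\tprotC\t70.0\t50\t15\t0\t1\t150\t1\t50\t1e-10\t80\n"
    "read2\tprotD\t95.0\t50\t2\t0\t1\t150\t1\t50\t1e-25\t120\n"
)

records = csv.reader(io.StringIO(demo), delimiter="\t")
kept = filter_top_percent(records, top_percent=5.0)
for row in kept:
    print("\t".join(row))
```

In this demo the 80-bit hit for read1 is dropped and the other three hits are kept. For files of our size the rows would need to be streamed per query rather than loaded into memory at once.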

Looks like Benjamin is working on adding that feature to DIAMOND :wink:

Yes, fingers crossed :). I’m guessing we will need to re-meganize afterwards, but we will test this out once it’s implemented.