Blast2rma vs daa-meganizer vs daa2rma

gregorykrice · March 19, 2021, 8:04pm

Hi Daniel,

I use diamond blastx on our interleaved paired sequencing reads to produce a blast tab (output format 6) file and then use blast2rma to produce the rma6 file. Could you describe the advantages/differences of using:

diamond blastx --outfmt 100 --> daa-meganizer

or

diamond blastx --outfmt 100 --> daa2rma

daa-meganizer doesn’t take the reads as an argument. Does that mean that you can’t directly extract the classified reads from within MEGAN?

Thanks,
Greg

Daniel · April 28, 2021, 8:32am

Dear Greg,

Please use daa-meganizer; it is much faster than daa2rma.

The difference:

daa-meganizer processes the DAA file and appends the result of the processing to the end of that same DAA file, so no new file is generated. The DAA file already contains all aligned reads, which is why you do not specify a reads file when using it.

daa2rma produces a new file in RMA format. This is a slow process (designed for much smaller files than we have today). All alignments are extracted from the DAA file and then saved in the RMA file. This process also requires that the reads are available in a separate file (the code doesn’t grab the reads from the DAA file).

So, use meganization: it is faster, the resulting file (the extended DAA file) is much smaller than the RMA file, and you can extract reads from it.
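
As a rough illustration of the recommended workflow, the sketch below builds the two command lines (DIAMOND alignment to DAA, then in-place meganization) and prints them without running anything. The file names (reads.fastq, nr.dmnd, sample.daa, megan-map.db) are hypothetical placeholders, and the flags shown are the commonly used ones; check them against `diamond help` and `daa-meganizer -h` for your installed versions.

```python
import shlex


def diamond_blastx_cmd(reads, db, daa_out):
    """Build a DIAMOND blastx command that writes DAA output (format 100)."""
    return [
        "diamond", "blastx",
        "--query", reads,      # input reads (hypothetical file name)
        "--db", db,            # DIAMOND database (hypothetical file name)
        "--out", daa_out,      # DAA file to create
        "--outfmt", "100",     # 100 = DAA, the format daa-meganizer expects
    ]


def meganize_cmd(daa_file, mapping_db):
    """Build a daa-meganizer command; the result is appended to daa_file itself."""
    return [
        "daa-meganizer",
        "-i", daa_file,        # DAA file to meganize in place; no reads file is passed
        "-mdb", mapping_db,    # MEGAN mapping database (assumed name: megan-map.db)
    ]


if __name__ == "__main__":
    # Print the commands for inspection; nothing is executed here.
    steps = [
        diamond_blastx_cmd("reads.fastq", "nr.dmnd", "sample.daa"),
        meganize_cmd("sample.daa", "megan-map.db"),
    ]
    for cmd in steps:
        print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`. After the second step, sample.daa itself is the file to open in MEGAN.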