I am very interested in using software such as Diamond and MEGAN
to process my metagenomics data.
The data are not 16S rRNA but cover all genes.
I wonder if you could help me with the following three questions:
- I have paired-end data. Is it good practice to merge the paired reads
before running blastx via Diamond, or should I run Diamond on each
read file separately?
- If I run Diamond on them separately, what is the best way to feed
the results into daa2rma or daa-meganizer? I notice that both tools have
options for paired-end data, but I am not sure that I
understand their usage, in particular the parameter “-ps
For the R1 file:
diamond blastx --db nr --query sample1_R1.fq --out sample1.R1.daa --threads 24 --outfmt 100
For the R2 file:
diamond blastx --db nr --query sample1_R2.fq --out sample1.R2.daa --threads 24 --outfmt 100
Now we have two DAA files, sample1.R1.daa and
sample1.R2.daa. What is the appropriate way to provide them to daa2rma?
I have tried converting those two DAA files into BLAST tabular format,
and it appears that the resulting tabular files do not
distinguish read 1 from read 2 of the same pair: the two mates
are given exactly the same name in those files.
I therefore suspect that the DAA files themselves do not distinguish the two
reads of a pair by read name.
- Would you recommend daa2rma or daa-meganizer for
processing the data before loading them into MEGAN?
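
To make the read-name observation in the second question concrete, here is a minimal sketch in plain shell. It is an illustration only, not a recommended pipeline. It assumes the tabular files are tab-separated with the read name in the first column, and that the FASTQ files use standard 4-line records. The function names shared_ids and tag_mates and the /1 and /2 suffixes are my own hypothetical choices; I do not know whether daa2rma or daa-meganizer expects such suffixes.

```shell
# Hypothetical helper: count read names that occur in the first column
# of both tab-separated files (e.g. tabular output for R1 and for R2).
shared_ids() {
  tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
  cut -f1 "$1" | LC_ALL=C sort -u > "$tmp1"
  cut -f1 "$2" | LC_ALL=C sort -u > "$tmp2"
  # comm -12 keeps only the lines common to both sorted lists
  LC_ALL=C comm -12 "$tmp1" "$tmp2" | wc -l | tr -d ' '
  rm -f "$tmp1" "$tmp2"
}

# Hypothetical helper: read 4-line FASTQ records on stdin and rewrite each
# header so that the read ID (first whitespace-delimited token) ends in the
# given suffix. Any existing /1 or /2 is replaced; the rest of the header
# line is dropped. Sequence, separator and quality lines pass through as-is.
tag_mates() {
  awk -v suf="$1" '
    NR % 4 == 1 { id = $1; sub(/\/[12]$/, "", id); print id suf; next }
    { print }
  '
}
```

For example, shared_ids applied to the two tabular files would report how many read names appear in both, and something like "tag_mates /1 < sample1_R1.fq" (and "tag_mates /2" for the R2 file) would give the two mates different names before running Diamond.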