Malt build wrong mapping file

ottocla · November 21, 2017, 4:51pm

Hi everyone,
I am new to the Megan community and have only recently started using Malt. I am building an index for the nr.gz database. I accidentally gave a wrong path for my mapping file, so the build completed but the table (the mapping file) was of course not included. I was wondering if there is a way to re-run the analysis by just supplying the correct table on top of the files already created in the first run, without restarting the rather long process from scratch. Is there a command that can do that, or should I run malt-build from scratch with the correct mapping file?

Many thanks in advance for your assistance.
best
Claudio

Daniel · November 23, 2017, 4:17pm

Dear Claudio,

First, if you are using the protein nr.gz database, then please use DIAMOND rather than MALT, as it is faster and uses much less memory.

What type of output did you create with MALT? The mapping files are only required if you generate RMA format straight out of MALT. If you didn’t do that, then you are fine, because the mapping files are not required until you import the alignment files into MEGAN.
If you do have RMA files, then the workaround would be to load each RMA file into MEGAN, use the File->Export->Matches menu item to write all alignments to a file, and then reimport that alignment file into MEGAN, specifying the mapping files at that point.

ottocla · November 23, 2017, 5:29pm

Hi Daniel,

I still haven’t started the actual Malt run (in BlastX mode), but the idea is indeed to generate RMA files directly and analyse them with Megan, so in that case I’ll use the workaround you suggested. However, yesterday I ran Diamond on a reduced dataset of my reads (vs nr.gz) and it is indeed strikingly faster (from makedb to blastx and daa2rma) and runs smoothly on the lower-memory nodes of our cluster. I initially decided to use Malt because I have ancient samples and I see that it has lately been used in some aDNA metagenomics projects, with either nr.gz or nt.gz.

I think I’ll end up running Diamond with nr.gz on all my samples first. Just two quick questions: does Malt add much in terms of sensitivity compared with Diamond? Would it be worth running Malt (again with nr.gz) only on some selected samples after Diamond? (Or maybe Malt is more worthwhile with nt.gz, but in that case I guess I’ll run into memory limits…)

thanks a lot for your help!
Claudio

Daniel · November 23, 2017, 7:00pm

MALT was released before DIAMOND, and its BLASTX mode was used in some studies focusing on ancient DNA. MALT is still used for ancient DNA work, but usually only in BLASTN mode.
For BLASTX alignment, DIAMOND works better than MALT, and there is no loss of sensitivity.