How to run blast2lca multi-threaded, or is there a way to make it faster?


I would like to use MEGAN on the command line, and I am trying blast2lca from the megan package on Bioconda (MEGAN Community Edition version 6.21.7). It worked fine when I passed a small BLAST tabular file (about 10,000 sequences, 200 MB) as a trial. However, when I passed a larger file (about 10M sequences, 12 GB), blast2lca did not finish even after 10 days. I thought I might need to enable multi-threading, but I couldn’t find such an option in the help message displayed by blast2lca -h.
Can we specify the number of threads for blast2lca? Or is there any way to make the process faster?

The blast2lca program is not parallelized, unfortunately…
Moreover, it is most likely not the program that you want to use. If you want to import data into MEGAN, the best way is to run DIAMOND so that it produces a DAA file (specify output format 100) and then to meganize that file. Or, if you already have a BLAST file or similar, use either the blast2rma tool or MEGAN itself to import the file into RMA format.
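
For scripting, the DIAMOND-then-meganize route described above might look roughly like the sketch below. It only builds and prints the two command lines, it does not run them. The helper name, the file names, the choice of blastx, and the thread count are placeholders, and the daa-meganizer flags (-i, -mdb) are assumptions that should be checked against daa-meganizer -h for your installed version.

```python
import shlex


def diamond_to_meganized_daa(query, diamond_db, daa_out, megan_map_db,
                             mode="blastx", threads=8):
    """Return argv lists for (1) DIAMOND writing a DAA file, (2) meganizing it.

    Hypothetical helper: all file names are placeholders supplied by the caller.
    """
    diamond_cmd = [
        "diamond", mode,            # blastx for DNA reads, blastp for proteins
        "--db", diamond_db,
        "--query", query,
        "--out", daa_out,
        "--outfmt", "100",          # format 100 = DAA
        "--threads", str(threads),  # DIAMOND itself is multi-threaded
    ]
    # Assumed flags; verify with `daa-meganizer -h`.
    meganize_cmd = ["daa-meganizer", "-i", daa_out, "-mdb", megan_map_db]
    return [diamond_cmd, meganize_cmd]


if __name__ == "__main__":
    for cmd in diamond_to_meganized_daa(
        "reads.fasta", "nr.dmnd", "reads.daa", "megan-map.db"
    ):
        print(shlex.join(cmd))
```

The printed lines can be pasted into a shell script, or each list can be passed to subprocess.run once the flags have been confirmed.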

Thank you for your response!

I’m sorry I didn’t explain clearly enough what I wanted to do. I want to get a list of taxonomic annotations for my sequences (e.g., as a tab-delimited file) without launching the GUI app, so that I can process them automatically with a shell script.
If there is a better method for such a case, I would appreciate it if you could let me know.

Then you need to use daa2info or rma2info to extract such information from a meganized DAA file or an RMA file, respectively.
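
As a rough sketch of how that could be automated: the snippet below builds a daa2info command line and tallies reads per taxon from its output. It assumes that -r2c Taxonomy writes one tab-separated line per read (read name, then taxon), and that -i and -o name the input and output files; please confirm both with daa2info -h. The function names and file names are made up for illustration.

```python
from collections import Counter


def daa2info_read_to_taxon_cmd(daa_file, out_file):
    """Build a daa2info call that lists the taxon assigned to each read.

    Flags are assumptions; check `daa2info -h` (rma2info should be analogous).
    """
    return ["daa2info", "-i", daa_file, "-o", out_file, "-r2c", "Taxonomy"]


def count_reads_per_taxon(lines):
    """Count reads per taxon from lines of the form 'read<TAB>taxon'."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        counts[fields[1].strip()] += 1
    return counts


if __name__ == "__main__":
    print(" ".join(daa2info_read_to_taxon_cmd("reads.daa", "reads2taxon.txt")))
    # Made-up example rows standing in for the tool's output.
    example = ["readA\t562\n", "readB\t562\n", "readC\t1280\n"]
    for taxon, n in count_reads_per_taxon(example).most_common():
        print(f"{taxon}\t{n}")
```

In a real run, the list of example rows would be replaced by an open file handle on the output file.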

Sorry for my late reply.
Actually I haven’t tried those yet, but daa2info and rma2info seem to be exactly what I needed!

Thank you.