I would like to use MEGAN on the command line, and I am trying blast2lca from the megan package on bioconda (MEGAN Community Edition version 6.21.7). It worked fine in a trial with a small BLAST tabular file (about 10,000 sequences, 200 MB). However, when I passed a larger file (about 10M sequences, 12 GB), blast2lca did not finish even after 10 days. I thought I might need to specify multiple threads, but I couldn’t find such an option in the help message displayed by blast2lca -h.
Can we specify the number of threads for blast2lca? Or is there any way to make the process faster?
The blast2lca program is not parallelized, unfortunately…
Moreover, it is most likely not the program that you want to use. If you want to import data into MEGAN, the best way is to run DIAMOND with output format 100 (--outfmt 100) so that it produces a DAA file, and then to meganize that file. Or, if you already have a BLAST file or similar, use either the blast2rma tool or MEGAN itself to import the file into RMA format.
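For a scripted pipeline, the DIAMOND step might be driven from Python roughly as follows. This is only a sketch: the file names are hypothetical, the blastx mode is an assumption (protein queries would need blastp), and the helper name diamond_daa_command is made up for illustration. The meganizing step (for example with the daa-meganizer tool shipped with MEGAN) is not shown, and its options should be checked in that tool's own help output.

```python
import shlex


def diamond_daa_command(query, database, out_daa, mode="blastx"):
    """Build a DIAMOND command line that writes a DAA file (output format 100).

    Hypothetical helper: it only assembles the argument list, it does not
    run DIAMOND or check that the files exist.
    """
    return [
        "diamond", mode,
        "--query", str(query),      # input reads (FASTA/FASTQ)
        "--db", str(database),      # DIAMOND-formatted reference database
        "--out", str(out_daa),      # DAA output file to be meganized later
        "--outfmt", "100",          # 100 = DIAMOND alignment archive (DAA)
    ]


if __name__ == "__main__":
    # Placeholder file names; print the command instead of executing it.
    cmd = diamond_daa_command("reads.fasta", "nr.dmnd", "reads.daa")
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the paths point at real files.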
I’m sorry I didn’t explain clearly what I wanted to do. I want to get a list of taxonomic annotations for the sequences (e.g., as a tab-delimited file) without launching the GUI app, so that I can process them automatically with a shell script.
If there is a better method for such a case, I would appreciate it if you could let me know.
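Since blast2lca itself is single-threaded, one possible workaround in a scripted setting is to split the large BLAST tabular file into chunks and run one blast2lca process per chunk. The sketch below assumes plain tabular output without comment lines, with all hits for a query on consecutive lines and the query ID in the first column. It also assumes that blast2lca assigns each query independently of the others, which has not been verified here. The function name split_blast_tab and the chunk file names are made up for illustration.

```python
import itertools
from pathlib import Path


def split_blast_tab(path, out_dir, queries_per_chunk=100_000):
    """Split a tabular BLAST file into chunks, keeping each query's hits together."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    out = None
    n_queries = 0
    with open(path) as handle:
        # Consecutive lines sharing the first column belong to the same query.
        groups = itertools.groupby(handle, key=lambda line: line.split("\t", 1)[0])
        for _, lines in groups:
            # Start a new chunk file every `queries_per_chunk` queries.
            if n_queries % queries_per_chunk == 0:
                if out is not None:
                    out.close()
                chunk_path = out_dir / f"chunk_{len(chunk_paths):04d}.tab"
                chunk_paths.append(chunk_path)
                out = open(chunk_path, "w")
            out.writelines(lines)
            n_queries += 1
    if out is not None:
        out.close()
    return chunk_paths
```

Each chunk can then be handed to its own blast2lca process from the shell script, and the per-chunk outputs concatenated afterwards. The file is streamed line by line, so the 12 GB input does not need to fit in memory.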