Bash Running scripts for DIAMOND and DAA-MEGANIZER

Here is my script for running DIAMOND on a server; please see the DIAMOND help for more details:

# Require exactly two arguments: the query file and the output file.
if [ $# != 2 ]; then
    echo "Usage: infile outfile"
    exit 1
fi
# Print the command for the log, then run it.
echo "time ~/software/diamond blastx -b 20 -t /dev/shm --db /abprojects/daniel/nr/nr-Jan2017/nr --out $2 --outfmt 100 --query $1"
time ~/software/diamond blastx -b 20 -t /dev/shm --db /abprojects/daniel/nr/nr-Jan2017/nr --out "$2" --outfmt 100 --query "$1"

Here is my script for running DAA-MEGANIZER for short reads on a server; please see daa-meganizer -h for more details:

# Require exactly one argument: the DAA file to meganize.
if [ $# != 1 ]; then
    echo "Usage: meganize daa-file"
    exit 1
fi
# $mapping must point to the directory that holds the mapping files.
input="-i $1"
options="-v -fun EGGNOG KEGG INTERPRO2GO SEED --parseTaxonNames false"
tax="-a2t $mapping/prot_acc2tax-Nov2016X.abin"
kegg="-a2kegg $mapping/acc2kegg-Nov2016X-ue.abin"
eggnog="-a2eggnog $mapping/acc2eggnog-Nov2016X.abin"
interpro="-a2interpro2go $mapping/acc2interpro-Nov2016XX.abin"
seed="-a2seed $mapping/acc2seed-May2015XX.abin"
# The variables are left unquoted on purpose so that each expands into separate arguments.
~/software/megan6ue/tools/daa-meganizer $input $options $tax $kegg $eggnog $interpro $seed

This may seem crazy simple, but is there a way to simply have the bash script then save/export the meganized summary files (ideally) into a new sub-directory, without having to call JavaApplicationStub?

I would love to iterate through 40 samples with daa-meganizer, then save the megan summary files for export to my computer/hard drive…


Once you have a meganized-DAA file, you can run the daa2info tool to extract a MEGAN summary file.
Or you can use the compute-comparison tool to compute a comparison file for multiple meganized-DAA files.
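
To illustrate the first option for many samples, here is a minimal sketch that writes one summary file per meganized DAA file into a sub-directory. The function name, the DAA2INFO variable, the default tool path and the .megan suffix are my own choices for illustration. The -es (extract summary file) option is what I believe daa2info uses for this, so please confirm it with daa2info -h on your installation.

```shell
#!/usr/bin/env bash
# Sketch: extract a MEGAN summary file from every meganized DAA file in a directory.
# Assumptions (not from this thread): the DAA2INFO variable, the function name,
# the ".megan" output suffix, and the -es option (verify with `daa2info -h`).

DAA2INFO="${DAA2INFO:-megan_app/tools/daa2info}"

export_summaries() {
    local in_dir="$1" out_dir="$2" file base
    mkdir -p "$out_dir" || return 1
    for file in "$in_dir"/*.daa; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$file" ] || continue
        base="$(basename "$file" .daa)"
        # One summary per sample, named after the input file.
        "$DAA2INFO" -i "$file" -es "$out_dir/$base.megan" || return 1
    done
}
```

A call such as export_summaries "$INPUT_DIR" "$INPUT_DIR/summaries" would then leave all summaries in one place, ready to copy to another machine.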


Thanks for the reply, Daniel. I eventually did it the old-fashioned way… clicking through lots of files and exporting MEGAN summaries.

That said, I had written a little bash script to meganize 40+ .daa files on a compute cluster. Had I known about daa2info and compute-comparison, would the same approach have worked, e.g. by replacing daa-meganizer with daa2info?

Thanks much!

#!/usr/bin/env bash

#SBATCH --job-name=meganizing
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=64
#SBATCH --time=12:00:00
#SBATCH --output=/projects/soil_ecology/dr359/UM/samsa2/output_files/meganize_out
#SBATCH --chdir=/projects/soil_ecology/dr359/UM/samsa2

source "bash_scripts/lib/"



# Meganize every .daa file in $INPUT_DIR, if there are any.
if ls "$INPUT_DIR"/*.daa &>/dev/null; then
    for file in "$INPUT_DIR"/*.daa; do
        megan_app/tools/daa-meganizer -i "$file" -mdb megan/databases/megan-map-Feb2022-ue.db -t "$threads"
    done
fi

You always have to first run daa-meganizer because that program computes the classifications of the reads. Only then can you use daa2info to extract summaries or compute-comparison to compute a comparison of samples.
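
The required order can be sketched as a small function that meganizes every file and only afterwards builds the comparison. This is an illustration under assumptions, not a tested pipeline: the function name and the MEGAN_TOOLS variable are hypothetical, and the -i and -o options of compute-comparison are given as I understand them, so please check compute-comparison -h.

```shell
#!/usr/bin/env bash
# Sketch: meganize first, then compare.
# Assumptions (not from this thread): the MEGAN_TOOLS variable, the function name,
# and the -i/-o options of compute-comparison (verify with `compute-comparison -h`).

MEGAN_TOOLS="${MEGAN_TOOLS:-megan_app/tools}"

meganize_then_compare() {
    if [ $# -lt 3 ]; then
        echo "Usage: meganize_then_compare map-db out-file daa-file..." >&2
        return 1
    fi
    local map_db="$1" out_file="$2" file
    shift 2
    # Step 1: classify the reads in each DAA file (daa-meganizer updates the file in place).
    for file in "$@"; do
        "$MEGAN_TOOLS/daa-meganizer" -i "$file" -mdb "$map_db" || return 1
    done
    # Step 2: only now compute the comparison from the meganized files.
    "$MEGAN_TOOLS/compute-comparison" -i "$@" -o "$out_file"
}
```

For example: meganize_then_compare megan/databases/megan-map-Feb2022-ue.db comparison.megan "$INPUT_DIR"/*.daa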