I am trying to assign reads to functional categories using SEED and EggNog and used the approach described here, but with the ncbi env_nr instead of the nr database:
To work with a large number of large metagenomic shotgun samples, proceed as follows.
Let us assume that you have just received a hard-disk containing a collection of fastq.gz files, each representing one sample from your study.
Put these files into a directory called 00fastq.
Create a second directory called 10daa. This will contain DAA files generated by DIAMOND.
Create a third directory called 20rma. This will contain MEGAN RMA files.
For each file reads.fastq.gz in your 00fastq directory…
290 hits are found, but they are classified as “not assigned” in MEGAN, so I guess I am doing something wrong.
The commands I use:
diamond makedb --in env_nr.gz -p 10 -d eggnog
diamond blastx --query 00fastq/TestSample.fasta --db eggnog --daa 10daa/reads.daa
daa2rma -i 10daa/reads.daa -o 20rma/reads.rma --acc2eggnog acc2eggnog-Oct2016X.abin
daa-meganizer -i 10daa/reads.daa -a2eggnog acc2eggnog-Oct2016X.abin
Both give the same result:
Total reads: 290
With hits: 290
Assig. Taxonomy: 0
Assig. EGGNOG: 0
MinSupport set to: 1
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (0.3s)
Min-supp. changes: 0
10% 20% 100% (0.1s)
Class. Taxonomy: 1
Class. EGGNOG: 1
Am I using the wrong command, or is this a problem with the env_nr database?
I’ve never used env_nr.gz. What do the entries look like? Please send me the first 100 lines and I will check whether this is a mapping file problem.
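In case it helps, here is a minimal sketch of one way to pull out the first lines of a gzip-compressed file without unpacking all of it. The function name first_lines and the output file name are just placeholders for illustration, and it assumes env_nr.gz is a plain gzip-compressed FASTA file.

```shell
# Hypothetical helper: print the first N lines (default 100) of a
# gzip-compressed file without decompressing the whole file to disk.
first_lines() {
    gzip -dc "$1" | head -n "${2:-100}"
}

# Example call (assumes env_nr.gz is in the current directory):
#   first_lines env_nr.gz 100 > env_nr_first100.txt
```

The header lines (those starting with ">") are the interesting part, since the accessions in them are what the mapping file has to match.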
Hello! Where did you find env_nr.gz? If you download the env_nr database from the BLAST website, you get 3 separate env_nr.tar.gz files, each containing a bunch of files in different formats. Diamond needs a single .fa file for makedb, but env_nr doesn’t come like that. Any idea how to use env_nr with diamond? I tried some of the suggestions from here
https://www.uppmax.uu.se/resources/databases/diamond-protein-alignment-databases/ but they don’t work. Thanks!
Please post this question on the Diamond GitHub webpage.
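One approach that may work, offered only as a sketch: the blastdbcmd tool from NCBI BLAST+ can write every sequence of a preformatted BLAST database to a single FASTA file. This assumes that BLAST+ is installed, that all env_nr archives have been extracted into the current directory, and that the database is addressable under the name env_nr. The function name export_blastdb_fasta and the DRY_RUN switch are made up for this example.

```shell
# Hypothetical helper: export all sequences of a BLAST database to one FASTA file.
# Assumes blastdbcmd (NCBI BLAST+) is on the PATH and the database volumes
# are in the current directory.
export_blastdb_fasta() {
    db="$1"    # database name, e.g. env_nr
    out="$2"   # FASTA file to write, e.g. env_nr.fa
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it
        echo "blastdbcmd -db $db -entry all -out $out"
    else
        blastdbcmd -db "$db" -entry all -out "$out"
    fi
}

# Example call:
#   export_blastdb_fasta env_nr env_nr.fa
```

The resulting env_nr.fa could then be given to diamond makedb via --in, as in the first post. Whether the accessions in that file match the MEGAN mapping file is a separate question.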