Very few assigned reads in MEGAN

Hey guys,

I have some RNA-Seq reads that I want to BLAST and then use to reconstruct metabolic pathways. Since the RNA-Seq results are actually DNA sequences, I used the DIAMOND BLAST / MEGAN6 / SEED annotation workflow. But I am having an issue: each sample has ~20,000 aligned reads, yet fewer than a hundred of them get assigned in MEGAN. I have loaded the 2015 and 2016 GI mapping files (taxonomy) and the SEED/KEGG mapping files, but I still have the same problem. Any ideas, guys?

Some sample lines from the m8 files:

SRR317086.69 gi|958165026|gb|ALQ41808.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|929734199|gb|ALF26823.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|929726675|gb|ALF19302.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|657895453|ref|WP_029598743.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|552906288|ref|WP_023037794.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|511278547|ref|WP_016339635.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|499325845|ref|WP_011016337.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|496071367|ref|WP_008795874.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|492644446|ref|WP_005909615.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|492587891|ref|WP_005896039.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|492573108|ref|WP_005890553.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|19703702|ref|NP_603264.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5 SRR317086.69 gi|958165026|gb|ALQ41808.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|929734199|gb|ALF26823.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|929726675|gb|ALF19302.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|657895453|ref|WP_029598743.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|552906288|ref|WP_023037794.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|511278547|ref|WP_016339635.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|499325845|ref|WP_011016337.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|496071367|ref|WP_008795874.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|492644446|ref|WP_005909615.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|492587891|ref|WP_005896039.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|492573108|ref|WP_005890553.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.69 gi|19703702|ref|NP_603264.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1 SRR317086.500 
gi|983310289|ref|WP_060495679.1| 45.0 20 11 0 11 70 289 308 6.3e+00 25.0 SRR317086.500 gi|644872521|ref|WP_025374958.1| 45.0 20 11 0 11 70 289 308 6.3e+00 25.0 SRR317086.500 gi|511541261|ref|WP_016361212.1| 45.0 20 11 0 11 70 289 308 6.3e+00 25.0 SRR317086.521 gi|983310289|ref|WP_060495679.1| 45.0 20 11 0 66 7 289 308 6.3e+00 25.0 SRR317086.521 gi|644872521|ref|WP_025374958.1| 45.0 20 11 0 66 7 289 308 6.3e+00 25.0 SRR317086.521 gi|511541261|ref|WP_016361212.1| 45.0 20 11 0 66 7 289 308 6.3e+00 25.0 SRR317086.527 gi|983524772|ref|WP_060676525.1| 45.8 24 13 0 3 74 875 898 6.1e+00 25.0 SRR317086.527 gi|983310337|ref|WP_060495727.1| 45.8 24 13 0 3 74 875 898 6.1e+00 25.0 SRR317086.527 gi|958165889|gb|ALQ42671.1| 45.8 24 13 0 3 74 875 898 6.1e+00 25.0 SRR317086.527 gi|929732382|gb|ALF25006.1| 45.8 24 13 0 3 74 875 898 6.1e+00 25.0 SRR317086.527 gi|19703859|ref|NP_603421.1| 45.8 24 13 0 3 74 875 898 6.1e+00 25.0

Is this really the layout of the lines? The first field should be the read name, but in what you posted, the read name appears at the end of the previous line (except in the first line).

Thanks for the help! The format may have been messed up in the post. The lines actually look like this:

SRR317086.69 gi|958165026|gb|ALQ41808.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5

Is that good?

The format is fine. What reference database are you comparing against? Do you get more assignments using e.g. InterPro2GO?
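One more thing that may be worth checking in the m8 file itself is how many reads have a best bit score above the minimum score cutoff set in MEGAN. Below is a minimal Python sketch of that check. It assumes the standard 12-column BLAST tabular layout, with the read name in column 1 and the bit score in column 12, as in the lines posted above. The cutoff of 50 is an assumption about what your MEGAN import is using, so look it up in your LCA parameter settings rather than taking it from here. The function names are made up for this example.

```python
from collections import defaultdict

def best_bitscores(lines):
    """Return {read_name: best bit score} from BLAST tabular (m8) lines."""
    best = defaultdict(float)
    for line in lines:
        fields = line.split()
        if len(fields) != 12:
            continue  # skip blank or malformed lines
        read_name = fields[0]
        bit_score = float(fields[11])  # column 12 is the bit score
        best[read_name] = max(best[read_name], bit_score)
    return dict(best)

def count_passing(best, min_score=50.0):
    """Count reads whose best hit reaches min_score (assumed cutoff)."""
    return sum(1 for score in best.values() if score >= min_score)

# Three of the lines posted in this thread, one record per line.
sample = [
    "SRR317086.69 gi|958165026|gb|ALQ41808.1| 69.6 23 7 0 75 7 349 371 1.8e-02 33.5",
    "SRR317086.69 gi|958165026|gb|ALQ41808.1| 83.3 24 4 0 3 74 289 312 5.7e-06 45.1",
    "SRR317086.500 gi|983310289|ref|WP_060495679.1| 45.0 20 11 0 11 70 289 308 6.3e+00 25.0",
]

best = best_bitscores(sample)
print(count_passing(best), "of", len(best), "reads reach the assumed cutoff")
```

To run it on a whole file, pass an open file handle instead of the list, e.g. best_bitscores(open("sample.m8")), where the file name is a placeholder. If only a small fraction of reads pass, that would be consistent with short alignments (around 20 to 24 residues in the lines above) scoring too low to be assigned, but that is only a guess until you compare against your actual settings.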