Latest Megan version blastx XML import failed (LR mode)

Hi Daniel,
I suspect there are bugs in the BLAST import module in both SR and LR mode: nothing is assigned to the tree. The blastx XML input should be fine; it has 1,808 alignments and contains all the standard information. I’ve also checked the older version 6.17.0 (built 7 Aug 2019), which uses the separate taxonomy/functional DBs, and it works without any problems!

Here’s the log:

Executing: show window=ImportBlast;
Executing: import blastFile=‘E:\4201.xml’ fastaFile=‘E:\4201.fa’ meganFile=‘E:\4201-1.rma6’ useCompression=false format=BlastXML mode=BlastX minScore=50.0 maxExpected=0.01 minPercentIdentity=0.0 topPercent=10.0 minSupportPercent=0.0 minSupport=1 lcaAlgorithm=longReads lcaCoveragePercent=80.0 minPercentReadToCover=0.0 minPercentReferenceToCover=0.0 minComplexity=0.0 useIdentityFilter=false readAssignmentMode=readCount fNames= longReads=true;
Executing: ‘import’‘blastFile’’=’‘E:\4201.xml’‘fastaFile’’=’‘E:\4201.fa’‘meganFile’’=’‘E:\4201-1.rma6’‘useCompression’’=’‘false’‘format’’=’‘BlastXML’‘mode’’=’‘BlastX’‘minScore’’=’‘50.0’‘maxExpected’’=’‘0.01’‘minPercentIdentity’’=’‘0.0’‘topPercent’’=’‘10.0’‘minSupportPercent’’=’‘0.0’‘minSupport’’=’‘1’‘lcaAlgorithm’’=’‘longReads’‘lcaCoveragePercent’’=’‘80.0’‘minPercentReadToCover’’=’‘0.0’‘minPercentReferenceToCover’’=’‘0.0’‘minComplexity’’=’‘0.0’‘useIdentityFilter’’=’‘false’‘readAssignmentMode’’=’‘readCount’‘fNames’’=’‘longReads’’=’‘true’;
Classifications: Taxonomy
Annotating RMA6 file using FAST mode (accession database and first accession per line)
Parsing file: E:\4201.xml
Total reads: 2,844
Alignments: 1,808
Initializing binning…
Using ‘Interval-Union-LCA’ algorithm (80.0 %) for binning: Taxonomy
Binning reads…
Total reads: 2,844
With hits: 480
Alignments: 1,808
Assig. Taxonomy: 0
Min-supp. changes: 0
Numb. Tax. classes: 2
Class. Taxonomy: 2
Info: Command completed (15s): ‘import’‘blastFile’’=’‘E:\4201.xml’‘fastaFile’’=’‘E:\4201.fa’'m…
Induced tree has 3 of 2,175,510 nodes
Induced tree has 3 of 2,175,510 nodes

Thank you: Balázs

Dear Balázs,

Could you please send me a small example file? I will look into this.
Daniel

Dear Daniel,

I’ve sent that file via e-mail!
Thank you & kind regards!

Balázs

I can’t find it in my inbox. Did you use daniel.huson@uni-tuebingen.de?

Dear Daniel,

Sorry, I sent it to the general megan@… box, so I’ve now resent it directly to you. Please check again.

Thank you:

Balázs

Sorry for the delay, I have taken a look at your file.

This is what an alignment looks like:

Is this the result of alignment against the NCBI-nr database? If it is not (and it doesn’t look like it is), then that would explain why MEGAN can’t map the reference sequences to taxa or functions.

Please consider using alignment against NCBI-nr (or a subset of that database). Perhaps use DIAMOND and output format 100.
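
To check what the reference names in a BLAST XML file look like, independent of MEGAN, the hits can be listed with a few lines of Python. This is only a sketch: the element names (`Hit_id`, `Hit_accession`, `Hit_def`) are those of the standard NCBI BLAST XML output, but the sample record, its accession and the function name `list_hits` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment in the NCBI BLAST XML layout;
# the accession and description are made up for illustration.
SAMPLE_XML = """<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>contig_1</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>EXAMPLE_0001.1</Hit_id>
          <Hit_def>hypothetical protein [Example organism]</Hit_def>
          <Hit_accession>EXAMPLE_0001</Hit_accession>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def list_hits(xml_text):
    """Return (Hit_id, Hit_accession, Hit_def) for every <Hit> element."""
    root = ET.fromstring(xml_text)
    return [
        (
            hit.findtext("Hit_id", default=""),
            hit.findtext("Hit_accession", default=""),
            hit.findtext("Hit_def", default=""),
        )
        for hit in root.iter("Hit")
    ]


# Print one tab-separated line per hit.
for hit_id, accession, definition in list_hits(SAMPLE_XML):
    print(hit_id, accession, definition, sep="\t")
```

For a real file, `ET.parse(path).getroot()` can be used in place of `ET.fromstring`. If the printed identifiers are not NCBI-nr accessions, a mapping file built for NCBI-nr would not be expected to recognise them.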

Hi Daniel,

I’m using the official NCBI blastx with a subset of the NR database to produce the standard BLAST XML output format (as you can see in the attachment). When I import that XML (with the query de novo contigs as “reads”) into the latest MEGAN versions with the unified “megan-map-Oct2019.db” mapping file, the output is empty: all of the contigs go to the no hits/not assigned bins.
We need to use blastx (and other remote homology search tools with BLAST-compatible XML output) because we’re digging for dark matter, and the DIAMOND aligner is not designed for this.
As I wrote, earlier versions of MEGAN (e.g. 6.17.0, built 7 Aug 2019, with the separate taxonomy/functional DBs) perform perfectly without any problems. With that older version I can import the XML/FASTA contigs smoothly and the taxonomic binning is pretty nice (as you can see in the figure below):

Thanks for any idea & help!

Bests: Balázs

Thank you for being so persistent… I have finally identified the bug and will upload a new release later today in which parsing of XML files and the use of a mapping DB file should play nicely together.
(The problem was that my mapping-db based parser only looks at the first word in a header line and I wasn’t putting the HitDef record at the beginning of the header line.)
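
The effect of a first-word lookup can be illustrated with a small Python sketch. This is not MEGAN's actual code (MEGAN is written in Java); the function names, the accession, the taxon id and the header layouts below are all hypothetical and only show why an accession that is not the first word of the header is never found.

```python
def first_word(header):
    """Return the first whitespace-separated token of a header line."""
    parts = header.split()
    return parts[0] if parts else ""


def lookup_taxon(header, accession_to_taxon):
    """Look up a taxon id using only the first word of the header."""
    return accession_to_taxon.get(first_word(header))


# Hypothetical mapping: accession -> taxon id (values are made up).
MAPPING = {"EXAMPLE_0001.1": 12345}

# Accession is the first word: the lookup succeeds.
good_header = "EXAMPLE_0001.1 hypothetical protein [Example organism]"

# Something else precedes the accession: the lookup finds nothing,
# so the read would end up unassigned.
bad_header = "hit_1 EXAMPLE_0001.1 hypothetical protein [Example organism]"

print(lookup_taxon(good_header, MAPPING))
print(lookup_taxon(bad_header, MAPPING))
```

With the second header every alignment is parsed, but none maps to a taxon, which is consistent with the log above reporting 1,808 alignments and "Assig. Taxonomy: 0".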

Thank you very much! It’ll be an enormous help for us in keeping up to date and efficient!

Best regards: B

Let me know whether the new version 6_18_6 does indeed fix the problem.

Problem solved! The latest version works pretty nicely! Thanks!

Good day

I have the same issue with the XML files as mentioned above: the file is empty after import. (I have a similar issue with meganized DAA files, where I get taxonomic data but no functional data.) I am also using the megan-map-Oct2019 file with version 6_18_11, available on the MEGAN6 download page. Where can I get version 6_18_6, since that version apparently solves the problem?

Regards
Sunette

Older versions are usually available on the same website. You can get an older version by editing the URL of the most recent version, for example by replacing 11 with 6 in the URL for the current download, as shown here:

Change
https://software-ab.informatik.uni-tuebingen.de/download/megan6/MEGAN_Community_macos_6_18_11.dmg
to
https://software-ab.informatik.uni-tuebingen.de/download/megan6/MEGAN_Community_macos_6_18_6.dmg
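
The same edit can be scripted. A minimal Python sketch, assuming the version string appears exactly once in the file name; the helper name `swap_version` is made up, and nothing here checks that the resulting file actually exists on the server.

```python
def swap_version(url, current, wanted):
    """Replace the version in a download URL, e.g. 6_18_11 -> 6_18_6."""
    # Match "_<version>." so that only the file-name part is touched.
    marker = "_" + current + "."
    if url.count(marker) != 1:
        raise ValueError("version string not found exactly once in URL")
    return url.replace(marker, "_" + wanted + ".")


current_url = (
    "https://software-ab.informatik.uni-tuebingen.de/download/megan6/"
    "MEGAN_Community_macos_6_18_11.dmg"
)
older_url = swap_version(current_url, "6_18_11", "6_18_6")
print(older_url)
```

The installer names for other operating systems are not shown in this thread, so the sketch only covers the macOS URL given above.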