Web-blast XML output, cannot assign taxon ID

Hi,

I’m trying to rerun a subset of extracted reads on the BLASTn website; these reads were previously identified with BLAST/MEGAN against a local database. I exported the web-blast output in XML format and used it as the input for MEGAN (second image below). I used the megan-nucl-map-Jul2020.db mapping file (which otherwise works for BLAST files generated locally).

While the import runs, nothing gets assigned. When I inspect the ‘Not Assigned’ node, the problem appears to be that MEGAN can’t parse the web-blast XML format to extract the taxonomic ID (see the first screen snip below). Any chance you could add a web-blast XML-specific input format to get this working?

I’ve attached examples of the XML format and the fasta file with these reads.

Fasta: Bos_extract.fas (36.5 KB)
XML file: https://www.dropbox.com/s/42uvx86dgd323zc/Bos_extract.xml?dl=0
Current rma6 file: https://www.dropbox.com/s/49949axdxsv0kza/Bos_extract-1.rma6?dl=0

Thanks for your help!

I can see why MEGAN’s “fast mode” doesn’t work here, as it currently expects the accession to be the first word of the reference sequence header line…

However, in this specific case, rather than using the mapping db, you can use “extended mode” and select “parse taxon names”, as that seems to work:

Alternatively, if you would rather use the mapping db, select it in the Import from Blast dialog, and then afterward select “extended mode”. This will find the accessions and make the assignments.

(Unfortunately, as the names suggest, fast mode is much faster than extended mode.)
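
If you want to check what the headers in a BLAST XML file look like before importing, here is a rough sketch that lists, for each hit, the first word of the hit header next to the separate accession field. It uses only the Python standard library and assumes the usual NCBI BLAST XML elements (Hit_id, Hit_accession, Iteration_query-def). The sample record and its accession are made up for illustration and are not taken from Bos_extract.xml, and the function name is hypothetical. It does not reproduce how MEGAN itself parses the file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment in the NCBI BLAST XML layout.
# The identifiers below are placeholders, not real accessions.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>read_1</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>gi|0000000|gb|XX000000.1|</Hit_id>
          <Hit_def>Bos taurus hypothetical example sequence</Hit_def>
          <Hit_accession>XX000000</Hit_accession>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def list_hit_accessions(xml_text):
    """Return (query, first word of Hit_id, Hit_accession) for every hit."""
    root = ET.fromstring(xml_text)
    rows = []
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def", default="")
        for hit in iteration.iter("Hit"):
            hit_id = hit.findtext("Hit_id", default="")
            words = hit_id.split()
            # First whitespace-delimited word of the hit header.
            first_word = words[0] if words else ""
            accession = hit.findtext("Hit_accession", default="")
            rows.append((query, first_word, accession))
    return rows


for query, first_word, accession in list_hit_accessions(SAMPLE_XML):
    print(query, first_word, accession, sep="\t")
```

If the first word is a compound identifier rather than a bare accession, as in this made-up record, that would be consistent with fast mode not finding it while extended mode does.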

Thanks! It’s working now 🙂