I am new to this program and am running it to look at the blastn results for my assembled contigs. In the Long read inspector, the table (read, length, assignment, %cover, and #alignments) is what I would like to export. However, I tried to find a way to export it, in vain. I think I might have missed something. Could anyone help me export this table?
It is currently not possible to export this to a file, but as a workaround, use “select all” to select all the rows, copy them, and then “paste” into a text editor.
I have just added a new File->Export->Selection… feature that can be used to export whatever is selected in the long read inspector to a file. It will be available with the next update later this week.
It works with the new version. Thank you so much for the new feature!
May I suggest another possible feature for the next version? In the table of the Long read inspector, it might be worthwhile to show the percent identity of the BLAST result that corresponds to the Assignment column (or of the result with the highest coverage). It could be a useful feature for users.
Thank you so much for developing this wonderful program!
I managed to cut and paste my long read inspector data into Notepad++. Could you explain what each number means? Matches is the number of alignments to the respective contig, but what does the bigger number mean?
The larger number in brackets is the “class size”. Its interpretation depends on which “read assignment mode” is set. In your case, this is set to “aligned bases”, so the number is the total number of aligned bases.
Hi Daniel,
Would one use the total number of aligned bases to see how much of a genome was recovered by X number of contigs?
I am wondering how I can use MEGAN to extract contigs for genome assembly. Is it best to extract all the contigs that align to a particular virus family, genus, or species, then download the reference genomes and run a genome alignment?
You can also set the read assignment mode to “bases”, which counts the bases assigned to a taxon, not just the aligned bases. We introduced the “number of aligned bases” for early versions of Nanopore sequencing, where reads sometimes appeared to contain large stretches of “garbage” bases… It seemed safer to only count bases that align to something…
I’m not sure I understand what you mean by extracting contigs for assembly. Do you mean long reads rather than contigs? Or do you want to try to assemble the (already assembled) contigs? Either way, MEGAN allows you to select nodes in the taxonomy view and then save all assigned reads/contigs to a file. You can use %t and %i in the supplied file name to put things into files whose names contain the taxon name (placeholder %t) and/or taxon id (placeholder %i).
I mean, can I export only the contigs from the viral families of interest if I want to do an alignment against a reference genome using bowtie (or some other alignment tool)? For example, if I have a lot of alignments to one particular virus, I would want to extract those contigs and align them to a reference genome.
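To make the difference between the modes concrete, here is a minimal Python sketch of how a class size could be computed under each one. It is only an illustration of the idea described above, not MEGAN's actual code: the `Read` class, the mode labels (`"reads"`, `"bases"`, `"aligned_bases"`), and the choice to count each read position at most once when alignments overlap are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Read:
    name: str
    length: int
    # Alignment intervals on the read: 0-based start, end-exclusive (assumed convention).
    alignments: list = field(default_factory=list)


def aligned_bases(read):
    """Count read positions covered by at least one alignment."""
    covered = 0
    last_end = 0
    for start, end in sorted(read.alignments):
        start = max(start, last_end)  # skip positions already counted
        if end > start:
            covered += end - start
            last_end = end
    return covered


def class_size(reads, mode):
    """Class size for the reads assigned to one taxon (hypothetical mode labels)."""
    if mode == "reads":
        return len(reads)
    if mode == "bases":
        return sum(r.length for r in reads)
    if mode == "aligned_bases":
        return sum(aligned_bases(r) for r in reads)
    raise ValueError(f"unknown mode: {mode}")


# Two made-up contigs assigned to the same taxon.
reads = [
    Read("contig_a", 1000, [(0, 300), (200, 500)]),
    Read("contig_b", 2000, [(100, 400)]),
]

for mode in ("reads", "bases", "aligned_bases"):
    print(mode, class_size(reads, mode))
```

With these made-up contigs, the three modes give 2, 3000 and 800, which shows why the number in brackets can be much larger than the number of matches.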
If you want to save the contigs assigned to a particular node, select the node in the main taxonomy viewer (or one of the other classification viewers) and then use the following menu item:
File->Extract Reads…
This will write all reads (or contigs) assigned to the selected node or nodes to text files. You can use the placeholders %f, %t and %i in the specified file name to write reads to different files. The placeholders are replaced by the input file name, class name and class id, respectively.
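As a rough illustration of how the placeholders behave, the following Python sketch expands a file name template the way the description above suggests. The function `expand_name` and the input name `sample1` are hypothetical; whether MEGAN strips file extensions or alters spaces in class names is not something this sketch knows, so treat the exact output as an assumption.

```python
import re


def expand_name(template, input_file, class_name, class_id):
    """Replace %f, %t and %i in a file name template (illustrative only)."""
    values = {"%f": input_file, "%t": class_name, "%i": str(class_id)}
    # One pass over the template, so substituted text is never re-expanded.
    return re.sub(r"%[fti]", lambda m: values[m.group(0)], template)


# Taxon name and id taken from the example table in this thread.
print(expand_name("%f-%t-%i.fasta", "sample1",
                  "Acanthocystis turfacea chlorella virus 1", 322019))
```

Using %t or %i in the name means each selected node ends up in its own file, which is convenient if you then want to align each group of contigs to a different reference genome.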
Hi Daniel,
Is there a way to export all the taxonomic information, via the command-line version of MEGAN or the Windows version, as a .tsv or .csv file similar to the one from DIAMOND? DIAMOND produces a perfect .tsv table from all my blastx hits against the NCBI database, but it did not provide order, subfamily, family, genus or species, so I ended up meganizing my DIAMOND blastx results as a .daa file and opening it in MEGAN. However, I still can’t figure out how to generate the same table as DIAMOND with all the taxonomic information from the NCBI database. Can MEGAN do this? Below is an example:
| qseqid | sseqid | pident | length | mismatch | evalue | bitscore | staxids | sscinames | sskingdoms | skingdoms | sphylums | stitle |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| tig00000086 | YP_001426684.1 | 100 | 40 | 0 | 4.50E-19 | 84.3 | 322019 | Acanthocystis turfacea chlorella virus 1 | Viruses | Bamfordvirae | Nucleocytoviricota | YP_001426684.1 ubiquitin family protein [Acanthocystis turfacea chlorella virus 1] |
Hi @Daniel, is it possible to get this information (see my example below), but with family, subfamily, genus and species added? Below is my output from DIAMOND, but DIAMOND does not report family, subfamily, genus and species yet. I think they are working on it. This is why I meganized my samples from DIAMOND as a .daa file: I saw that MEGAN gave me the family, subfamily, genus and species. However, I can’t get it to generate output like DIAMOND’s, a clean .tsv file that I can open in Excel and then use in RStudio for statistics. I have way too many blastx hits to add all the families by hand. I have >10k hits! If you could help me figure this out, I would be forever grateful!
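One possible workaround outside MEGAN, sketched here under several assumptions, is to look up each hit's staxids value in the NCBI taxonomy dump (the nodes.dmp and names.dmp files from NCBI's taxdump archive) and append the ranks as extra columns. The sketch assumes the DIAMOND table has a header row with a staxids column, takes only the first taxon id when several are listed separated by semicolons, and uses toy taxonomy entries with made-up ids and names so that it runs on its own. The function names are hypothetical, and this is not a MEGAN or DIAMOND feature.

```python
import csv
import io

RANKS = ["family", "subfamily", "genus", "species"]


def load_nodes(lines):
    """Parse nodes.dmp-style lines into {taxid: (parent_taxid, rank)}."""
    nodes = {}
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 3 and fields[0]:
            nodes[fields[0]] = (fields[1], fields[2])
    return nodes


def load_names(lines):
    """Parse names.dmp-style lines, keeping scientific names only."""
    names = {}
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 4 and fields[3] == "scientific name":
            names[fields[0]] = fields[1]
    return names


def lineage(taxid, nodes, names):
    """Walk from a taxon up to the root, collecting names at the wanted ranks."""
    found, seen = {}, set()
    while taxid in nodes and taxid not in seen:
        seen.add(taxid)
        parent, rank = nodes[taxid]
        if rank in RANKS:
            found[rank] = names.get(taxid, "")
        taxid = parent
    return found


def annotate(tsv_in, tsv_out, nodes, names, taxid_column="staxids"):
    """Copy a tab-separated table, appending one column per rank."""
    reader = csv.DictReader(tsv_in, delimiter="\t")
    writer = csv.DictWriter(tsv_out, fieldnames=reader.fieldnames + RANKS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        first_taxid = row[taxid_column].split(";")[0]  # simplification
        ranks = lineage(first_taxid, nodes, names)
        for rank in RANKS:
            row[rank] = ranks.get(rank, "")
        writer.writerow(row)


# Toy taxonomy with made-up ids and names, in the dump files' layout.
toy_nodes = ["1\t|\t1\t|\tno rank\t|", "10\t|\t1\t|\tfamily\t|",
             "100\t|\t10\t|\tgenus\t|", "1000\t|\t100\t|\tspecies\t|"]
toy_names = ["1\t|\troot\t|\t\t|\tscientific name\t|",
             "10\t|\tToyviridae\t|\t\t|\tscientific name\t|",
             "100\t|\tToyvirus\t|\t\t|\tscientific name\t|",
             "1000\t|\tToyvirus exampli\t|\t\t|\tscientific name\t|",
             "1000\t|\ttoy virus\t|\t\t|\tcommon name\t|"]

nodes, names = load_nodes(toy_nodes), load_names(toy_names)
out = io.StringIO()
annotate(io.StringIO("qseqid\tstaxids\ncontig1\t1000\n"), out, nodes, names)
print(out.getvalue())
```

For real data, the toy lists would be replaced by the opened nodes.dmp and names.dmp files, and the `io.StringIO` objects by the opened input and output .tsv files. The result should open directly in Excel or R.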