Pooled contigs and raw reads

Hi Megan community,

I have pooled contigs assembled with MEGAHIT and the raw reads for each sample. No taxonomy or functions have been annotated yet. Can I continue my analysis in MEGAN? Could you give me an idea of how to do that?

This is my current workflow:
I was told to map my reads to the contig file using bowtie2, then convert the output from .sam to .bam and index it (.bam.bai). For the pooled contigs, I was told to use DIAMOND to add function and taxonomy, obtain a .daa file, and meganize it. Where I get lost is how to import reads.bam, reads.bam.bai, and contigs.megan. Is this even possible?

Thank you.

Run the contigs through the DIAMOND+MEGAN pipeline to assign taxonomy and function to your contigs. You can’t use your bam files in this process.

Hi,

Looking at previous posts and the MEGAN paper, I want to make sure I understood something correctly. I have paired-end reads. Before the DIAMOND alignment, do I first need to create contigs with a program like metaSPAdes? Can I not run the DIAMOND alignment on the raw fastq.gz files I received from the sequencer?

Hi @jspychalla,

It's up to you. You can align your raw FASTQ files directly (QC is optional and depends on your dataset type) against the NCBI-nr database and then process the alignments with MEGAN. MEGAN will generate a count table, which you can explore within MEGAN itself or use for statistical analyses.

In the approach mentioned above, the authors assembled contigs from the FASTQ files and generated an abundance table by mapping the reads back to the contigs. You will get something like this:

Contig Sample1 Sample2 Sample3
contig1 0.5 0.1 0.4
contig2 0.5 0.5 0.7
contig3 0.4 0.8 0.9
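
One possible way to get from the bam files to a table like this is to parse the per-contig counts that `samtools idxstats` reports for each sample. The sketch below is only an illustration: the sample names and counts are made up, and dividing by each sample's total mapped reads is an assumed normalization, not necessarily the one used for the numbers above.

```python
import csv
import io

# Hypothetical `samtools idxstats` output per sample (tab-separated columns:
# reference name, reference length, mapped reads, unmapped reads).
IDXSTATS = {
    "Sample1": "contig1\t1000\t50\t0\ncontig2\t2000\t30\t0\n*\t0\t0\t20\n",
    "Sample2": "contig1\t1000\t10\t0\ncontig2\t2000\t90\t0\n*\t0\t0\t5\n",
}


def mapped_counts(idxstats_text):
    """Return {contig: mapped read count}, skipping the '*' row for unmapped reads."""
    counts = {}
    for row in csv.reader(io.StringIO(idxstats_text), delimiter="\t"):
        if not row or row[0] == "*":
            continue
        counts[row[0]] = int(row[2])
    return counts


def abundance_table(idxstats_by_sample):
    """Build {contig: {sample: fraction of that sample's mapped reads}}."""
    table = {}
    for sample, text in idxstats_by_sample.items():
        counts = mapped_counts(text)
        total = sum(counts.values())
        for contig, n in counts.items():
            # Assumed normalization: share of the sample's mapped reads.
            table.setdefault(contig, {})[sample] = n / total if total else 0.0
    return table


table = abundance_table(IDXSTATS)
for contig in sorted(table):
    print(contig, table[contig])
```

In practice the idxstats text would come from running `samtools idxstats` on each indexed bam file rather than from inline strings.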

However, these contigs have no taxonomic or functional information yet. To add it:

  1. Use the contig sequences from contig.fasta.
  2. Align the contigs against the NCBI-nr database.
  3. Process the alignment results using MEGAN for taxonomic and functional binning.
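
The steps above could be scripted roughly as follows. This is a sketch, not a tested pipeline: every file name is a placeholder, and the options shown should be checked against the help output of your installed DIAMOND and MEGAN tools versions. The script only prints the commands; running them is left commented out.

```python
import shlex

# Placeholder file names (assumptions, not paths from this thread).
CONTIGS = "contig.fasta"
NR_DB = "nr.dmnd"            # DIAMOND-formatted NCBI-nr database
DAA = "contigs.daa"
MAPPING_DB = "megan-map.db"  # MEGAN mapping database file

# Step 2: translated alignment of the contigs against nr.
# "-f 100" asks DIAMOND for DAA output, which MEGAN's tools read.
diamond_cmd = ["diamond", "blastx", "-d", NR_DB, "-q", CONTIGS, "-o", DAA, "-f", "100"]

# Step 3: meganize the DAA file so it carries taxonomic and functional binning.
meganize_cmd = ["daa-meganizer", "-i", DAA, "-mdb", MAPPING_DB]

for cmd in (diamond_cmd, meganize_cmd):
    print(shlex.join(cmd))
    # To actually run a step: subprocess.run(cmd, check=True)
```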

Using MEGAN’s daa2info tool, you can extract taxonomic and functional information for your contigs and merge it with your abundance table. This will allow you to explore the data in MEGAN or perform statistical analyses on it.
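
As a rough illustration of the merge step, the sketch below joins the example abundance table with a contig-to-taxon listing and sums abundances per taxon. It assumes the assignments have already been exported as a two-column, tab-separated file (contig, taxon); the exact layout of the daa2info output should be checked, and the taxon names here are invented placeholders.

```python
import csv
import io
from collections import defaultdict

# Abundance table from the example above, written tab-separated.
ABUNDANCE_TSV = (
    "Contig\tSample1\tSample2\tSample3\n"
    "contig1\t0.5\t0.1\t0.4\n"
    "contig2\t0.5\t0.5\t0.7\n"
    "contig3\t0.4\t0.8\t0.9\n"
)

# Hypothetical contig-to-taxon assignments (placeholder taxon names).
ASSIGNMENTS_TSV = "contig1\tTaxonA\ncontig2\tTaxonA\ncontig3\tTaxonB\n"


def load_assignments(text):
    """Return {contig: taxon} from two-column tab-separated text."""
    rows = csv.reader(io.StringIO(text), delimiter="\t")
    return {row[0]: row[1] for row in rows if len(row) >= 2}


def abundance_by_taxon(abundance_text, contig_to_taxon):
    """Sum per-sample abundances over all contigs assigned to the same taxon."""
    reader = csv.DictReader(io.StringIO(abundance_text), delimiter="\t")
    samples = [name for name in reader.fieldnames if name != "Contig"]
    totals = defaultdict(lambda: dict.fromkeys(samples, 0.0))
    for row in reader:
        # Contigs without an assignment are grouped under "Unassigned".
        taxon = contig_to_taxon.get(row["Contig"], "Unassigned")
        for sample in samples:
            totals[taxon][sample] += float(row[sample])
    return dict(totals)


taxon_table = abundance_by_taxon(ABUNDANCE_TSV, load_assignments(ASSIGNMENTS_TSV))
for taxon in sorted(taxon_table):
    print(taxon, taxon_table[taxon])
```

Summing per taxon is one choice among several; keeping one row per contig with the taxon as an extra column works just as well if the downstream statistics are done at the contig level.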

Let me know if you need further clarification or assistance!

Best regards,
Anupam