Hello,
I’ve just tested converting a SAM output from minimap2. I used sam2rma
with the following command:
sam2rma -i chunk0-1k.sam -r m64015_190924_232542.Q20.fasta_chunk_0000000-1k -lg -alg longReads -t 32 -mdb megan-nucl-map-May2020.db
These are PacBio HiFi reads that have been aligned to the NCBI nt database.
Here is the output on screen:
SAM2RMA6 - Computes a MEGAN RMA (.rma) file from a SAM (.sam) file that was created by DIAMOND or MALT
Options:
Input
--in: chunk0-1k.sam
--reads: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
Output
--out: chunk0-1k-TEST.rma
--useCompression: true
Reads
--paired: false
--pairedSuffixLength: 0
Parameters
--longReads: true
--maxMatchesPerRead: 100
--classify: true
--minScore: 50.0
--maxExpected: 0.01
--topPercent: 10.0
--minSupportPercent: 0.05
--minSupport: 0
--minPercentReadCover: 0.0
--minPercentReferenceCover: 0.0
--lcaAlgorithm: longReads
--lcaCoveragePercent: 100.0
--readAssignmentMode: alignedBases
Classification support:
--mapDB: /home/dportik/programs/megan/db/megan-nucl-map-May2020.db
Deprecated classification support:
--parseTaxonNames: true
--firstWordIsAccession: true
--accessionTags: gb| ref|
Other:
--threads: 32
--verbose: true
Version MEGAN Community Edition (version 6.19.4, built 16 Jul 2020)
Author(s) Daniel H. Huson
Copyright (C) 2020 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Loading ncbi.map: 2,259,889
Loading ncbi.tre: 2,259,893
Current SAM file: chunk0-1k.sam
Reads file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
Output file: chunk0-1k-TEST.rma
Classifications: Taxonomy
Generating RMA6 file Parsing matches
Annotating RMA6 file using FAST mode (accession database and first accession per line)
Parsing file chunk0-1k.sam
Parsing file: chunk0-1k.sam
Input domination filter: MinPercentCoverToStronglyDominate=90.0 and TopPercentScoreToStronglyDominate=90.0
WARNING: Failed to find read 'm64015_190924_232542/23/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/28/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/29/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/31/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/32/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/36/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/37/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/38/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/39/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/40/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/41/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/42/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/46/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/48/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/49/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/54/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/55/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/58/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/59/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/60/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/61/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/66/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/67/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/68/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/71/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/75/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/76/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/77/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/83/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/93/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/98/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/102/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/103/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/104/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/105/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/106/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/119/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/122/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/123/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/131/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/132/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/134/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/136/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/144/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/147/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/149/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/150/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/160/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/163/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
WARNING: Failed to find read 'm64015_190924_232542/165/ccs' in file: m64015_190924_232542.Q20.fasta_chunk_0000000-1k
No further 'failed to find read' warnings...
10% 20% 30% 40% 50% 60% 70% 80% 90% 100% (3.0s)
Total reads: 590
Alignments: 630
100% (0.0s)
Binning reads: Initializing...
Initializing binning...
Using 'Interval-Union-LCA' algorithm (100.0 %) for binning: Taxonomy
Binning reads...
Binning reads: Analyzing alignments
Total reads: 590
Total weight: 1,292,223
With hits: 293
Alignments: 630
Assig. Taxonomy: 286
MinSupport set to: 646
Binning reads: Applying min-support & disabled filter to Taxonomy...
Min-supp. changes: 9
Binning reads: Writing classification tables
Numb. Tax. classes: 33
Binning reads: Syncing
Class. Taxonomy: 33
100% (1.9s)
Total time: 12s
Peak memory: 3.1 of 97.7 G
I saw in other posts that this warning can appear if the alignments are not in the same order as the reads. I’ve checked, and the alignments appear in the same order as the reads and have the same labels. What seems odd is that all the initial alignments are ignored up to a certain point, after which sam2rma seems to find the correct alignment-to-read pairs again.
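For anyone who wants to repeat the order check, here is a minimal sketch in Python. It is not the exact procedure used above, and it rests on an assumption: that a read's label is the first whitespace-delimited token of its FASTA header and that this is compared against the SAM QNAME column. The function names are hypothetical helpers, not part of MEGAN or minimap2.

```python
def fasta_read_names(lines):
    """Yield the first whitespace-delimited token of each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def sam_query_names(lines):
    """Yield QNAMEs from SAM alignment lines, collapsing consecutive repeats.

    Header lines (starting with '@') and blank lines are skipped.
    """
    previous = None
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        name = line.split("\t", 1)[0].strip()
        if name != previous:
            yield name
            previous = name


def compare_order(fasta_lines, sam_lines):
    """Return (missing, out_of_order) lists of SAM read names.

    missing: names in the SAM input with no matching FASTA header.
    out_of_order: names that appear earlier in the FASTA than a read
    already seen in the SAM input.
    """
    position = {name: i for i, name in enumerate(fasta_read_names(fasta_lines))}
    missing, out_of_order = [], []
    last_index = -1
    for name in sam_query_names(sam_lines):
        index = position.get(name)
        if index is None:
            missing.append(name)
        elif index < last_index:
            out_of_order.append(name)
        else:
            last_index = index
    return missing, out_of_order
```

Both arguments can be open file handles, for example the reads file and the SAM file from the command above. Two empty lists would mean that every SAM read name was found in the reads file and in the same relative order, under the label-matching assumption stated above.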
I include the full reads, truncated SAM, and output RMA here:
m64015_190924_232542.Q20.fasta_chunk_0000000-1k (59.2 KB)
chunk0-1k-truncated.sam (1.2 MB)
chunk0-1k-TEST.rma (1.7 MB)
Is this possibly a bug in sam2rma, or is there an issue with my input files?
Thanks,
Dan