Sam2rma - default value of readAssignmentMode?

dportik · August 17, 2020, 11:56pm

Hi Daniel,
I have been creating RMA files successfully with sam2rma, and it is handling very large datasets quite well. Thanks for providing this alongside MEGAN.

I think there may be a misspecification of the default setting for --readAssignmentMode:

-ram, --readAssignmentMode [string] Set the read assignment mode. Default value: readCount Legal values: readCount, readLength, alignedBases, readMagnitude

After making RMA files using the default for this flag, I noticed that total aligned bases are showing up in the plots, rather than read counts. Checking the log files, I see:

SAM2RMA6 - Computes a MEGAN RMA (.rma) file from a SAM (.sam) file that was created by DIAMOND or MALT
Options:
Input
--in: 1103_V2_D1.merged.sam
--reads: 1103_V2_D1.sorted.fasta
Output
--out: 1103_V2_D1.rma
--useCompression: true
Reads
--paired: false
--pairedSuffixLength: 0
Parameters
--longReads: true
--maxMatchesPerRead: 100
--classify: true
--minScore: 50.0
--maxExpected: 0.01
--topPercent: 10.0
--minSupportPercent: 0.05
--minSupport: 0
--minPercentReadCover: 0.0
--minPercentReferenceCover: 0.0
--lcaAlgorithm: longReads
--lcaCoveragePercent: 100.0
--readAssignmentMode: alignedBases
Classification support:
--mapDB: megan-nucl-map-Jul2020.db
Deprecated classification support:
--parseTaxonNames: true
--firstWordIsAccession: true
--accessionTags: gb| ref|
Other:
--threads: 32
--verbose: true
Version MEGAN Community Edition (version 6.19.4, built 16 Jul 2020)

The general command I’ve been using is:

sam2rma -i SAM -r READS -o OUTPUT -lg -alg longReads -t 32 -mdb DATABASE -v 2> LOG

The log file seems to indicate that the default is actually alignedBases. I just wanted to check whether this is the expected behavior, or whether something about my SAM files caused the default to switch.

I can certainly re-make the RMA files using -ram readCount. However, each RMA file can take several hours to create, so knowing the actual default would save some time moving forward.

Thanks,
Dan

Daniel · August 27, 2020, 6:00am

The usage message is misleading and I will fix it.

The default value depends on the read mode:

read counts (readCount) in short read mode
aligned bases (alignedBases) in long read mode (-lg)
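
Since the effective mode is written to the log, it can be checked after a run without opening the RMA file. The following is only a minimal sketch: the function name ram_from_log is made up for illustration, and it assumes the log contains a "readAssignmentMode: VALUE" line like the one in the log quoted above.

```shell
# Print the read assignment mode recorded in a sam2rma log file.
# Assumption: the log has a line of the form "--readAssignmentMode: VALUE".
# ram_from_log is a hypothetical helper, not part of MEGAN.
ram_from_log() {
    awk -F': *' '/readAssignmentMode:/ {
        gsub(/[[:space:]]+$/, "", $2)   # drop trailing whitespace
        print $2
        exit                            # first match is enough
    }' "$1"
}

# Usage, where LOG is the file that received stderr from the sam2rma run:
#   ram_from_log LOG
#
# To get read counts in long read mode, pass the mode explicitly
# (same command as in the question, with -ram added):
#   sam2rma -i SAM -r READS -o OUTPUT -lg -alg longReads -ram readCount -t 32 -mdb DATABASE -v 2> LOG
```

Setting -ram explicitly on every run avoids relying on a default that differs between short read and long read mode.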