I am currently working on an ancient DNA project, trying to determine whether there is any coral/symbiont DNA in data from coral reef cores. I am also interested in identifying the reads that are not coral/symbiont reads. To do this, I have built a MALT database containing ~1,600 genomes from invertebrates, bacteria, protozoa, and archaea. Ideally, I will pipe the MALT output into HOPS to characterize the degradation patterns of the reads and see which are truly ancient. However, I am having several issues:
- MALT keeps running into memory errors when aligning my read files. The read files are large, so I split them into 5 smaller subsets, but I still get memory errors. I suspect the database itself is taking up most of the Java memory, though I am not sure. If so, is there any way to fix this? If not, can you recommend other software that could handle this kind of data? (I’ve considered minimap2, but it doesn’t seem ideal for short, potentially damaged reads.) I’ve already allotted MALT the maximum memory possible during the program’s installation.
- A large portion of my reads are returning ‘no hits’ or ‘not assigned’. What is the best way to loosen the thresholds so that even poor alignments are reported? Currently I have min Support Percent set to 0.
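
To illustrate the subsetting mentioned in the first point, here is a minimal Python sketch of splitting a read file into 5 parts. It assumes uncompressed FASTQ with exactly four lines per record. The function names `assign_round_robin` and `split_fastq`, the output file naming, and the round-robin assignment are all illustrative assumptions, not necessarily how the files were actually split.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple


def assign_round_robin(
    lines: Iterable[str], n_parts: int
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (part_index, record) pairs, cycling through the parts.

    Assumes four-line FASTQ records: header, sequence, '+', qualities.
    """
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    it = iter(lines)
    index = 0
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("input ended in the middle of a FASTQ record")
        yield index % n_parts, record
        index += 1


def split_fastq(in_path: str, out_prefix: str, n_parts: int = 5) -> List[str]:
    """Stream in_path into n_parts files (hypothetical naming scheme)."""
    out_paths = [f"{out_prefix}.part{i}.fastq" for i in range(n_parts)]
    outs = [open(path, "w") for path in out_paths]
    try:
        with open(in_path) as handle:
            for part, record in assign_round_robin(handle, n_parts):
                outs[part].writelines(record)
    finally:
        for out in outs:
            out.close()
    return out_paths
```

Splitting this way only shrinks the input per run, not the database index, which may be why the errors persist after subsetting.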
Thank you for any feedback!