Thanks very much for your earlier response and for preparing those updated synonym files. I have been using them lately but have run into an issue that may be related to them. To recap, I am using MEGAN 6.13.1 on Windows 7. As advised, I downloaded the most recent SILVA database files for SSU and LSU (SILVA_132_SSURef_Nr99_tax_silva.fasta.gz and SILVA_132_LSURef_tax_silva.fasta.gz) and their respective synonym files (SSURef_Nr99_132_tax_silva_to_NCBI_synonyms.map.gz and LSURef_132_tax_silva_to_NCBI_synonyms.map.gz). The issue is that, when I run blastn on our samples, MEGAN assigns reads at much lower (i.e. more specific) taxonomic levels than I would expect (LCA settings: min score 100, top percent 10, min support 5).
To give one example, we have matches to polar bear at the species level and, you can take my word for it, there are no polar bears around here. Taking one of these reads and running BLAST against the whole of GenBank reveals a good match to sheep, which makes more sense. There are other similar examples too. Could this be a result of limitations in the database content, or something else? I would have expected sheep sequences to be in the LSU database.
What might also be relevant is that the %identity of the matches is low for the assigned taxonomic level (e.g. 88% or 92% at species level). I have tried using the 16S filter (more than 99% identity for species level) but this makes no difference to the final result: the assignment of the reads does not change after applying the filter. Could this be a problem with the synonym files, or is there a bug in the filter? If I use the older “silva2ncbi.map” file, there appear to be fewer incorrectly assigned reads. However, the 16S filter does not seem to work with either the older or the newer synonym files.
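In case it helps to check the hits outside MEGAN, here is a minimal Python sketch of how I understand the settings above. It assumes standard BLAST tabular output (-outfmt 6), where column 3 is %identity and column 12 is the bit score. The function name, the subject names and the example lines are made up for illustration, and this is only my reading of "min score", "top percent" and the 99% species threshold, not MEGAN's actual implementation.

```python
from collections import defaultdict


def summarize_hits(lines, min_score=100.0, top_percent=10.0, species_identity=99.0):
    """Summarize BLAST outfmt 6 hits per read (hypothetical helper, not MEGAN code)."""
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident, bitscore = float(fields[2]), float(fields[11])
        # Assumed meaning of "min score": drop hits below this bit score.
        if bitscore >= min_score:
            hits[query].append((subject, pident, bitscore))

    summary = {}
    for query, query_hits in hits.items():
        best = max(score for _, _, score in query_hits)
        # Assumed meaning of "top percent": keep hits within N% of the best bit score.
        cutoff = best * (1.0 - top_percent / 100.0)
        kept = [hit for hit in query_hits if hit[2] >= cutoff]
        max_identity = max(pident for _, pident, _ in kept)
        summary[query] = {
            "subjects": [subject for subject, _, _ in kept],
            "max_identity": max_identity,
            # True only if some retained hit reaches the species-level identity threshold.
            "species_ok": max_identity >= species_identity,
        }
    return summary


# Made-up example lines in outfmt 6 column order.
example = [
    "read1\tpolar_bear_LSU\t88.0\t250\t30\t0\t1\t250\t1\t250\t1e-80\t300",
    "read1\tbrown_bear_LSU\t87.5\t250\t31\t0\t1\t250\t1\t250\t1e-78\t290",
    "read1\tdistant_hit\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-30\t120",
]
result = summarize_hits(example)
print(result["read1"])
```

With my real data, reads like the polar bear one come out with "species_ok" as false under this check, which is why I would have expected the 16S filter to push them to a higher taxonomic level.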
Please let me know if you need additional info to assist your enquiries. Many thanks for any effort you make on this problem. Hopefully you can help.