Hi,
I am trying to use the read-extractor tool in a custom script to extract the reads assigned to a specific taxon.
I found that if the taxon name contains a space, the name is split into separate “words”. For example, if I want to extract the reads classified as “Weivirus-like virus sp.”:
/home/human/megan/tools/read-extractor -v -i Pool1-bats_merged_ok_v2.rma -c Taxonomy -n Weivirus-like virus sp. -b -o ./families/Pool1-families/Pool1-contigs-Weivirus-like-sp.fasta
ReadExtractorTool - Extracts reads from a DAA or RMA file by classification
Options:
Input and Output
--input: Pool1-bats_merged_ok_v2.rma
--output: ./families/Pool1-families/Pool1-contigs-Weivirus-like-sp.fasta
Options
--frameShiftCorrect: false
--classification: Taxonomy
--classNames: Weivirus-like virus sp.
--allBelow: true
--all: false
Other:
--ignoreExceptions: false
--gzipOutputFiles: true
--propertiesFile: /home/human/.MEGAN.def
--verbose: true
Version MEGAN Community Edition (version 6.25.3, built 15 Sep 2023)
Author(s) Daniel H. Huson
Copyright (C) 2023 Daniel H. Huson. This program comes with ABSOLUTELY NO WARRANTY.
Java version: 20.0.2; max memory: 7.8G
Loading ncbi.map: 2,396,736
Loading ncbi.tre: 2,396,740
Warning: unknown class: 'Weivirus-like'
Warning: unknown class: 'virus'
Warning: unknown class: 'sp.'
Processing file: Pool1-bats_merged_ok_v2.rma
Extracting by Taxonomy
Writing to: ./families/Pool1-families/Pool1-contigs-Weivirus-like-sp.fasta
100% (0.0s)
Reads extracted: 0
Total time: 4.1s
Peak memory: 0 of 7.8G
I get the same output if I wrap the taxonomy name in " " or ' ' (e.g.: /home/human/megan/tools/read-extractor -v -i Pool1-bats_merged_ok_v2.rma -c Taxonomy -n "Weivirus-like virus sp." -b -o ./families/Pool1-families/Pool1-contigs-Weivirus-like-sp.fasta)
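To illustrate what I expected the quotes to do, here is a minimal Python sketch using `shlex.split`, which tokenizes a string with POSIX-shell-like rules. The shortened command strings and the helper `class_name_args` are my own illustration and not part of MEGAN. It only shows how a shell would hand over the arguments; it says nothing about what read-extractor or its launcher script does with them afterwards.

```python
import shlex

# Hypothetical, shortened command lines (paths and file names trimmed for brevity).
unquoted = "read-extractor -c Taxonomy -n Weivirus-like virus sp. -b"
quoted = 'read-extractor -c Taxonomy -n "Weivirus-like virus sp." -b'


def class_name_args(command):
    """Return the tokens that follow -n, up to the next option flag."""
    tokens = shlex.split(command)  # POSIX-like tokenization, honours quotes
    start = tokens.index("-n") + 1
    end = start
    while end < len(tokens) and not tokens[end].startswith("-"):
        end += 1
    return tokens[start:end]


unquoted_names = class_name_args(unquoted)
quoted_names = class_name_args(quoted)

print(unquoted_names)  # three separate tokens
print(quoted_names)    # one token containing the spaces
```

So with quotes the shell itself should deliver the name as one argument, which is why I suspect the splitting happens somewhere after the shell.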
Is there a way to fix this? I need the tool to read the taxonomy name as a single name instead of splitting it at the spaces.
Thanks in advance!