NCBI BLAST nt database configuration with dadasnake: Example config.yaml files for use with BLAST #35

Hi Anna and coauthors, thanks in advance for any advice. I really like the pipeline and could use some help getting it to work with BLAST and NCBI's nt database. I am having trouble finding the correct config settings for using the NCBI nt database and taxdb as reference databases for COI.

What are the appropriate config parameters for using NCBI's nt database and taxonomy (taxdb) as the reference for a marker like COI?
Could you provide an example config.yaml file that uses the BLAST nt database as the reference db?

I am able to run the pipeline, but I am getting errors at the blastn_cluster step. Specifically, the name of the BLAST database is 'nt', but because the NCBI nt database is so big, there is no single file named 'nt', only many files named nt.XXX. The error is reported in logs/blastn_cluster.log. The issue appears to be with the makeblastdb step in blastn_cluster, even though the database is already built and in a local directory. I have the NCBI nt and taxdump databases installed locally, following the installation instructions from BASTA as linked in the dadasnake installation instructions.


#Here are the errors I'm getting. 
# BLAST options error: File /home/jwhitney/dadasnake/DBs/blastdbs/nt does not exist. 

#     log: logs/blastn_cluster.log (check log file(s) for error message)
#     conda-env: /home/jwhitney/programs/dadasnake/conda/66132e6a149ec730ec4c2d24861f8d4c
#     shell:
       
#             if [ -s clusteredTables/consensus.fasta ]; then
#               if [ ! -f "/home/jwhitney/dadasnake/DBs/blastdbs/nt.nin" ]
#                 then
#                 makeblastdb -dbtype nucl -in /home/jwhitney/dadasnake/DBs/blastdbs/nt                  -out /home/jwhitney/dadasnake/DBs/blastdbs/nt &> logs/blastn_cluster.log
#               fi
#               blastn -db /home/jwhitney/dadasnake/DBs/blastdbs/nt                -query clusteredTables/consensus.fasta -outfmt "6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore staxids stitle" -out clusteredTables/blast_results.tsv -max_target_seqs 10 &>> logs/blastn_cluster.log
#             else
#               touch clusteredTables/blast_results.tsv
#             fi
           
#         (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
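
To illustrate what I think is going wrong: the rule above only tests for nt.nin, which exists for a single-volume database. As far as I understand, the pre-formatted nt download is split into volumes (nt.00.nin, nt.01.nin, ...) that are tied together by an alias file, nt.nal, so the test fails and makeblastdb is run on a FASTA file that is not there. Below is a minimal sketch of a check that accepts both layouts. The function name blast_db_exists is my own (hypothetical, not part of dadasnake), and I am assuming the alias file sits next to the volumes under the same prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of dadasnake: succeed if a nucleotide
# BLAST database with the given path prefix appears to be present.
blast_db_exists() {
  local prefix="$1"
  # A single-volume database has an index file <prefix>.nin.
  # A multi-volume database (assumed here for the pre-formatted nt)
  # has an alias file <prefix>.nal listing <prefix>.00, <prefix>.01, ...
  [ -f "${prefix}.nin" ] || [ -f "${prefix}.nal" ]
}

# Example use: report whether makeblastdb would be needed at all.
# The default path is the one from my setup; pass another prefix as $1.
db="${1:-/home/jwhitney/dadasnake/DBs/blastdbs/nt}"
if blast_db_exists "$db"; then
  echo "found BLAST database: $db"
else
  echo "no BLAST database at: $db"
fi
```

Outside the pipeline, running blastdbcmd -db /home/jwhitney/dadasnake/DBs/blastdbs/nt -info should also show whether BLAST itself can resolve the 'nt' prefix.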

----------------
And here are the relevant parts of the config.yaml
# SETTINGS FOR TAXONOMIC ANNOTATION
taxonomy:
  dada:
    do: TRUE

# classification is only done, if do_taxonomy is true
taxonomy:
  mothur:
    do: FALSE
    db_path: "/home/jwhitney/.basta/taxonomy"
    tax_db: ""

blast:
  do: true
# blast is only done, if do_taxonomy is true
  run_on:
    - ASV
    - cluster
  db_path: "/home/jwhitney/dadasnake/DBs/blastdbs"
  tax_db: "nt"
  e_val: 0.01
  tax2id: ""
  all: true
  max_targets: 10
  run_basta: true
  basta_db: "/home/jwhitney/.basta/taxonomy"
  basta_e_val: 0.00001
  basta_alen: 100
  basta_number: 0
  basta_min: 3
  basta_id: 80
  basta_besthit: true
  basta_perchits: 99

-----------
Thanks in advance for any advice. 



