Executes a sequence similarity search using DIAMOND, or using BLAST with a blastn, blastp, or tblastx algorithm.

Usage

run_blast(
  query,
  subject,
  out = NULL,
  tool_path = NULL,
  evalue = "1E-3",
  max_target_seqs = 10000,
  num_threads = 1,
  always_make_db = FALSE,
  verbose = FALSE,
  algorithm = "blastp"
)

run_diamond(
  query,
  subject,
  out = NULL,
  sensitivity = "default",
  iterate = FALSE,
  tool_path = NULL,
  evalue = "1E-3",
  max_target_seqs = 10000,
  num_threads = 1,
  always_make_db = FALSE,
  verbose = FALSE
)

Arguments

query

Path to a FASTA file containing sequences to query.

subject

Path to a FASTA file or database. If a FASTA file is given, a database will be created first, in the same folder as out. Database paths should be given without the file extension.

out

Output file or folder path. If the path is an existing folder, a file name for the hits will be generated based on the query and subject file names.

tool_path

Path to the folder containing the blast+ (for run_blast) or DIAMOND (for run_diamond) executables. Only necessary if they are not in the $PATH variable.

evalue

Expectation value (E) threshold for hits.

max_target_seqs

The maximum number of hits per query.

num_threads

Number of threads (CPUs) to use in the DIAMOND/BLAST search.

always_make_db

Logical. If TRUE, always treats subject as a FASTA file and makes a database from it first. If FALSE, checks for the presence of a database first, and only creates one when necessary.

verbose

Logical. If TRUE, reports timings when it starts making the subject database and when it executes the actual DIAMOND/BLAST command.

algorithm

Choice of BLAST algorithm, one of: "blastp", "blastp-fast", "blastp-short", "blastn", "blastn-short", "megablast", "dc-megablast", or "tblastx".

sensitivity

Choice of the sensitivity option used by DIAMOND, one of: "fast", "default", "mid-sensitive", "sensitive", "more-sensitive", "very-sensitive", or "ultra-sensitive".

iterate

Logical. If TRUE, DIAMOND will start looking for hits using the fast sensitivity setting, increasing the sensitivity if no hits are found until the target sensitivity is reached. Also causes it to stop searching after the first hit is found for each query.

Value

Returns nothing (invisible NULL).

Details

These functions require that the command-line implementation of the chosen tool is installed. In the case of BLAST, this is blast+. Every sequence in the FASTA file provided by query is used as a query and aligned against the reference database provided by subject.
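
Before calling either function, it can help to confirm that the executables are actually discoverable. The following is a minimal sketch using base R's Sys.which(); the helper name tool_available() is hypothetical and not part of this package.

```r
# Hypothetical helper (not part of this package): checks whether an
# executable can be found, either on the PATH or in a given folder.
tool_available <- function(tool, tool_path = NULL) {
  if (is.null(tool_path)) {
    # Sys.which() returns "" when the executable is not on the PATH
    return(unname(nzchar(Sys.which(tool))))
  }
  # Otherwise look for the executable inside the supplied folder
  # (on Windows the file would typically carry an .exe extension)
  candidates <- file.path(tool_path, c(tool, paste0(tool, ".exe")))
  any(file.exists(candidates))
}

tool_available("blastp")   # TRUE only if blast+ is on the PATH
tool_available("diamond")  # TRUE only if DIAMOND is on the PATH
```

If the check fails for a tool that is installed, passing its folder as tool_path is the documented way to point the functions at it.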

evalue, max_target_seqs, and num_threads are all passed directly to DIAMOND/BLAST as arguments. See the documentation of the respective tool for further details.
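
To make this mapping concrete, the sketch below assembles the kind of argument vector these options correspond to on the command line. The flags shown are the tools' own options (-evalue, -max_target_seqs, -num_threads for blast+; --evalue, --max-target-seqs, --threads for DIAMOND). The helper names, the choice of a protein search for DIAMOND, and the exact way this package builds its calls are assumptions made for illustration only.

```r
# Hypothetical helpers (not part of this package) illustrating how the
# shared options line up with blast+ and DIAMOND command-line flags.
blast_args <- function(query, db, out, evalue = "1E-3",
                       max_target_seqs = 10000, num_threads = 1) {
  c("-query", query, "-db", db, "-out", out,
    "-evalue", evalue,
    # format() avoids scientific notation such as "1e+05"
    "-max_target_seqs", format(max_target_seqs, scientific = FALSE),
    "-num_threads", as.character(num_threads))
}

# Assumes a protein search ("diamond blastp"); this is an assumption.
diamond_args <- function(query, db, out, evalue = "1E-3",
                         max_target_seqs = 10000, num_threads = 1,
                         sensitivity = "default") {
  args <- c("blastp", "--query", query, "--db", db, "--out", out,
            "--evalue", evalue,
            "--max-target-seqs", format(max_target_seqs, scientific = FALSE),
            "--threads", as.character(num_threads))
  # DIAMOND selects a sensitivity mode with a flag such as
  # --very-sensitive; the default mode needs no flag.
  if (sensitivity != "default") args <- c(args, paste0("--", sensitivity))
  args
}

# Print the resulting command lines without running anything
cat("blastp", blast_args("query.fa", "subject_db", "hits.tsv"), "\n")
cat("diamond", diamond_args("query.fa", "subject_db", "hits.tsv",
                            sensitivity = "very-sensitive"), "\n")
```

Building the arguments as a character vector, rather than one pasted string, keeps each value separate, which is the form system2() expects for its args parameter.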

If out is left as NULL, the current working directory will be used instead. The file name will be generated based on the query and subject file names.
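
The way out is resolved can be pictured roughly as follows. This is only a sketch: the helper resolve_out() is hypothetical, and the generated name pattern ("<query>_vs_<subject>.tsv") is an illustrative assumption, not the name the package actually produces.

```r
# Hypothetical helper (not part of this package): decides where hits
# would be written, following the rules described for `out`.
resolve_out <- function(query, subject, out = NULL) {
  # NULL falls back to the current working directory
  if (is.null(out)) out <- getwd()
  if (dir.exists(out)) {
    # Existing folder: derive a file name from query and subject.
    # The "<query>_vs_<subject>.tsv" pattern is an assumption.
    stem <- function(p) tools::file_path_sans_ext(basename(p))
    return(file.path(out, paste0(stem(query), "_vs_", stem(subject), ".tsv")))
  }
  # Anything else is treated as the output file path itself
  out
}

# Generated file name inside the working directory
resolve_out("query.fa", "subject.fa")
# Used as given, unless a folder of that name already exists
resolve_out("query.fa", "subject.fa", "query_subject")
```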

Author

Mike Puijk

Examples

if (FALSE) { # \dontrun{
## Running BLAST with the blastp algorithm
run_blast(query = "query.fa",
          subject = "subject.fa",
          out = "query_subject",
          algorithm = "blastp",
          tool_path = "path/to/blast/executable")

## Running DIAMOND with the "very-sensitive" option
run_diamond(query = "query.fa",
            subject = "subject.fa",
            out = "query_subject",
            sensitivity = "very-sensitive",
            tool_path = "path/to/diamond/executable")
} # }