Perform sequence alignments using DIAMOND or BLAST
run_sequence_alignment.Rd
Runs a sequence alignment with DIAMOND, or with BLAST using one of the blastn, blastp, or tblastx algorithms.
Usage
run_blast(
  query,
  subject,
  out = NULL,
  tool_path = NULL,
  evalue = "1E-3",
  max_target_seqs = 10000,
  num_threads = 1,
  always_make_db = FALSE,
  verbose = FALSE,
  algorithm = "blastp"
)
run_diamond(
  query,
  subject,
  out = NULL,
  sensitivity = "default",
  iterate = FALSE,
  tool_path = NULL,
  evalue = "1E-3",
  max_target_seqs = 10000,
  num_threads = 1,
  always_make_db = FALSE,
  verbose = FALSE
)
Arguments
- query
Path to a FASTA file containing sequences to query.
- subject
Path to a FASTA file or database. If a FASTA file is given, a database is created first, in the same folder as out. Paths to databases should be given without the file extension.
- out
Output file or folder path. If the path is an existing folder, a file name based on the query and subject file names is generated, and the hits are saved to that file.
- tool_path
Path to the folder containing the blast+ or DIAMOND executables. Only necessary if they are not in the $PATH variable.
- evalue
Expectation value (E) threshold for hits.
- max_target_seqs
The maximum number of hits per query.
- num_threads
Number of threads (CPUs) to use in the DIAMOND/BLAST search.
- always_make_db
Logical. If TRUE, always treats subject as a FASTA file and makes a database from it first. If FALSE, checks for the presence of a database first and only creates one when necessary.
- verbose
Logical. If TRUE, reports timings when it starts making the subject database and when it executes the actual DIAMOND/BLAST command.
- algorithm
Choice of BLAST algorithm, one of: "blastp", "blastp-fast", "blastp-short", "blastn", "blastn-short", "megablast", "dc-megablast", or "tblastx".
- sensitivity
Choice of the sensitivity option used by DIAMOND, one of: "fast", "default", "mid-sensitive", "sensitive", "more-sensitive", "very-sensitive", or "ultra-sensitive".
- iterate
Logical. If TRUE, DIAMOND starts looking for hits using the fast sensitivity setting and increases the sensitivity whenever no hits are found, until the target sensitivity is reached. This also causes it to stop searching after the first hit is found for each query.
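The check that always_make_db = FALSE performs is not spelled out on this page. As a rough sketch of what such a check could look like, the helper below tests for the files that DIAMOND (.dmnd) and BLAST (.pin for protein, .nin for nucleotide indexes) write next to a database path given without its extension. The name db_exists is hypothetical and not part of this package, and the package's own check may differ.

```r
# Hypothetical helper (not part of this package): guess whether `subject`
# already points at a database, given the path without its file extension.
db_exists <- function(subject, tool = c("diamond", "blast")) {
  tool <- match.arg(tool)
  extensions <- switch(
    tool,
    diamond = ".dmnd",           # file written by `diamond makedb`
    blast   = c(".pin", ".nin")  # protein / nucleotide index files
  )
  # TRUE if at least one of the expected database files is present
  any(file.exists(paste0(subject, extensions)))
}

# Example: decide up front whether a database would have to be built
needs_db <- !db_exists("subject", tool = "diamond")
```

Multi-volume BLAST databases use additional alias files, which this sketch does not cover.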
Details
These functions require that the command-line implementation of the chosen tool is installed. In the case of BLAST, this is blast+. Every sequence in the FASTA file given by query is used as a query and aligned against the reference database given by subject.
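Because the search depends on an external program, it can help to confirm that the executable is visible to R before starting a long run. A minimal sketch using base R's Sys.which() follows. The helper name find_tool is hypothetical and not part of this package.

```r
# Hypothetical helper (not part of this package): locate an executable,
# either in a user-supplied folder or on the system PATH.
find_tool <- function(tool, tool_path = NULL) {
  if (!is.null(tool_path)) {
    # Note: on Windows the file name would also need the .exe suffix
    candidate <- file.path(tool_path, tool)
    return(if (file.exists(candidate)) candidate else "")
  }
  # Sys.which() typically returns "" when the program is not on the PATH
  unname(Sys.which(tool))
}

diamond_path <- find_tool("diamond")
if (!nzchar(diamond_path)) {
  message("DIAMOND was not found on the PATH; consider setting tool_path.")
}
```

An empty string signals that tool_path should point at the folder holding the executables.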
evalue, max_target_seqs, and num_threads are all passed directly to DIAMOND/BLAST as arguments. See the documentation of the respective tool for further details.
If out is left as NULL, the current working directory is used instead, and the file name is generated from the query and subject file names.
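The exact naming scheme is not documented on this page. The sketch below illustrates one way the out argument could be resolved, assuming a query_subject pattern built from the file names without their extensions. Both the pattern and the helper name resolve_out are assumptions, not the package's actual behaviour.

```r
# Hypothetical helper (not part of this package): turn `out` into a file path.
resolve_out <- function(query, subject, out = NULL) {
  # NULL falls back to the current working directory
  if (is.null(out)) out <- getwd()
  if (dir.exists(out)) {
    # Assumed naming scheme: <query name>_<subject name>, extensions dropped
    stem <- function(p) tools::file_path_sans_ext(basename(p))
    out <- file.path(out, paste(stem(query), stem(subject), sep = "_"))
  }
  out
}

resolve_out("data/query.fa", "data/subject.fa", out = tempdir())
```

A path that is not an existing folder is returned unchanged and treated as the output file itself.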
Examples
if (FALSE) { # \dontrun{
  ## Running BLAST with the blastp algorithm
  run_blast(query = "query.fa",
            subject = "subject.fa",
            out = "query_subject",
            algorithm = "blastp",
            tool_path = "path/to/blast/executable")
  ## Running DIAMOND with the "very-sensitive" option
  run_diamond(query = "query.fa",
              subject = "subject.fa",
              out = "query_subject",
              sensitivity = "very-sensitive",
              tool_path = "path/to/diamond/executable")
} # }