Reorder dna_segs or labels to match a tree
permute_dna_segs.Rd
Takes a list of dna_seg objects or dna_seg labels and reorders them based on a given (phylogenetic) tree.
Arguments
- dna_segs
Either a character vector containing dna_seg labels, or a list of dna_seg objects.
- tree
A (phylogenetic) tree, in the form of a phylo or phylog object, or a character string containing the file path to a Newick-format tree file.
- exact_match
Logical. If TRUE, dna_seg labels must match the tree tip labels exactly. If exact_match = FALSE, a tree tip label only needs to contain the dna_seg label for a match to be found (e.g. the dna_seg label "seq_1" will match the tree tip label "E_coli_seq_1.fa").
- return_old_labels
Logical. If TRUE, the original dna_seg labels, as provided in the dna_segs argument, are returned alongside the reordered result. Only relevant when exact_match = FALSE, since partial matching can cause dna_seg labels to be changed to match the tree tip labels.
Value
A list of dna_seg objects or a character vector of dna_seg labels, matching the input given in the dna_segs argument.
If return_old_labels = TRUE, a list with 2 named elements is returned instead: dna_segs, the same return value as above, and old_labels, a character vector of the original labels, sorted into the same order.
Details
This function takes a character vector of dna_seg labels, either directly or by extracting them from a list of dna_segs, through the dna_segs argument. Each label is searched for among the tree tip labels, and the labels are then sorted to match the order in which they are found in the tree. If exactly 1 match is found and exact_match = FALSE, the dna_seg label is updated to match the tree tip label. If multiple matches are found, an error is raised that shows the offending dna_seg label.
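The matching rules described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name match_labels_to_tips and its return element new_labels are hypothetical, tree tips are assumed to be given as a plain character vector in tree order, and raising an error when no tip matches is an assumption, since the text above only specifies the multiple-match case.

```r
# Hypothetical helper illustrating the documented matching rules;
# NOT the actual permute_dna_segs implementation.
match_labels_to_tips <- function(seg_labels, tip_labels, exact_match = FALSE) {
  matched <- vapply(seg_labels, function(lab) {
    hits <- if (exact_match) {
      tip_labels[tip_labels == lab]
    } else {
      # fixed = TRUE treats the label as a literal substring, not a regex
      tip_labels[grepl(lab, tip_labels, fixed = TRUE)]
    }
    if (length(hits) > 1) {
      stop("Multiple tree tips match dna_seg label: ", lab)
    }
    if (length(hits) == 0) {
      # Assumption: the documentation does not describe this case
      stop("No tree tip matches dna_seg label: ", lab)
    }
    hits
  }, character(1))
  # Sort by the position of each matched tip in the tree
  ord <- order(match(matched, tip_labels))
  list(new_labels = unname(matched[ord]), old_labels = seg_labels[ord])
}

tips <- c("seq_1_B_bacilliformis", "seq_2_B_grahamii",
          "seq_3_B_henselae", "seq_4_B_quintana")
res <- match_labels_to_tips(c("seq_2", "seq_3", "seq_1", "seq_4"), tips)
```

Here res$new_labels holds the tip labels in tree order and res$old_labels holds the original labels in that same order. Note that with partial matching a short label such as "seq_1" would also be contained in a tip like "seq_10_X", which would trigger the multiple-match error.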
Examples
## Generate data
seg_labels <- c("seq_2", "seq_3", "seq_1", "seq_4")
tree_str <- paste0("(seq_1_B_bacilliformis:0.5,",
"(seq_2_B_grahamii:0.1,",
"(seq_3_B_henselae:0.1,",
"seq_4_B_quintana:0.2):0.1):0.1);")
tree <- ade4::newick2phylog(tree_str)
## Reorder and rename dna_seg labels to match tree
seg_labels
#> [1] "seq_2" "seq_3" "seq_1" "seq_4"
seg_labels <- permute_dna_segs(dna_segs = seg_labels, tree = tree)
seg_labels
#> [1] "seq_1_B_bacilliformis" "seq_2_B_grahamii" "seq_3_B_henselae"
#> [4] "seq_4_B_quintana"