Testing the new ape R package

George G. Vega Yon 2017-09-07

The following document presents some results from testing the new ape::read.tree function from the ape package version 4.1-0.14. The first test consists of reading a tree with a singleton, support for which is a new feature of the package. The second test compares the time that ape::read.tree takes relative to rncl::read_newick_phylo, which has been shown to be significantly faster.


The data used here is the PANTHER database version 11.1 (which you can get here). You can download a more recent version from the pantherdb website.


We use the microbenchmark R package together with the ape and rncl R packages.

Reading trees with singletons

This is a tree that I know for sure has a singleton. Previous versions of ape (before 4.1-0.14) returned an error when reading it.

a_tree_with_singleton <- "

# Reading using the ape 4.1-0.14 -----------------------------------------------
ans_ape  <- ape::read.tree(text=a_tree_with_singleton)

# Reading using rncl 0.8.2 -----------------------------------------------------

# For some weird reason, rncl::read_newick_phylo prints a progress bar when
# the document is being knitted.
tmptree <- tempfile()
cat(a_tree_with_singleton, file=tmptree)
ans_rncl <- rncl::read_newick_phylo(tmptree)
## [1] TRUE
# APE should have an extra node
## Phylogenetic tree with 50 tips and 47 internal nodes.
## Tip labels:
##  AN14, AN15, AN17, AN18, AN20, AN21, ...
## Rooted; includes branch lengths.
## Phylogenetic tree with 50 tips and 46 internal nodes.
## Tip labels:
##  AN14, AN15, AN17, AN18, AN20, AN21, ...
## Unrooted; includes branch lengths.
# Which is a singleton!
## [1] TRUE
## [1] FALSE
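
As a rough illustration of what the check above looks for: in a phylo object the `edge` matrix lists one parent-child pair per row, and a singleton is an internal node that appears as a parent exactly once. The following is a minimal base R sketch on a hand-made toy edge matrix (not the PANTHER tree); `find_singletons` is a hypothetical helper name, not part of ape or rncl.

```r
# Hypothetical helper (not part of ape or rncl): returns the ids of internal
# nodes that have exactly one child, given a phylo-style edge matrix whose
# first column holds parents and second column holds children.
find_singletons <- function(edge) {
  n_children <- table(edge[, 1])
  as.integer(names(n_children)[as.vector(n_children) == 1L])
}

# Toy tree, roughly the Newick string "(t1,((t2,t3)));":
# tips are 1-3, internal nodes are 4-6, and node 5 has node 6 as its only child.
toy_edge <- rbind(
  c(4, 1),
  c(4, 5),
  c(5, 6),
  c(6, 2),
  c(6, 3)
)

singletons <- find_singletons(toy_edge)
print(singletons)
```

Here only node 5 is flagged. A reader that collapses singletons would return a tree with one internal node fewer, which matches the 47 versus 46 internal nodes reported above.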

Speed benchmark

For this benchmark, we will read all 13096 phylogenetic trees available in PANTHER 11.1.

# A function to (partially) read PANTHER trees
read_panther <- function(x, f) {
  # Only the first line of the file is read and passed to the parser `f`
  x       <- readLines(x, n=1L)
  tmptree <- tempfile()
  cat(x, file=tmptree)
  ans <- f(tmptree)
  ans
}

# Sampling the trees to benchmark (here all of them; a smaller n, e.g. 100 or 2000, gives a quicker run)
n            <- length(tree.files) # 2000
tree.samples <- sample(tree.files, n)

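cl <- parallel::makeForkCluster(10)

ans <- parallel::parLapply(cl, tree.samples, function(tree) {
  # Time each reader once on the same tree
  microbenchmark::microbenchmark(
    ape  = read_panther(tree, ape::read.tree),
    rncl = read_panther(tree, rncl::read_newick_phylo),
    times = 1, unit = "ms"
  )
})
parallel::stopCluster(cl)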

tab <- lapply(ans, function(x) {
  # A bad way of reshaping the data...
  with(x, data.frame(
    ape  = time[expr == "ape"],
    rncl = time[expr == "rncl"]
  ))
})

tab <- do.call(rbind, tab)

While ape seems to be faster, it is not significantly faster than rncl.
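
The text does not say how significance was assessed. Since both readers are timed on the same trees, one possible check is a paired comparison of the two timing columns. The sketch below uses made-up timings purely for illustration (they are not the PANTHER results), and the choice of a Wilcoxon signed-rank test is an assumption, not necessarily what the author did.

```r
# Made-up timings in milliseconds, for illustration only.
toy_tab <- data.frame(
  ape  = c(10.2, 11.5,  9.8, 12.1, 10.9, 11.0),
  rncl = c(10.6, 11.2, 10.4, 12.0, 11.4, 10.8)
)

# Per-tree difference: negative values mean ape was faster on that tree.
diffs       <- toy_tab$ape - toy_tab$rncl
median_diff <- median(diffs)

# Paired Wilcoxon signed-rank test (an assumed choice of test).
test <- wilcox.test(toy_tab$ape, toy_tab$rncl, paired = TRUE)

print(median_diff)
print(test$p.value)
```

With these toy numbers the median difference is negative but the test does not reject equality at the 5% level, which is the kind of pattern the sentence above describes.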

Session info

##  setting  value                       
##  version  R version 3.4.1 (2017-06-30)
##  system   x86_64, linux-gnu           
##  ui       X11                         
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  tz       America/New_York            
##  date     2017-09-07                  
##  package        * version  date       source        
##  ape              4.1-0.14 2017-09-07 local         
##  assertthat       0.2.0    2017-04-11 CRAN (R 3.4.0)
##  backports        1.1.0    2017-05-22 CRAN (R 3.4.0)
##  base           * 3.4.1    2017-06-30 local         
##  colorspace       1.3-2    2016-12-14 CRAN (R 3.4.0)
##  compiler         3.4.1    2017-06-30 local         
##  datasets       * 3.4.1    2017-06-30 local         
##  devtools         1.13.3   2017-08-02 CRAN (R 3.4.0)
##  digest           0.6.12   2017-01-27 CRAN (R 3.4.0)
##  evaluate         0.10.1   2017-06-24 CRAN (R 3.4.0)
##  ggplot2          2.2.1    2016-12-30 CRAN (R 3.4.0)
##  graphics       * 3.4.1    2017-06-30 local         
##  grDevices      * 3.4.1    2017-06-30 local         
##  grid             3.4.1    2017-06-30 local         
##  gtable           0.2.0    2016-02-26 CRAN (R 3.4.0)
##  htmltools        0.3.6    2017-04-28 CRAN (R 3.4.0)
##  knitr            1.17     2017-08-10 CRAN (R 3.4.0)
##  lattice          0.20-35  2017-03-25 CRAN (R 3.4.1)
##  lazyeval         0.2.0    2016-06-12 CRAN (R 3.4.0)
##  magrittr         1.5      2014-11-22 CRAN (R 3.4.0)
##  memoise          1.1.0    2017-04-21 CRAN (R 3.4.0)
##  methods        * 3.4.1    2017-06-30 local         
##  microbenchmark * 1.4-2.1  2015-11-25 CRAN (R 3.4.0)
##  munsell          0.4.3    2016-02-13 CRAN (R 3.4.0)
##  nlme             3.1-131  2017-02-06 CRAN (R 3.4.1)
##  parallel         3.4.1    2017-06-30 local         
##  plyr             1.8.4    2016-06-08 CRAN (R 3.4.0)
##  prettyunits      1.0.2    2015-07-13 CRAN (R 3.4.1)
##  progress         1.1.2    2016-12-14 CRAN (R 3.4.1)
##  R6               2.2.2    2017-06-17 CRAN (R 3.4.0)
##  Rcpp             0.12.12  2017-07-15 CRAN (R 3.4.0)
##  rlang            0.1.2    2017-08-09 CRAN (R 3.4.0)
##  rmarkdown        1.6      2017-06-15 CRAN (R 3.4.0)
##  rncl             0.8.2    2016-12-16 CRAN (R 3.4.1)
##  rprojroot        1.2      2017-01-16 CRAN (R 3.4.0)
##  scales           0.5.0    2017-08-24 CRAN (R 3.4.0)
##  stats          * 3.4.1    2017-06-30 local         
##  stringi          1.1.5    2017-04-07 CRAN (R 3.4.0)
##  stringr          1.2.0    2017-02-18 CRAN (R 3.4.0)
##  tibble           1.3.4    2017-08-22 CRAN (R 3.4.0)
##  tools            3.4.1    2017-06-30 local         
##  utils          * 3.4.1    2017-06-30 local         
##  withr            2.0.0    2017-07-28 CRAN (R 3.4.0)
##  yaml             2.1.14   2016-11-12 CRAN (R 3.4.0)