|
| 1 | +--- |
| 2 | +title: "Simulation and Inference" |
| 3 | +author: "George G. Vega Yon" |
| 4 | +date: "August 18, 2021" |
| 5 | +output: rmarkdown::html_vignette |
| 6 | +vignette: > |
| 7 | + %\VignetteIndexEntry{Vignette Title} |
| 8 | + %\VignetteEngine{knitr::rmarkdown} |
| 9 | + %\VignetteEncoding{UTF-8} |
| 10 | +--- |
| 11 | + |
| 12 | +```{r setup, include=FALSE} |
| 13 | +knitr::opts_chunk$set(fig.width = 6, fig.height = 6, fig.align = "center") |
| 14 | +``` |
| 15 | + |
| 16 | + |
| 17 | +This is a simple example of simulation, inference, and prediction with the `aphylo` R package. |
| 18 | + |
| 19 | +## Setup |
| 20 | + |
| 21 | +```{r} |
| 22 | +library(aphylo) |
| 23 | +
|
| 24 | +# Parameter estimates |
| 25 | +psi <- c(.05, .025) |
| 26 | +mu_d <- c(.2, .1) |
| 27 | +mu_s <- c(.1, .05) |
| 28 | +Pi <- .5 |
| 29 | +``` |
| 30 | + |
| 31 | +## Simulation |
| 32 | + |
| 33 | +```{r data-sim} |
| 34 | +set.seed(223) |
| 35 | +x <- raphylo(n = 200, psi = psi, mu_d = mu_d, mu_s = mu_s, Pi = Pi) |
| 36 | +plot(x) |
| 37 | +``` |
| 38 | + |
| 39 | +The simulation function generates an `aphylo` class object which is simply a wrapper containing: |
| 40 | + |
| 41 | +- a 0/1 integer matrix (annotations), |
| 42 | +- a `phylo` tree (from the **ape** package), and |
| 43 | +- some information about the tree and annotations. |
| 44 | + |
| 45 | +If needed, we can export the data as follows: |
| 46 | + |
| 47 | +```{r eval=FALSE} |
| 48 | +# Edgelist describing parent->offspring relations |
| 49 | +edges <- x$tree$edge |
| 50 | +colnames(edges) <- c("parent", "offspring") |
| 51 | +write.csv(edges, file = "edgelist.csv", row.names = FALSE) |
| 52 | +
|
| 53 | +# Tip annotations |
| 54 | +ann <- with(x, rbind(tip.annotation, node.annotation)) |
| 55 | +rownames(ann) <- 1L:nrow(ann) |
| 56 | +write.csv(ann, file = "annotations.csv", row.names = FALSE) |
| 57 | +
|
| 58 | +# Event types |
| 59 | +events <- with(x, cbind(c(tip.type*NA, node.type))) |
| 60 | +rownames(events) <- 1:nrow(events) |
| 61 | +write.csv(events, file = "events.csv", row.names = FALSE) |
| 62 | +``` |
| 63 | + |
| 64 | + |
| 65 | +## Inference |
| 66 | + |
| 67 | +To fit the data, we can use MCMC as follows: |
| 68 | + |
| 69 | +```{r inference, cache=TRUE} |
| 70 | +ans <- aphylo_mcmc(x ~ psi + mu_d + mu_s + Pi) |
| 71 | +ans |
| 72 | +``` |
| 73 | + |
| 74 | +For goodness-of-fit analysis, we have a couple of tools. We can compare the predicted values with the observed values: |
| 75 | + |
| 76 | +```{r plot-predict} |
| 77 | +plot(ans) |
| 78 | +``` |
| 79 | + |
| 80 | +We can also take a look at the surface of the posterior function |
| 81 | + |
| 82 | +```{r plot-loglike} |
| 83 | +plot_logLik(ans) |
| 84 | +``` |
| 85 | + |
| 86 | +And we can also take a look at the prediction scores |
| 87 | + |
| 88 | +```{r predict-score} |
| 89 | +ps <- prediction_score(ans) |
| 90 | +ps # Printing |
| 91 | +plot(ps) # and plotting |
| 92 | +``` |
| 93 | + |
| 94 | + |
0 commit comments