Skip to content

Commit f72c16e

Browse files
committed
rm numDeriv dependency (closes #41)
1 parent 27880ab commit f72c16e

17 files changed

+61
-58
lines changed

DESCRIPTION

+3-4
Original file line numberDiff line numberDiff line change
@@ -4,17 +4,16 @@ Title: Marginal Effects for Model Objects
44
Description: An R port of Stata's 'margins' command, which can be used to
55
calculate marginal (or partial) effects from model objects.
66
License: MIT + file LICENSE
7-
Version: 0.2.9
8-
Date: 2016-08-30
7+
Version: 0.2.10
8+
Date: 2016-08-31
99
Authors@R: c(person("Thomas J.", "Leeper",
1010
role = c("aut", "cre"),
1111
email = "thosjleeper@gmail.com"))
1212
Imports:
1313
stats,
1414
graphics,
1515
compiler,
16-
MASS,
17-
numDeriv
16+
MASS
1817
Suggests:
1918
sandwich,
2019
webuse,

NAMESPACE

+1-1
Original file line numberDiff line numberDiff line change
@@ -51,4 +51,4 @@ importFrom(graphics,points)
5151
importFrom(graphics,polygon)
5252
importFrom(graphics,rug)
5353
importFrom(graphics,segments)
54-
importFrom(numDeriv,grad)
54+
importFrom(utils,head)

NEWS.md

+4
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,9 @@
11
# CHANGES TO margins 0.3.0 #
22

3+
## margins 0.2.10
4+
5+
* Replaced `numDeriv::jacobian()` with an internal alternative. (#41)
6+
37
## margins 0.2.8
48

59
* Added a `prediction()` method for "ivreg" objects (from `AER::ivreg()`). (#3)

R/build_margins.R

+4-8
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,7 @@
77
#' @param vce A character string indicating the type of estimation procedure to use for estimating variances. The default (\dQuote{delta}) uses the delta method. Alternatives are \dQuote{bootstrap}, which uses bootstrap estimation, or \dQuote{simulation}, which averages across simulations drawn from the joint sampling distribution of model coefficients. The latter two are extremely time intensive.
88
#' @param iterations If \code{vce = "bootstrap"}, the number of bootstrap iterations. If \code{vce = "simulation"}, the number of simulated effects to draw. Ignored otherwise.
99
#' @param unit_ses If \code{vce = "delta"}, a logical specifying whether to calculate and return unit-specific marginal effect variances. This calculation is time consuming and the information is often not needed, so this is set to \code{FALSE} by default.
10-
#' @param method A character string indicating the numeric derivative method to use when estimating marginal effects variances. \dQuote{simple} optimizes for speed; \dQuote{Richardson} optimizes for accuracy. See \code{\link[numDeriv]{jacobian}} for details. Currently this is only used when calculating marginal effect variances.
11-
#' @param eps A numeric value specifying the \dQuote{step} to use when calculating numerical derivatives. By default this is the smallest floating point value that can be represented on the present architecture.
10+
#' @param eps A numeric value specifying the \dQuote{step} to use when calculating numerical derivatives.
1211
#' @param \dots Ignored.
1312
#' @details Generally, it is not necessary to call this function directly because \code{\link{margins}} provides a simpler interface. To just get marginal effects without building a \dQuote{margins} object, call \code{\link{marginal_effects}} instead, which handles the effect estimation of a model object without building a \dQuote{margins} object.
1413
#'
@@ -21,7 +20,6 @@
2120
#' @keywords models
2221
#' @import stats
2322
#' @importFrom compiler cmpfun
24-
#' @importFrom numDeriv grad
2523
#' @importFrom MASS mvrnorm
2624
#' @export
2725
build_margins <-
@@ -32,16 +30,14 @@ function(model,
3230
vce = c("delta", "simulation", "bootstrap", "none"),
3331
iterations = 50L, # if vce == "bootstrap" or "simulation"
3432
unit_ses = FALSE,
35-
method = c("simple", "Richardson", "complex"), # passed to marginal_effects()
36-
eps = 1e-7,
33+
eps = 1e-4,
3734
...) {
3835

3936
# variables in the model
4037
allvars <- all.vars(model[["terms"]])[-1]
4138

4239
# march.arg() for arguments
4340
type <- match.arg(type)
44-
method <- match.arg(method)
4541
vce <- match.arg(vce)
4642
if (is.function(vcov)) {
4743
vcov <- vcov(model)
@@ -53,12 +49,12 @@ function(model,
5349
# variance estimation technique
5450
variances <- get_effect_variances(data = data, model = model, allvars = names(mes),
5551
type = type, vcov = vcov, vce = vce,
56-
iterations = iterations, method = method)
52+
iterations = iterations, eps = eps)
5753

5854
# get unit-specific effect variances (take derivative of `.build_grad_fun()` for every row separately)
5955
if ((vce == "delta") && (isTRUE(unit_ses))) {
6056
vmat <- do.call("rbind", lapply(seq_len(nrow(data)), function(datarow) {
61-
delta_once(data = data[datarow,], model = model, type = type, vcov = vcov, method = method)
57+
delta_once(data = data[datarow,], model = model, type = type, vcov = vcov, eps = eps)
6258
}))
6359
colnames(vmat) <- paste0("se.", names(mes))
6460
vmat <- as.data.frame(vmat)

R/delta.R

+31-13
Original file line numberDiff line numberDiff line change
@@ -3,29 +3,47 @@ function(data,
33
model,
44
type = c("response", "link", "terms"),
55
vcov = vcov(model),
6-
method = c("simple", "Richardson", "complex")) {
6+
eps = 1e-4) {
77
# take the derivative of each marginal effect from a model with respect to each model coefficient
88

99
type <- match.arg(type)
10-
method <- match.arg(method)
1110
if (is.function(vcov)) {
1211
vcov <- vcov(model)
1312
}
1413

15-
# express each marginal effect as a function of all coefficients
16-
# holding data constant
17-
# this is what .build_grad_fun() will do
18-
# then: numDeriv::grad(.build_grad_fun(), model$coef)
19-
# gives `gradmat`, such that v %*% V %*% t(v)
14+
# express each marginal effect as a function of estimated coefficients
15+
# holding data constant (using `.build_grad_fun()`)
16+
# use `jacobian(.build_grad_fun(), model$coef)`
17+
# to get `jacobian`, an ME-by-beta matrix,
18+
# such that jacobian %*% V %*% t(jacobian)
2019
# gives the variance of each marginal effect
21-
# `gradmat` should be an ME-by-beta matrix
22-
23-
# Some references:
2420
# http://www.soderbom.net/lecture10notes.pdf
2521
# http://stats.stackexchange.com/questions/122066/how-to-use-delta-method-for-standard-errors-of-marginal-effects
2622

27-
FUN <- .build_grad_fun(data = data, model = model, type = type, method = method)
28-
gradmat <- numDeriv::jacobian(FUN, model[["coefficients"]], method = method)
29-
vout <- diag(gradmat %*% vcov %*% t(gradmat))
23+
FUN <- .build_grad_fun(data = data, model = model, type = type)
24+
#jacobian <- numDeriv::jacobian(FUN, model[["coefficients"]], method = "simple")
25+
jacobian <- jacobian(FUN, model[["coefficients"]], eps = eps)
26+
vout <- diag(jacobian %*% vcov %*% t(jacobian))
3027
return(vout)
3128
}
29+
30+
.build_grad_fun <- function(data, model, type = "response", eps = 1e-4) {
31+
32+
# factory function to return prediction holding data constant but varying coefficients
33+
FUN <- function(coefs) {
34+
model[["coefficients"]] <- coefs
35+
colMeans(marginal_effects(model = model, data = data, type = type, eps = eps), na.rm = TRUE)
36+
}
37+
return(compiler::cmpfun(FUN))
38+
}
39+
40+
jacobian <- function(FUN, coefficients, eps = 1e-4) {
41+
F0 <- FUN(coefficients)
42+
out <- matrix(NA_real_, nrow = length(F0), ncol = length(coefficients))
43+
for (i in seq_along(coefficients)) {
44+
coeftemp <- coefficients
45+
coeftemp[i] <- coeftemp[i] + eps
46+
out[, i] <- (FUN(coeftemp) - F0) / eps
47+
}
48+
out
49+
}

R/factories.R

-11
This file was deleted.

R/get_effect_variances.R

+2-3
Original file line numberDiff line numberDiff line change
@@ -6,12 +6,11 @@ function(data = data,
66
vcov = vcov(model),
77
vce = c("delta", "simulation", "bootstrap", "none"),
88
iterations = 50L, # if vce == "bootstrap" or "simulation"
9-
method = c("simple", "Richardson", "complex"), # passed to marginal_effects()
9+
eps = 1e-4,
1010
...) {
1111

1212
# march.arg() for arguments
1313
type <- match.arg(type)
14-
method <- match.arg(method)
1514
vce <- match.arg(vce)
1615
if (is.function(vcov)) {
1716
vcov <- vcov(model)
@@ -24,7 +23,7 @@ function(data = data,
2423
} else if (vce == "delta") {
2524

2625
# default method
27-
variances <- delta_once(data = data, model = model, type = type, vcov = vcov, method = method)
26+
variances <- delta_once(data = data, model = model, type = type, vcov = vcov, eps = eps)
2827

2928
} else if (vce == "simulation") {
3029

R/marginal_effects_internal.R

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
get_instant_pdiff <- function(data, model, variable, type = c("response", "link"), eps = 1e-7) {
1+
get_instant_pdiff <- function(data, model, variable, type = c("response", "link"), eps = 1e-4) {
22
# @title Instantaneous change in fitted values (numerical derivative)
33
# @description This is an internal function used to calculate instantaneous change (numerical derivative) in y-hat between observed values in `data` and the smallest machine-precise change in the value of `data`. This is used by \code{marginal_effects} for numeric variables. It currently only uses the "simple" derivative method. This might change in the future
44
# @param data The dataset on which to to calculate `predict(model)` (and the slope thereof)

R/marginal_effects_methods.R

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
#' @rdname marginal_effects
22
#' @export
3-
marginal_effects.lm <- function(model, data, type = c("response", "link"), eps = 1e-7, ...) {
3+
marginal_effects.lm <- function(model, data, type = c("response", "link"), eps = 1e-4, ...) {
44

55
# setup data, if missing
66
if (missing(data)) {

R/prediction.R

+2-1
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ prediction <- function(model, data, ...) {
3131
UseMethod("prediction")
3232
}
3333

34+
#' @importFrom utils head
3435
#' @export
3536
print.prediction <- function(x, digits = 4, ...) {
3637
f <- x[["fitted"]]
@@ -39,7 +40,7 @@ print.prediction <- function(x, digits = 4, ...) {
3940
m <- sprintf(paste0("%0.", digits, "f"), m)
4041
message(paste0("Average prediction: ", m, ", for ", length(f), " ", ngettext(length(f), "observation", "observations")))
4142
} else if (is.factor(f)) {
42-
m <- sort(table(p$fitted), decreasing = TRUE)[1]
43+
m <- sort(table(x[["fitted"]]), decreasing = TRUE)[1]
4344
message(paste0("Modal prediction: ", shQuote(names(m)), " for ", m, " of ", length(f), " ",
4445
ngettext(length(f), "observation", "observations"),
4546
" with total ", nlevels(f), " ", ngettext(nlevels(f), "level", "levels") ))

R/prediction_methods.R

+1-1
Original file line numberDiff line numberDiff line change
@@ -218,5 +218,5 @@ prediction.polr <- function(model, data, ...) {
218218
class = c("prediction", "data.frame"),
219219
row.names = seq_len(length(pred[["fit"]])),
220220
model.class = class(model),
221-
type = type)
221+
type = NULL)
222222
}

README.Rmd

+1-1
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ By comparison, R has no robust functionality in the base tools for drawing out m
2929

3030
Given the challenges of interpreting the contribution of a given regressor in any model that includes quadratic terms, multiplicative interactions, a non-linear transformation, or other complexities, there is a clear need for a simple, consistent way to estimate marginal effects for popular statistical models. This package aims to correctly calculate marginal effects that include complex terms and provide a uniform interface for doing those calculations. Thus, the package implements a single S3 generic method (`margins()`) that can be easily generalized for any type of model implemented in R.
3131

32-
Some technical details of the package are worth briefly noting. The estimation of marginal effects relies on numeric derivatives produced using `predict()` and [`numDeriv::grad()`](https://cran.r-project.org/package=numDeriv). While symbolic differentiation of some models (e.g., linear models) is possible using `D()` and `deriv()`, R's modelling language (the "formula" class) is sufficiently general to enable the construction of model formulae that contain terms that fall outside of R's symbolic differentiation rule table (e.g., `y ~ factor(x)` or `y ~ I(FUN(x))` for any arbitrary `FUN()`). By relying on numeric differentiation, `margins()` supports *any* model that can be expressed in R formula syntax. Even Stata's `margins` command is limited in its ability to handle variable transformations (e.g., including `x` and `log(x)` as predictors) and quadratic terms (e.g., `x^3`); these scenarios are easily expressed in an R formula and easily handled, correctly, by `margins()`.
32+
Some technical details of the package are worth briefly noting. The estimation of marginal effects relies on numeric derivatives produced using `predict()` and a numerical approximation of [the Jacobian matrix](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant). While symbolic differentiation of some models (e.g., linear models) is possible using `D()` and `deriv()`, R's modelling language (the "formula" class) is sufficiently general to enable the construction of model formulae that contain terms that fall outside of R's symbolic differentiation rule table (e.g., `y ~ factor(x)` or `y ~ I(FUN(x))` for any arbitrary `FUN()`). By relying on numeric differentiation, `margins()` supports *any* model that can be expressed in R formula syntax. Even Stata's `margins` command is limited in its ability to handle variable transformations (e.g., including `x` and `log(x)` as predictors) and quadratic terms (e.g., `x^3`); these scenarios are easily expressed in an R formula and easily handled, correctly, by `margins()`.
3333

3434
## Simple code examples ##
3535

README.md

+4-4
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ By comparison, R has no robust functionality in the base tools for drawing out m
2929

3030
Given the challenges of interpreting the contribution of a given regressor in any model that includes quadratic terms, multiplicative interactions, a non-linear transformation, or other complexities, there is a clear need for a simple, consistent way to estimate marginal effects for popular statistical models. This package aims to correctly calculate marginal effects that include complex terms and provide a uniform interface for doing those calculations. Thus, the package implements a single S3 generic method (`margins()`) that can be easily generalized for any type of model implemented in R.
3131

32-
Some technical details of the package are worth briefly noting. The estimation of marginal effects relies on numeric derivatives produced using `predict()` and [`numDeriv::grad()`](https://cran.r-project.org/package=numDeriv). While symbolic differentiation of some models (e.g., linear models) is possible using `D()` and `deriv()`, R's modelling language (the "formula" class) is sufficiently general to enable the construction of model formulae that contain terms that fall outside of R's symbolic differentiation rule table (e.g., `y ~ factor(x)` or `y ~ I(FUN(x))` for any arbitrary `FUN()`). By relying on numeric differentiation, `margins()` supports *any* model that can be expressed in R formula syntax. Even Stata's `margins` command is limited in its ability to handle variable transformations (e.g., including `x` and `log(x)` as predictors) and quadratic terms (e.g., `x^3`); these scenarios are easily expressed in an R formula and easily handled, correctly, by `margins()`.
32+
Some technical details of the package are worth briefly noting. The estimation of marginal effects relies on numeric derivatives produced using `predict()` and a numerical approximation of [the Jacobian matrix](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant). While symbolic differentiation of some models (e.g., linear models) is possible using `D()` and `deriv()`, R's modelling language (the "formula" class) is sufficiently general to enable the construction of model formulae that contain terms that fall outside of R's symbolic differentiation rule table (e.g., `y ~ factor(x)` or `y ~ I(FUN(x))` for any arbitrary `FUN()`). By relying on numeric differentiation, `margins()` supports *any* model that can be expressed in R formula syntax. Even Stata's `margins` command is limited in its ability to handle variable transformations (e.g., including `x` and `log(x)` as predictors) and quadratic terms (e.g., `x^3`); these scenarios are easily expressed in an R formula and easily handled, correctly, by `margins()`.
3333

3434
## Simple code examples ##
3535

@@ -85,7 +85,7 @@ microbenchmark(marginal_effects(x))
8585
```
8686
## Unit: milliseconds
8787
## expr min lq mean median uq max neval
88-
## marginal_effects(x) 7.942455 8.409444 9.199376 9.115538 9.792552 13.37642 100
88+
## marginal_effects(x) 8.047778 8.400351 9.020149 8.620828 9.537136 12.67893 100
8989
```
9090

9191
```r
@@ -94,8 +94,8 @@ microbenchmark(margins(x))
9494

9595
```
9696
## Unit: milliseconds
97-
## expr min lq mean median uq max neval
98-
## margins(x) 63.40545 69.59018 74.86074 72.59378 75.9127 169.0655 100
97+
## expr min lq mean median uq max neval
98+
## margins(x) 63.93785 67.05056 74.14901 70.09276 76.57553 180.7601 100
9999
```
100100

101101
In addition to the estimation procedures and `plot()` generic, **margins** offers several plotting methods for model objects. First, there is a new generic `cplot()` that displays predictions or marginal effects (from an "lm" or "glm" model) of a variable conditional across values of third variable (or itself). For example, here is a graph of predicted probabilities from a logit model:
Binary file not shown.

man/build_margins.Rd

+2-5
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

man/marginal_effects.Rd

+3-3
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

man/margins.Rd

+1-1
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)