numDeriv::jacobian is slow when applied to every observation #36
Definitely. If you're only interested in the effects (without their variances), you can use the … I've been meaning to add a toggle to turn off the calculation of unit-specific variances (and, in fact, I think it should probably be off by default, since it's probably not needed in most cases). I'll make that change shortly.
Sounds great. Thanks for the …
Just pushed an update where this is set to FALSE by default; let me know what kind of performance improvements you see with it.
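For anyone finding this thread later, here's a minimal sketch of how the new toggle might be used. The argument name `unit_ses` is an assumption on my part; check `?margins` in your installed version for the actual name and default:

```r
library(margins)

# Fit a small logistic regression on a built-in dataset
m <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Unit-specific variances off (assumed to be the new default);
# the argument name `unit_ses` is a guess -- consult ?margins
system.time(mar <- margins(m, unit_ses = FALSE))
summary(mar)
```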
Thanks! I'm not quite sure what the expectation should be here, as I've never used Stata's margins command, but this still seems unreasonably long to me (at least for my kind of practical use, where I often go back and forth between data munging and estimating). What's your expectation for a reasonable time?
What are the specs on your machine? I have not seen times like that.
That was on a 27" iMac bought in 2014. This morning, on my Linux machine (AMD FX(tm)-8350 eight-core processor + 32 GB RAM):
Thanks. I think #37 will produce some (hopefully substantial) performance enhancements, so please hold out for that.
Sounds good. I guess I'm just not sure why you're using numerical approximations at all when the derivative is well known (at least in the logit case).
Generality. If you scroll through the git history here, you'll see an approach using symbolic derivatives. It works in simple cases but not in many others, especially anytime any of the following occurs:
So, the choice is between an approach (symbolic derivatives) that only works in a set of cases for common models and an approach (numerical derivatives) that works for any formula and any model type. I think the latter is preferable because it will be possible to gradually optimize it once the code works as intended, whereas the former approach simply won't work at all in lots of common cases.
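For the record, in the logit case the analytic marginal effect of a continuous covariate x_j for observation i is beta_j * p_i * (1 - p_i). A minimal sketch comparing that closed form against a central finite difference (the model and variable names are illustrative, not from the package):

```r
# Analytic vs. numerical average marginal effect for a logit model
m <- glm(am ~ hp + wt, data = mtcars, family = binomial)
p <- predict(m, type = "response")

# Analytic: dP/dx_j = beta_j * p * (1 - p), averaged over observations
ame_analytic <- unname(coef(m)["hp"]) * mean(p * (1 - p))

# Numerical: central finite difference on the covariate
h <- 1e-5
d <- mtcars
d$hp <- d$hp + h
p_up <- predict(m, newdata = d, type = "response")
d$hp <- d$hp - 2 * h
p_dn <- predict(m, newdata = d, type = "response")
ame_numeric <- mean((p_up - p_dn) / (2 * h))

all.equal(ame_analytic, ame_numeric, tolerance = 1e-6)
```

The two should agree to several decimal places; the symbolic route is faster but, as noted above, only applies when the link and formula are simple enough to differentiate by hand.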
👍
Indeed, this looks pretty good: 43fe90e
Yeah, that's a whole other ballgame:
nearly 800x faster |
Nice. |
Thanks for your work on this!
I was playing around with this cool package and wondering why margins was so slow when applied to a small logistic glm with 100 observations. Basic profiling suggests that the function spends about 80% of its time taking the jacobian in the delta_once function. I can't quite find my way around the code yet, but it looks like it's calculating the variance for every observation, which may not be necessary in every use case.
Any thoughts on how to speed things up? If you point me in the right direction, I may be able to PR something.
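A quick way to reproduce that profile on a synthetic dataset (since `delta_once` is internal to the package, it shows up in the call stack rather than in the public API):

```r
library(margins)

# Simulate a small logistic-regression problem (100 observations)
set.seed(42)
n <- 100
d <- data.frame(x = rnorm(n), z = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5 * d$x - 0.3 * d$z))
m <- glm(y ~ x + z, data = d, family = binomial)

# Profile the margins() call and inspect where time is spent
Rprof(tmp <- tempfile())
mar <- margins(m)
Rprof(NULL)
head(summaryRprof(tmp)$by.total)  # look for numDeriv / jacobian entries
```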