Binary outcome, continuous treatment #377
@lucasqcdh actually, binary outcomes are one of the most common cases in epidemiology and health! But I do agree that our docs are thin on this use case. If you want to use DML, the suggestion in the EconML issue is to apply it to the classification scores in [0, 1], not to the binary outcome itself. This is a common technique, used for example by model explanation methods like SHAP. The question then becomes: how much does the treatment change the classification score (the probability of the class being 1), rather than the binary outcome itself? If you'd like to stick to the binary outcome, here are your options:
Thanks for alerting me to the open issue #296 on logistic regression. I will have a look at it this week and address it. My personal opinion: I'd suggest using the classification score (probability) as your outcome and then applying a method like DML. There are two complications with using a GLM or another pure-prediction approach: 1) it may be biased because all the confounders and the treatment of interest are put together in one model, and 2) the estimate is quite unintuitive. If a logistic or some other transformation is needed to model the outcome, then you can no longer talk about an additive effect: e.g., for logistic regression, you can no longer talk about the increase in the outcome due to changing a treatment; rather, you need to talk about log-odds or something like that.
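To make the point about additive effects concrete, here is a minimal sketch using only the Python standard library. The coefficient `beta` and the two intercepts are made-up values for illustration, not estimates from any dataset or output of DoWhy. Under a logistic model, a one-unit change in treatment always shifts the log-odds by the same amount, while the change in probability depends on the baseline risk.

```python
import math

def sigmoid(z: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p: float) -> float:
    """Map a probability back to log-odds."""
    return math.log(p / (1.0 - p))

# Hypothetical logistic model: logit P(Y=1) = intercept + beta * treatment
beta = 0.8  # assumed treatment coefficient, for illustration only

effects = {}
for intercept in (-3.0, 0.0):  # low and moderate baseline risk (assumed values)
    p0 = sigmoid(intercept + beta * 0.0)  # probability at treatment = 0
    p1 = sigmoid(intercept + beta * 1.0)  # probability at treatment = 1
    effects[intercept] = {
        "prob_change": p1 - p0,
        "log_odds_change": log_odds(p1) - log_odds(p0),
    }
    print(intercept, effects[intercept])
```

The log-odds change equals `beta` at both baselines, but the probability change is larger at the moderate-risk baseline than at the low-risk one, which is why a single "increase in outcome per unit of treatment" is not well defined on the probability scale for this kind of model.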
Hi @amit-sharma, thank you very much for your time and extensive reply!
Ah, good to know, thanks! (And excuse my ignorance on the topic.)
Sounds good, will try!
Hello, I am discovering DoWhy, and it sounds very interesting. I am facing a problem that consists of estimating the causal effect of a continuous treatment on a binary outcome. I fear that backdoor.linear_regression just converts the boolean into 0 and 1 and treats it as a continuous quantity. I am taking the liberty of commenting on this closed issue because the posts it refers to, on DoWhy and EconML, are at least two years old. I would like to know what the current best practice is. ChatGPT suggests that I first fit a logistic regression, with the treatment and confounders as features, to get scores, and then apply backdoor.linear_regression with the scores as the outcome. Is this the right way to solve this problem, please? Thank you in advance for your help,
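For reference, the two-step workflow described in this comment can be sketched with NumPy alone. This is an illustration under stated assumptions, not DoWhy's implementation and not an endorsement of the approach as best practice: the data are simulated, `w` is assumed to be the only confounder, the helper `fit_logistic` is a hypothetical hand-written Newton solver, and a plain least-squares fit stands in for a linear regression estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data (assumption): one confounder w, continuous treatment t, binary outcome y
w = rng.normal(size=n)
t = 0.5 * w + rng.normal(size=n)
true_logit = -0.5 + 1.0 * t + 0.8 * w
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# Design matrix: intercept, treatment, confounder
X = np.column_stack([np.ones(n), t, w])

def fit_logistic(X, y, n_iter=25):
    """Hypothetical helper: logistic regression fitted by Newton's method."""
    coef = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ coef))
        grad = X.T @ (y - p)                         # gradient of the log-likelihood
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # negative Hessian
        coef = coef + np.linalg.solve(hess, grad)
    return coef

# Step 1: logistic regression on treatment and confounder, then predicted scores
coef = fit_logistic(X, y)
scores = 1.0 / (1.0 + np.exp(-X @ coef))

# Step 2: linear regression with the scores as the outcome
ols_coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
effect_on_score = ols_coef[1]  # change in score per unit of treatment

# For comparison: average marginal effect implied by the logistic fit
ame = np.mean(coef[1] * scores * (1.0 - scores))

print("logistic coefficient on treatment:", coef[1])
print("linear effect on score:", effect_on_score)
print("average marginal effect:", ame)
```

Because the scores are a deterministic function of the treatment and the confounder, the second-step coefficient is a linear summary of the fitted logistic model; the sketch computes the average marginal effect alongside it for comparison. Any causal reading still rests on the assumption that all confounders are included.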
Hi there,
Thanks so much for this great package! I'm looking into a problem in which my treatment is continuous and my outcome is binary, which is I think the exact opposite of most of the use-cases I find in literature, examples and documentation.
In fact, the linear_dataset has exactly the opposite set as defaults. But this is just an aside; creating a dataset is not the issue here :)
So I was hoping to be able to use Double Machine Learning for this, and came across this GitHub issue on EconML: py-why/EconML#204
Although I can't follow the full detail of the replies, it seems it's not straightforward to use DML for this use case.
I also came across this in the documentation, but that seems to be only for binary treatment, since the model_propensity should have a predict_proba method (see here). In any case, what kind of model would be best to use? Perhaps the Generalized Linear Model (see here)?
I found these two GitHub issues related to using logistic regression:
#163 with a basic implementation
and #296 with some doubts whether things are working correctly under the hood.
Any help or guidance is greatly appreciated 🙏