Binary outcome, continuous treatment #377

lucasqcdh · 2022-02-08T13:10:29Z

Hi there,

Thanks so much for this great package! I'm looking into a problem in which my treatment is continuous and my outcome is binary, which I think is the exact opposite of most of the use cases I find in the literature, examples, and documentation.

In fact, the defaults of linear_dataset are the exact opposite:

dowhy.datasets.linear_dataset(beta, ..., treatment_is_binary=True, outcome_is_binary=False, ...)

But that is just an aside; creating a dataset is not the issue here :)

So I was hoping to be able to use Double Machine Learning for this, and came across this github issue on EconML: py-why/EconML#204

Although I can't follow the full detail of the replies, it seems it's not straightforward to use DML for this use case.

I also came across this in the documentation, but that seems to be only for binary treatment since the model_propensity should have a predict_proba method, see here

In any case, which kind of model would be best to use? Perhaps the Generalized Linear Model, see here

I found these two github issues related to using logistic regression:

#163 with a basic implementation
and #296 with some doubts whether things are working correctly under the hood.

Any help or guidance is greatly appreciated 🙏


amit-sharma · 2022-02-14T05:56:39Z

@lucasqcdh actually, binary outcome is one of the most popular cases in epidemiology and health! But I do agree our docs are thin on this use case.

If you want to use DML, the suggestion in the EconML issue is to apply it to the classification scores in [0, 1], not to the binary outcome. This is a common technique used by, e.g., model explanation methods like SHAP. The question then becomes: how much does the treatment change the classification score (the probability of the class being 1), rather than the actual binary outcome?

If you'd like to stick to the binary outcome, here are your options:

1. Use any of the propensity score-based methods. Binary outcome $Y$ will be interpreted as numeric 1 or 0 and you will get reasonable results. But this won't work for a continuous treatment.
2. If treatment is continuous, your best bet is GLM with a logistic transformation.
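To make the second option concrete, here is a minimal sketch of a logistic GLM with a continuous treatment. It uses plain NumPy on simulated data and does not call DoWhy's GLM estimator; the data-generating coefficients, the variable names, and the `fit_logistic` helper are all hypothetical and chosen only for illustration. After fitting, it averages the predicted probabilities with the treatment set to 0 and to 1 for everyone, which gives an effect on the probability scale under the assumption that `w` is the only confounder.

```python
import numpy as np

def sigmoid(z):
    """Map log-odds to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression via Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

# Simulated data (hypothetical coefficients): w confounds t and y.
rng = np.random.default_rng(0)
n = 20_000
w = rng.normal(size=n)                    # confounder
t = 0.7 * w + rng.normal(size=n)          # continuous treatment
true_beta = np.array([-0.5, 0.8, 1.0])    # intercept, treatment, confounder
X = np.column_stack([np.ones(n), t, w])
y = rng.binomial(1, sigmoid(X @ true_beta))

beta_hat = fit_logistic(X, y)

def mean_outcome_at(t_value, beta):
    """Average predicted P(Y=1) if everyone had treatment t_value."""
    X_do = np.column_stack([np.ones(n), np.full(n, t_value), w])
    return sigmoid(X_do @ beta).mean()

effect_hat = mean_outcome_at(1.0, beta_hat) - mean_outcome_at(0.0, beta_hat)
effect_true = mean_outcome_at(1.0, true_beta) - mean_outcome_at(0.0, true_beta)
print("treatment coefficient (log-odds):", beta_hat[1])
print("estimated change in P(Y=1), t: 0 -> 1:", effect_hat)
```

The coefficient `beta_hat[1]` is on the log-odds scale, while `effect_hat` is a difference in probabilities for one specific change in treatment (0 to 1); a different pair of treatment values would generally give a different number.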

Thanks for flagging the open issue #296 on logistic regression. I will have a look at it this week and address it.

My personal opinion: I'd suggest you to use the classification score probability as your outcome and then use a method like DML. There are two complications with using a GLM or another pure-prediction approach: 1) it may be biased because the confounders and the treatment of interest are all put together in one model, and 2) the estimate is quite unintuitive. If a logistic or some other transformation is needed to model the outcome, then you can no longer talk about an additive effect: e.g., for logistic, you can no longer talk about the increase in the outcome due to changing the treatment; rather, you need to talk about log-odds or something like that.
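The point about additivity can be illustrated with a few lines of arithmetic. In this sketch, `BETA_T` is a hypothetical logistic coefficient on the treatment, and the three baseline log-odds values are arbitrary. The same one-unit change in treatment multiplies the odds by the same factor everywhere, but the change in probability depends on where you start.

```python
import math

def sigmoid(z):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

BETA_T = 0.5  # hypothetical coefficient: log-odds change per unit of treatment

results = []
for baseline_logit in (-3.0, 0.0, 3.0):
    p0 = sigmoid(baseline_logit)            # P(Y=1) before the change
    p1 = sigmoid(baseline_logit + BETA_T)   # P(Y=1) after +1 unit of treatment
    odds_ratio = odds(p1) / odds(p0)
    results.append((p0, p1 - p0, odds_ratio))
    print(f"baseline P={p0:.3f}  change in P={p1 - p0:+.3f}  odds ratio={odds_ratio:.3f}")
```

The odds ratio is constant (it equals `exp(BETA_T)`), whereas the probability change is largest near a baseline of 0.5 and shrinks toward the extremes, which is why a single "increase in outcome per unit of treatment" is not available from the logistic coefficient alone.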

lucasqcdh · 2022-02-15T13:06:05Z

Hi @amit-sharma, thank you very much for your time and extensive reply!

> binary outcome is one of the most popular cases in epidemiology and health!

Ah, good to know, thanks! (And excuse my ignorance on the topic.)

> My personal opinion: I'd suggest you to use the classification score probability as your outcome and then use a method like DML

Sounds good, will try!

tmorzade · 2025-02-02T01:49:18Z

Hello,

I am just discovering DoWhy, and it looks very interesting.

I am facing a problem that consists of estimating the causal effect of a continuous treatment on a binary outcome.

I fear that backdoor.linear_regression just converts the boolean into 0 and 1 and treats it as a continuous quantity.

I am commenting on this closed issue because the posts it refers to, on DoWhy and EconML, are at least two years old.

I would like to know what the current best practice is.

ChatGPT suggested that I first fit a logistic regression, with the treatment and confounders as features, to get scores, and then apply backdoor.linear_regression with the scores as the outcome.

Is this the right way to solve this issue?

Thank you in advance for your help,
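For reference, the two-step procedure described above (a logistic regression to obtain scores, then a linear regression with the scores as the outcome) could be sketched as follows. This is only one reading of that suggestion, not a confirmed best practice: it uses NumPy on simulated data, `np.linalg.lstsq` stands in for backdoor.linear_regression, and the coefficients, variable names, and `fit_logistic` helper are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Map log-odds to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression via Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

# Simulated data (hypothetical coefficients): w confounds t and y.
rng = np.random.default_rng(1)
n = 20_000
w = rng.normal(size=n)                 # confounder
t = 0.7 * w + rng.normal(size=n)       # continuous treatment
X = np.column_stack([np.ones(n), t, w])
y = rng.binomial(1, sigmoid(-0.5 + 0.8 * t + 1.0 * w))

# Step 1: logistic regression on treatment and confounder, giving scores in [0, 1].
beta_hat = fit_logistic(X, y)
scores = sigmoid(X @ beta_hat)

# Step 2: ordinary least squares with the scores as the outcome.
ols_coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
effect_on_score = ols_coef[1]          # slope of the score with respect to treatment

# Reference point: average marginal effect implied by the logistic model itself.
ame = np.mean(beta_hat[1] * scores * (1.0 - scores))
print("OLS slope on scores:", effect_on_score)
print("average marginal effect from the logistic fit:", ame)
```

Because the scores depend nonlinearly on the treatment, the single slope from step 2 is an average summary rather than a constant effect. In this simulation it lands close to the average marginal effect of the logistic model, helped by the covariates being jointly Gaussian; that agreement should not be assumed for other data.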

lucasqcdh closed this as completed Feb 15, 2022
