Causal Estimation#
Causal estimation turns an identified causal query into a numerical effect. Once you know that a causal effect is identifiable (see Causal Identification), this step combines the graph and observed data to compute quantities like the average treatment effect or an interventional distribution.
Note
Prerequisite: This guide assumes a causal effect has been identified. See Causal Identification to determine identifiability first.
Tip
When to use this vs. Probabilistic Inference: Use Causal Estimation when you need interventional answers (do-calculus, “what if I set X to x?”). Use Probabilistic Inference for observational queries (conditioning on evidence, “what is P(Y | X=x)?”).
At a Glance#
Unified API: Two entry points: CausalInference for graph-based queries, and pgmpy.prediction for sklearn-style regression.
Interventional Queries: Compute distributions under do-interventions on fitted discrete models.
Average Treatment Effect: Estimate ATE from data using identified adjustment sets.
Regression-Based Estimators: Sklearn-compatible regressors for adjustment, instrumental variable, and double ML workflows.
API#
pgmpy provides two consistent entry points:
Graph-based queries using CausalInference:
import numpy as np
import pandas as pd
from pgmpy.base import DAG
from pgmpy.inference import CausalInference
rng = np.random.default_rng(42)
data = pd.DataFrame(
    {
        "X": rng.normal(size=500),
        "Z": rng.normal(size=500),
        "Y": rng.normal(size=500),
    }
)
graph = DAG([("X", "Y"), ("Z", "X"), ("Z", "Y")])
ci = CausalInference(graph)
ate = ci.estimate_ate("X", "Y", data=data, estimator_type="linear")
print(round(float(ate), 4))
Sklearn-style regression using pgmpy.prediction estimators, which read causal-graph
role annotations (exposures, outcomes, adjustment sets) to drive the estimation.
Interventional Queries#
For fitted discrete models, CausalInference.query(..., do=..., evidence=...) computes
interventional distributions such as P(Y | do(X), Z), letting you answer “what if”
questions directly from the model:
from pgmpy.example_models import load_model
from pgmpy.inference import CausalInference
model = load_model("bnlearn/alarm")
ci = CausalInference(model)
result = ci.query(
    variables=["HISTORY"],
    do={"CVP": "LOW"},
    evidence={"PCWP": "LOW"},
    show_progress=False,
)
print(result)
Average Treatment Effect#
CausalInference.estimate_ate(...) estimates the average treatment effect of an exposure
on an outcome, automatically using identified adjustment sets from the causal graph.
Regression-Based Estimators#
pgmpy provides sklearn-compatible regressors that read role annotations from the causal graph to determine which variables serve as exposures, outcomes, and adjustment variables. These include adjustment-based regression, instrumental variable regression, and cross-fitted double machine learning for flexible nuisance models:
import numpy as np
import pandas as pd
from pgmpy.base import DAG
from pgmpy.prediction import NaiveAdjustmentRegressor
rng = np.random.default_rng(42)
Z = rng.normal(size=500)
X = 0.5 * Z + rng.normal(size=500)
Y = 0.8 * X + 0.3 * Z + rng.normal(size=500)
data = pd.DataFrame({"X": X, "Z": Z, "Y": Y})
graph = DAG(
    [("X", "Y"), ("Z", "X"), ("Z", "Y")],
    roles={"exposures": "X", "outcomes": "Y", "adjustment": "Z"},
)
reg = NaiveAdjustmentRegressor(causal_graph=graph)
reg.fit(data[["X", "Z"]], data["Y"])
print(reg.predict(data[["X", "Z"]])[:5])
See Also#
Causal Identification — Verify identifiability before estimation.
API Reference#
For the full list of estimation methods, see the pgmpy API reference.