PearsonrEquivalence
- class pgmpy.ci_tests.PearsonrEquivalence(data: DataFrame, delta_threshold: float = 0.1)
Bases: Pearsonr
Pearson equivalence test [1] for conditional independence on continuous data.
This test first computes the partial correlation coefficient \(\hat{\rho}_{XY \mid Z}\) using Pearsonr. Let \(\delta\) denote delta_threshold. The Fisher transforms of the estimate and of the bound are
\[z_\rho = \operatorname{arctanh}(\hat{\rho}_{XY \mid Z}), \qquad z_\delta = \operatorname{arctanh}(\delta),\]
and the scaling factor is
\[c = \sqrt{n - |Z| - 3},\]
where \(n\) is the sample size and \(|Z|\) is the number of conditioning variables.
The test then performs a TOST (two one-sided tests) procedure for the equivalence hypothesis
\[H_0: \rho_{XY \mid Z} \leq -\delta \;\; \text{or} \;\; \rho_{XY \mid Z} \geq \delta \qquad \text{vs.} \qquad H_1: -\delta < \rho_{XY \mid Z} < \delta.\]
The two one-sided test statistics are:
\[T_{\mathrm{lower}} = c (z_\rho + z_\delta), \qquad T_{\mathrm{upper}} = c (z_\rho - z_\delta),\]
with corresponding p-values:
\[p_{\mathrm{lower}} = 1 - \Phi(T_{\mathrm{lower}}), \qquad p_{\mathrm{upper}} = \Phi(T_{\mathrm{upper}}),\]
where \(\Phi\) is the standard normal CDF. The reported p-value is:
\[p = \max(p_{\mathrm{lower}}, p_{\mathrm{upper}}).\]
- Parameters:
- data : pandas.DataFrame
The dataset in which to test the independence condition.
- delta_threshold : float
The equivalence bound \(\delta\) (default 0.1). A partial correlation with \(|\rho_{XY \mid Z}| < \delta\) is treated as practical independence.
- Attributes:
- statistic_ : float
Fisher z-transformed correlation coefficient \(z_\rho\). Set after calling the test.
- p_value_ : float
The p-value from the TOST procedure. Independence is concluded when
p_value_ < significance_level (opposite of standard CI tests). Set after calling the test.
References
[1] Malinsky, Daniel. “A cautious approach to constraint-based causal model selection.” arXiv preprint arXiv:2404.18232 (2024).
- is_independent(X: str, Y: str, Z: list | tuple = (), significance_level: float = 0.05) -> bool
Perform the equivalence CI test.
Note: Independence is concluded when p_value_ < significance_level (rejecting the null of dependence), which is the OPPOSITE of standard CI tests.
- Returns:
- bool
True if X ⊥⊥ Y | Z (p_value_ < significance_level), else False.
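The TOST computation described in the class documentation can be sketched in plain Python. This is a minimal illustration under stated assumptions, not pgmpy's implementation: the function name `tost_equivalence_p_value` and its arguments are hypothetical, and the partial correlation estimate `r_hat` is assumed to have been computed already (for example by Pearsonr).

```python
import math
from statistics import NormalDist


def tost_equivalence_p_value(r_hat, n, num_cond, delta=0.1):
    """Return (z_rho, p) for the TOST equivalence test on a partial correlation.

    Hypothetical helper for illustration only (not part of pgmpy).
    r_hat    : estimated partial correlation of X and Y given Z
    n        : sample size
    num_cond : number of conditioning variables, |Z|
    delta    : equivalence bound (plays the role of delta_threshold)
    """
    dof = n - num_cond - 3
    if dof <= 0:
        raise ValueError("sample size must exceed |Z| + 3")
    if not (-1.0 < r_hat < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("r_hat must lie in (-1, 1) and delta in (0, 1)")

    # Fisher transforms of the estimate and of the bound.
    z_rho = math.atanh(r_hat)
    z_delta = math.atanh(delta)
    c = math.sqrt(dof)

    # Two one-sided statistics and their p-values.
    phi = NormalDist().cdf  # standard normal CDF
    t_lower = c * (z_rho + z_delta)
    t_upper = c * (z_rho - z_delta)
    p_lower = 1.0 - phi(t_lower)
    p_upper = phi(t_upper)

    # The reported p-value is the larger of the two.
    return z_rho, max(p_lower, p_upper)


# Independence is concluded when p < significance_level.
significance_level = 0.05
_, p_near_zero = tost_equivalence_p_value(0.0, n=1003, num_cond=0)
_, p_correlated = tost_equivalence_p_value(0.3, n=1003, num_cond=0)
print(p_near_zero < significance_level)   # estimate well inside (-delta, delta)
print(p_correlated < significance_level)  # estimate outside the bound
```

Note that the decision rule is reversed relative to a standard CI test: a small p-value rejects the null of dependence, so a small p-value here supports independence rather than refuting it.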