GCM
- class pgmpy.ci_tests.GCM(data: DataFrame, estimator=None)
Bases: _BaseCITest
Generalized Covariance Measure (GCM) [1] test for conditional independence.
Using the estimator, regress \(X\) and \(Y\) separately on \([1, Z]\), let \(r_X\) and \(r_Y\) denote the resulting residuals, and define \(U_i = r_{X, i} r_{Y, i}\). The resulting test statistic is
\[T = \frac{1}{\sqrt{n}} \frac{\sum_{i=1}^n U_i}{\operatorname{std}(U_1, \ldots, U_n)},\]
where \(n\) is the sample size. Under the null hypothesis \(X \perp Y \mid Z\), this statistic is asymptotically standard normal.
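The statistic above can be sketched in plain NumPy. This is an illustrative sketch, not pgmpy's implementation: the helper name `gcm_statistic` is hypothetical, ordinary least squares stands in for the estimator, the standard deviation is assumed to be the population form (`ddof=0`), and the two-sided p-value is derived from the asymptotic standard normal distribution.

```python
import math

import numpy as np


def gcm_statistic(x, y, z):
    """Return (T, two-sided p-value) using OLS residuals (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    z = np.asarray(z, dtype=float).reshape(n, -1)

    # Design matrix [1, Z]: an intercept column followed by the conditioning set.
    design = np.column_stack([np.ones(n), z])

    # Residuals r_X and r_Y from least-squares fits of X and Y on [1, Z].
    beta_x, *_ = np.linalg.lstsq(design, x, rcond=None)
    beta_y, *_ = np.linalg.lstsq(design, y, rcond=None)
    r_x = x - design @ beta_x
    r_y = y - design @ beta_y

    # U_i = r_{X,i} * r_{Y,i};  T = (1 / sqrt(n)) * sum(U) / std(U).
    u = r_x * r_y
    t = u.sum() / (math.sqrt(n) * u.std())  # assumption: ddof=0

    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|t|)) = erfc(|t| / sqrt(2)).
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p


rng = np.random.default_rng(0)
z = rng.normal(size=500)
x = z + rng.normal(size=500)

# X and Y depend only on Z, so the null X ⟂ Y | Z holds.
y_null = z + rng.normal(size=500)
t_null, p_null = gcm_statistic(x, y_null, z)

# Y depends on X directly, so the null is violated and |T| should be large.
y_dep = x + rng.normal(size=500)
t_dep, p_dep = gcm_statistic(x, y_dep, z)
```

In the second case the residual product has a clearly nonzero mean, so `t_dep` grows on the order of \(\sqrt{n}\) and `p_dep` is close to zero.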
- Parameters:
- data: pandas.DataFrame
The dataset in which to test the independence condition.
- estimator: optional (default=None)
Any regressor with fit and predict methods, used to compute the residuals. If None, LinearRegression() is used.
- Attributes:
- statistic_: float
The GCM test statistic. Set after calling the test.
- p_value_: float
The p-value for the test. Set after calling the test.
References
[1] Rajen D. Shah and Jonas Peters. “The Hardness of Conditional Independence Testing and the Generalised Covariance Measure”.