LogLikelihood

class pgmpy.ci_tests.LogLikelihood(data: DataFrame)

Bases: PowerDivergence

Log-likelihood ratio test for conditional independence on discrete data.

This class is a thin specialization of PowerDivergence with lambda_="log-likelihood". In this implementation it is equivalent to GSq. For the contingency-table construction, conditional-case aggregation, and p-value computation, see PowerDivergence.
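As a rough illustration of the statistic itself (not of pgmpy's internals), the unconditional G-test on a two-way contingency table can be sketched with SciPy, whose `chi2_contingency` accepts the same `lambda_="log-likelihood"` setting. The column names, the random data, and the choice of `correction=False` are assumptions made for this sketch; the manual formula G = 2 * sum(O * ln(O / E)) is included only as a cross-check.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical data: two independent binary columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 2, size=(5000, 2)), columns=["A", "C"])

# Contingency table of observed counts for the pair (A, C).
table = pd.crosstab(df["A"], df["C"])

# lambda_="log-likelihood" selects the G-test member of the
# power-divergence family; correction=False disables Yates' correction.
g_stat, p_value, dof, expected = chi2_contingency(
    table, lambda_="log-likelihood", correction=False
)

# Cross-check against the textbook formula, skipping empty cells
# because 0 * ln(0) is taken to be 0.
observed = table.to_numpy().astype(float)
mask = observed > 0
manual_g = 2.0 * np.sum(observed[mask] * np.log(observed[mask] / expected[mask]))

print(g_stat, manual_g, p_value, dof)
```

For a 2x2 table the degrees of freedom are (2 - 1) * (2 - 1) = 1, and the two ways of computing the statistic should agree up to floating-point error.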

Parameters:

data (pandas.DataFrame): The dataset on which to test the independence condition.

Attributes:

statistic_ (float): The log-likelihood ratio (G-squared) test statistic. Set after calling the test.
p_value_ (float): The p-value for the test. Set after calling the test.
dof_ (int): Degrees of freedom for the test. Set after calling the test.

References

[1] https://en.wikipedia.org/wiki/G-test

Examples

>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.ci_tests import LogLikelihood
>>> np.random.seed(42)
>>> data = pd.DataFrame(
...     data=np.random.randint(low=0, high=2, size=(50000, 4)), columns=list("ABCD")
... )
>>> data["E"] = data["A"] + data["B"] + data["C"]
>>> test = LogLikelihood(data=data)
>>> test(X="A", Y="C", Z=[], significance_level=0.05)
np.True_
>>> round(test.statistic_, 2)
np.float64(0.03)
>>> round(test.p_value_, 2)
np.float64(0.86)
>>> test.dof_
1
>>> test(X="A", Y="B", Z=["D"], significance_level=0.05)
np.True_
>>> test(X="A", Y="B", Z=["D", "E"], significance_level=0.05)
np.False_
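The conditional calls above can be pictured with a stratified G-test: compute the statistic within each configuration of Z, then sum the statistics and the degrees of freedom. The sketch below is one common way to do this and is an assumption, not a description of how PowerDivergence aggregates; the helper name `stratified_g_test`, the rule of skipping degenerate strata, and the reading of the boolean result as `p_value >= significance_level` are all hypothetical choices made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2, chi2_contingency


def stratified_g_test(df, x, y, z):
    """Sum per-stratum G statistics and degrees of freedom (illustrative only)."""
    total_stat, total_dof = 0.0, 0
    for _, stratum in df.groupby(z):
        table = pd.crosstab(stratum[x], stratum[y])
        # A stratum where x or y takes a single value carries no information.
        if table.shape[0] < 2 or table.shape[1] < 2:
            continue
        stat, _, dof, _ = chi2_contingency(
            table, lambda_="log-likelihood", correction=False
        )
        total_stat += stat
        total_dof += dof
    # With no usable strata there is no evidence against independence.
    p_value = chi2.sf(total_stat, total_dof) if total_dof > 0 else 1.0
    return total_stat, p_value, total_dof


# Hypothetical data: A and B are independent, but E = A + B is a collider,
# so conditioning on E makes A and B dependent.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 2, size=(5000, 2)), columns=["A", "B"])
df["E"] = df["A"] + df["B"]

stat, p_value, dof = stratified_g_test(df, "A", "B", ["E"])
independent = p_value >= 0.05  # assumed decision rule
print(stat, p_value, dof, independent)
```

Only the stratum E = 1 contributes here, because for E = 0 and E = 2 both A and B are constant. Within E = 1, A fully determines B, so the summed statistic is large and the test rejects independence.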