ExpertKnowledge

class pgmpy.causal_discovery.ExpertKnowledge(forbidden_edges=None, required_edges=None, temporal_order=None, search_space=None, **kwargs)

Bases: object

Class to specify expert knowledge for causal discovery / structure learning algorithms.

Expert knowledge is the prior knowledge about edges in the final structure of the graph learned by causal discovery algorithms. Users can provide information about edges that have to be present/absent in the final learned graph and the temporal / causal ordering of the variables.

Parameters:
forbidden_edges: iterable (default: None)

The set of directed edges that are to be absent in the final graph structure. Refer to the algorithm documentation for details on how the argument is handled.

required_edges: iterable (default: None)

The set of directed edges that are to be present in the final graph structure. Refer to the algorithm documentation for details on how the argument is handled.

search_space: iterable (default: None)

The set of directed edges that form the search space for the structure learning algorithm (a white list of all possible edges). Refer to the algorithm documentation for details on how the argument is handled.

temporal_order: iterator (default: None)

The temporal ordering of variables according to prior knowledge. Each inner list of this two-dimensional structure contains variables at the same temporal level; earlier (parental) variables are placed at the start, and variables occur later in time as we move towards the end of the structure.
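
As a rough illustration of how a tiered temporal order can be read, the sketch below lists the directed edges that point from a later tier back to an earlier one. It assumes that such edges are the ones a temporal order rules out; the helper name `edges_against_temporal_order` is hypothetical and is not part of pgmpy, and the way each algorithm actually consumes `temporal_order` may differ.

```python
from itertools import product


def edges_against_temporal_order(temporal_order):
    """Return directed edges (u, v) where u is in a later tier than v.

    Hypothetical helper for illustration only; not a pgmpy function.
    """
    tiers = [list(tier) for tier in temporal_order]
    ruled_out = set()
    for later_idx in range(1, len(tiers)):
        for earlier_idx in range(later_idx):
            # Every edge from a later tier into an earlier tier runs
            # against the stated ordering.
            ruled_out.update(product(tiers[later_idx], tiers[earlier_idx]))
    return ruled_out


temporal_order = [["Pollution", "Smoker"], ["Cancer"], ["Dyspnoea", "Xray"]]
ruled_out = edges_against_temporal_order(temporal_order)

# Edges within a tier, or from an earlier tier to a later one, are untouched.
print(sorted(ruled_out))
```

Under this reading, `("Cancer", "Smoker")` would be ruled out while `("Smoker", "Cancer")` would remain available to the learning algorithm.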

Examples

Import example models from pgmpy.example_models

>>> from pgmpy.example_models import load_model
>>> from pgmpy.estimators import ExpertKnowledge, PC
>>> from pgmpy.sampling import BayesianModelSampling
>>> asia_model = load_model("bnlearn/asia")
>>> cancer_model = load_model("bnlearn/cancer")

Required and forbidden edges

>>> forb_edges = [("tub", "asia"), ("lung", "smoke")]
>>> req_edges = [("smoke", "bronc")]
>>> expert_knowledge = ExpertKnowledge(
...     required_edges=req_edges, forbidden_edges=forb_edges
... )

Use during structure learning

>>> data = BayesianModelSampling(asia_model).forward_sample(size=int(1e4))
>>> est = PC(data)
>>> est.estimate(
...     variant="stable",
...     expert_knowledge=expert_knowledge,
...     show_progress=False,
... )
<pgmpy.base.DAG.PDAG object at 0x...>

Temporal order

>>> expert_knowledge = ExpertKnowledge(
...     temporal_order=[["Pollution", "Smoker"], ["Cancer"], ["Dyspnoea", "Xray"]]
... )

Use during structure learning

>>> data = cancer_model.simulate(n_samples=int(1e4))
>>> est = PC(data)
>>> est.estimate(
...     variant="stable",
...     expert_knowledge=expert_knowledge,
...     show_progress=False,
... )
<pgmpy.base.DAG.PDAG object at 0x...>
apply_expert_knowledge(pdag)

Method to check consistency and orient edges in a graph based on expert knowledge.

The required and forbidden edges, if specified by the user, are oriented accordingly in the graph object that is passed in. The temporal order, if specified, is also taken into account. If a required or forbidden edge conflicts with the graph structure, that edge is ignored and a warning is raised.

Parameters:
pdag: pgmpy.base.PDAG

A partial DAG with directed and undirected edges.

Returns:
Model after edge orientation: pgmpy.base.DAG

The partial DAG after accounting for specified required and forbidden edges.
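
The orientation step described above can be pictured with a small plain-Python sketch. It is not the pgmpy implementation: the function name `orient_with_knowledge` and the representation of the partial DAG as two edge sets are assumptions made for illustration, and the sketch leaves out temporal order and any cycle checks. It also assumes that a required edge with no corresponding edge in the graph counts as a conflict.

```python
import warnings


def orient_with_knowledge(directed, undirected, required=(), forbidden=()):
    """Orient undirected edges using required/forbidden directed edges.

    Hypothetical helper for illustration only; not a pgmpy function.
    `directed` holds (u, v) tuples, `undirected` holds unordered pairs.
    """
    directed = set(directed)
    undirected = {frozenset(edge) for edge in undirected}

    for u, v in required:
        pair = frozenset((u, v))
        if pair in undirected:
            # An undirected edge can simply be oriented as required.
            undirected.discard(pair)
            directed.add((u, v))
        elif (u, v) not in directed:
            # Assumption: anything else is treated as a conflict.
            warnings.warn(f"Ignoring required edge {(u, v)}: conflicts with the graph.")

    for u, v in forbidden:
        pair = frozenset((u, v))
        if pair in undirected:
            # Forbidding u -> v on an undirected edge leaves v -> u.
            undirected.discard(pair)
            directed.add((v, u))
        elif (u, v) in directed:
            warnings.warn(f"Ignoring forbidden edge {(u, v)}: conflicts with the graph.")

    return directed, undirected


new_directed, new_undirected = orient_with_knowledge(
    directed={("smoke", "lung")},
    undirected=[("smoke", "bronc"), ("tub", "asia")],
    required=[("smoke", "bronc")],
    forbidden=[("tub", "asia"), ("smoke", "lung")],
)
print(sorted(new_directed))
```

In this toy input the forbidden edge `("smoke", "lung")` is already directed in the graph, so the sketch leaves it in place and emits a warning, mirroring the "ignored with a warning" behaviour described for conflicts.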

References

[1] https://doi.org/10.48550/arXiv.2306.01638

limit_search_space(data_coulumn_labels)

Forms an additive set of forbidden edges by subtracting the search space from the set of all possible edges.

Parameters:
data_coulumn_labels: set | list | pd.DataFrame.columns

Set of edges to be used for structure learning. If None, all possible edges are used.

Returns:
forbidden_edges_additive: set

Set of edges that are not allowed in the structure.
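
The subtraction described for this method can be sketched in plain Python. This is a minimal illustration, not the pgmpy implementation: the helper name `forbidden_from_search_space` is hypothetical, and it assumes that "all possible edges" means every ordered pair of distinct column labels.

```python
from itertools import permutations


def forbidden_from_search_space(column_labels, search_space):
    """Return every directed edge over the labels that is not white-listed.

    Hypothetical helper for illustration only; not a pgmpy function.
    """
    # Assumption: all possible edges are all ordered pairs of distinct labels.
    all_edges = set(permutations(column_labels, 2))
    return all_edges - set(search_space)


columns = ["asia", "tub", "smoke", "lung"]
search_space = [("asia", "tub"), ("smoke", "lung")]
forbidden = forbidden_from_search_space(columns, search_space)

# Four labels give 12 ordered pairs; removing the two white-listed
# edges leaves the additional forbidden edges.
print(len(forbidden))
```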