Base Structure Classes¶

Directed Acyclic Graph (DAG)¶

class pgmpy.base.DAG.DAG(ebunch: Iterable[tuple[Hashable, Hashable]] | None = None, latents: set[Hashable] = {})[source]¶

Base class for all Directed Graphical Models.

Each node in the graph can represent either a random variable, Factor, or a cluster of random variables. Edges in the graph represent the dependencies between these.

Parameters:: data (input graph) – Data to initialize graph. If data=None (default) an empty graph is created. The data can be an edge list or any Networkx graph object.

Examples

Create an empty DAG with no nodes and no edges

>>> from pgmpy.base import DAG
>>> G = DAG()

G can be grown in several ways:

Nodes:

Add one node at a time:

>>> G.add_node(node="a")

Add the nodes from any container (a list, set or tuple or the nodes from another graph).

>>> G.add_nodes_from(nodes=["a", "b"])

Edges:

G can also be grown by adding edges.

Add one edge,

>>> G.add_edge(u="a", v="b")

a list of edges,

>>> G.add_edges_from(ebunch=[("a", "b"), ("b", "c")])

If some edges connect nodes not yet in the model, the nodes are added automatically. There are no errors when adding nodes or edges that already exist.

Shortcuts:

Many common graph queries can be written with plain Python syntax:

>>> "a" in G  # check if node in graph
True
>>> len(G)  # number of nodes in graph
3

active_trail_nodes(variables: list[Hashable] | Hashable, observed: Hashable | list[Hashable] | tuple[Hashable, Hashable] | None = None, include_latents=False) → dict[Hashable, set[Hashable]][source]¶

Returns a dictionary with the given variables as keys and, as values, the set of nodes reachable from each variable via an active trail.

Parameters:

variables (str or array like) – variables whose active trails are to be found.
observed (List of nodes (optional)) – If given the active trails would be computed assuming these nodes to be observed.
include_latents (boolean (default: False)) – Whether to include the latent variables in the returned active trail nodes.

Examples

>>> from pgmpy.base import DAG
>>> student = DAG()
>>> student.add_nodes_from(["diff", "intel", "grades"])
>>> student.add_edges_from([("diff", "grades"), ("intel", "grades")])
>>> student.active_trail_nodes("diff")
{'diff': {'diff', 'grades'}}
>>> student.active_trail_nodes(["diff", "intel"], observed="grades")
{'diff': {'diff', 'intel'}, 'intel': {'diff', 'intel'}}

References

Details of the algorithm can be found in ‘Probabilistic Graphical Models: Principles and Techniques’ - Koller and Friedman, Page 75, Algorithm 3.1
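
The reachability idea behind active trails can be sketched in plain Python. This is a minimal illustration in the spirit of the algorithm cited above, not pgmpy's implementation; the function name `reachable_by_active_trail` and the edge-list input are assumptions made for the example.

```python
from collections import defaultdict


def reachable_by_active_trail(edges, start, observed=()):
    """Nodes reachable from `start` via an active trail given `observed`."""
    parents, children = defaultdict(set), defaultdict(set)
    for u, v in edges:
        parents[v].add(u)
        children[u].add(v)
    observed = set(observed)

    # Phase 1: collect the observed nodes together with all their ancestors.
    ancestors, stack = set(), list(observed)
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parents[node])

    # Phase 2: walk (node, direction) pairs. "up" means the trail reached the
    # node from one of its children, "down" means it came from a parent.
    to_visit, visited, reachable = [(start, "up")], set(), set()
    while to_visit:
        node, direction = to_visit.pop()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node not in observed:
            reachable.add(node)
        if direction == "up" and node not in observed:
            to_visit.extend((p, "up") for p in parents[node])
            to_visit.extend((c, "down") for c in children[node])
        elif direction == "down":
            if node not in observed:
                # Unobserved node: the trail continues to its children.
                to_visit.extend((c, "down") for c in children[node])
            if node in ancestors:
                # Collider that is observed or has an observed descendant.
                to_visit.extend((p, "up") for p in parents[node])
    return reachable


edges = [("diff", "grades"), ("intel", "grades")]
print(sorted(reachable_by_active_trail(edges, "diff")))  # ['diff', 'grades']
print(sorted(reachable_by_active_trail(edges, "diff", observed=["grades"])))  # ['diff', 'intel']
```

Observing the collider `grades` opens the trail between `diff` and `intel`, which matches the example output above.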

add_edge(u: Hashable, v: Hashable, weight: int | float | None = None)[source]¶

Add an edge between u and v.

The nodes u and v will be automatically added if they are not already in the graph.

Parameters:

u (nodes) – Nodes can be any hashable Python object.
v (nodes) – Nodes can be any hashable Python object.
weight (int, float (default=None)) – The weight of the edge

Examples

>>> from pgmpy.base import DAG
>>> G = DAG()
>>> G.add_nodes_from(nodes=["Alice", "Bob", "Charles"])
>>> G.add_edge(u="Alice", v="Bob")
>>> G.nodes()
NodeView(('Alice', 'Bob', 'Charles'))
>>> G.edges()
OutEdgeView([('Alice', 'Bob')])

When the node is not already present in the graph:

>>> G.add_edge(u="Alice", v="Ankur")
>>> G.nodes()
NodeView(('Alice', 'Bob', 'Charles', 'Ankur'))
>>> G.edges()
OutEdgeView([('Alice', 'Bob'), ('Alice', 'Ankur')])

Adding edges with weight:

>>> G.add_edge("Ankur", "Maria", weight=0.1)
>>> G["Ankur"]["Maria"]
{'weight': 0.1}

add_edges_from(ebunch: Iterable[tuple[Hashable, Hashable]], weights: list[float] | tuple[float] | None = None)[source]¶

Add all the edges in ebunch.

If nodes referred in the ebunch are not already present, they will be automatically added. Node names can be any hashable python object.

Note: The behavior of adding weights is different from networkx.

Parameters:

ebunch (container of edges) – Each edge given in the container will be added to the graph. The edges must be given as 2-tuples (u, v).
weights (list, tuple (default=None)) – A container of weights (int, float). The weight value at index i is associated with the edge at index i.

Examples

>>> from pgmpy.base import DAG
>>> G = DAG()
>>> G.add_nodes_from(nodes=["Alice", "Bob", "Charles"])
>>> G.add_edges_from(ebunch=[("Alice", "Bob"), ("Bob", "Charles")])
>>> G.nodes()
NodeView(('Alice', 'Bob', 'Charles'))
>>> G.edges()
OutEdgeView([('Alice', 'Bob'), ('Bob', 'Charles')])

When the node is not already in the model:

>>> G.add_edges_from(ebunch=[("Alice", "Ankur")])
>>> G.nodes()
NodeView(('Alice', 'Bob', 'Charles', 'Ankur'))
>>> G.edges()
OutEdgeView([('Alice', 'Bob'), ('Bob', 'Charles'), ('Alice', 'Ankur')])

Adding edges with weights:

>>> G.add_edges_from(
...     [("Ankur", "Maria"), ("Maria", "Mason")], weights=[0.3, 0.5]
... )
>>> G["Ankur"]["Maria"]
{'weight': 0.3}
>>> G["Maria"]["Mason"]
{'weight': 0.5}

or

>>> G.add_edges_from([("Ankur", "Maria", 0.3), ("Maria", "Mason", 0.5)])

add_node(node: Hashable, weight: float | None = None, latent: bool = False)[source]¶

Adds a single node to the Graph.

Parameters:

node (str, int, or any hashable python object.) – The node to add to the graph.
weight (int, float) – The weight of the node.
latent (boolean (default: False)) – Specifies whether the variable is latent or not.

Examples

>>> from pgmpy.base import DAG
>>> G = DAG()
>>> G.add_node(node="A")
>>> sorted(G.nodes())
['A']

Adding a node with some weight.

>>> G.add_node(node="B", weight=0.3)

The weight of these nodes can be accessed as:

>>> G.nodes["B"]
{'weight': 0.3}
>>> G.nodes["A"]
{'weight': None}

add_nodes_from(nodes: Iterable[Hashable], weights: list[float] | tuple[float] | None = None, latent: Sequence[bool] | bool = False)[source]¶

Add multiple nodes to the Graph.

Note: The behaviour of adding weights is different from networkx.

Parameters:

nodes (iterable container) – A container (list, dict, set) of nodes (str, int or any hashable python object).
weights (list, tuple (default=None)) – A container of weights (int, float). The weight value at index i is associated with the variable at index i.
latent (bool, list, tuple (default=False)) – A single boolean applied to all nodes, or a container of booleans where the value at index i tells whether the node at index i is latent.

Examples

>>> from pgmpy.base import DAG
>>> G = DAG()
>>> G.add_nodes_from(nodes=["A", "B", "C"])
>>> G.nodes()
NodeView(('A', 'B', 'C'))

Adding nodes with weights:

>>> G.add_nodes_from(nodes=["D", "E"], weights=[0.3, 0.6])
>>> G.nodes["D"]
{'weight': 0.3}
>>> G.nodes["E"]
{'weight': 0.6}
>>> G.nodes["A"]
{'weight': None}

copy()[source]¶

Returns a copy of the graph.

The copy method by default returns an independent shallow copy of the graph and attributes. That is, if an attribute is a container, that container is shared by the original and the copy. Use Python’s copy.deepcopy for new containers.

If as_view is True then a view is returned instead of a copy.

Notes

All copies reproduce the graph structure, but data attributes may be handled in different ways. There are four types of copies of a graph that people might want.

Deepcopy – A “deepcopy” copies the graph structure as well as all data attributes and any objects they might contain. The entire graph object is new so that changes in the copy do not affect the original object. (see Python’s copy.deepcopy)

Data Reference (Shallow) – For a shallow copy the graph structure is copied but the edge, node and graph attribute dicts are references to those in the original graph. This saves time and memory but could cause confusion if you change an attribute in one graph and it changes the attribute in the other. NetworkX does not provide this level of shallow copy.

Independent Shallow – This copy creates new independent attribute dicts and then does a shallow copy of the attributes. That is, any attributes that are containers are shared between the new graph and the original. This is exactly what dict.copy() provides. You can obtain this style copy using:

>>> import networkx as nx
>>> G = nx.path_graph(5)
>>> H = G.copy()
>>> H = G.copy(as_view=False)
>>> H = nx.Graph(G)
>>> H = G.__class__(G)

Fresh Data – For fresh data, the graph structure is copied while new empty data attribute dicts are created. The resulting graph is independent of the original and it has no edge, node or graph attributes. Fresh copies are not enabled. Instead use:

>>> H = G.__class__()
>>> H.add_nodes_from(G)
>>> H.add_edges_from(G.edges)

View – Inspired by dict-views, graph-views act like read-only versions of the original graph, providing a copy of the original structure without requiring any memory for copying the information.

See the Python copy module for more information on shallow and deep copies, https://docs.python.org/3/library/copy.html.

Parameters:: as_view (bool, optional (default=False)) – If True, the returned graph-view provides a read-only view of the original graph without actually copying any data.
Returns:: G – A copy of the graph.
Return type:: Graph

See also

to_directed: return a directed copy of the graph.

Examples

>>> import networkx as nx
>>> G = nx.path_graph(4)  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> H = G.copy()

do(nodes: Hashable | Iterable[Hashable] | tuple[Hashable, Hashable], inplace=False)[source]¶

Applies the do operator to the graph and returns a new DAG with the transformed graph.

The do-operator, do(X = x) has the effect of removing all edges from the parents of X and setting X to the given value x.

Parameters:

nodes (list, array-like) – The names of the nodes to apply the do-operator for.
inplace (boolean (default: False)) – If inplace=True, makes the changes to the current object, otherwise returns a new instance.

Returns:

Modified DAG – A new instance of DAG modified by the do-operator

Return type:

pgmpy.base.DAG

Examples

Initialize a DAG

>>> from pgmpy.base import DAG
>>> graph = DAG()
>>> graph.add_edges_from([("X", "A"), ("A", "Y"), ("A", "B")])
>>> # Applying the do-operator will return a new DAG with the desired structure.
>>> graph_do_A = graph.do("A")
>>> # Which we can verify is missing the edges we would expect.
>>> graph_do_A.edges
OutEdgeView([('A', 'B'), ('A', 'Y')])

References

Causality: Models, Reasoning, and Inference, Judea Pearl (2000). p.70.
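
The structural effect of the do-operator described above can be sketched on a plain edge list. This is an illustration only; the helper name `do_on_edges` is hypothetical and it does not model setting the value of X.

```python
def do_on_edges(edges, targets):
    """Return the edges that remain after intervening on `targets`."""
    targets = set(targets)
    # Drop every edge that points into an intervened node; keep the rest.
    return [(u, v) for u, v in edges if v not in targets]


edges = [("X", "A"), ("A", "Y"), ("A", "B")]
print(do_on_edges(edges, ["A"]))  # [('A', 'Y'), ('A', 'B')]
```

Only the incoming edge `X -> A` disappears; the outgoing edges of `A` are untouched.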

edge_strength(data, edges=None)[source]¶

Computes the strength of each edge in edges. The strength is bounded between 0 and 1, with 1 signifying strong effect.

The edge strength is defined as the effect size measure of a Conditional Independence test using the parents as the conditional set. The strength quantifies the effect of edge[0] on edge[1] after controlling for any other influence paths. We use a residualization-based CI test[1] to compute the strengths.

Interpretation:

- The strength is the Pillai’s Trace effect size of partial correlation.
- Measures the strength of linear relationship between the residuals.
- Works for any mixture of categorical and continuous variables.
- The value is bounded between 0 and 1:
  - Strength close to 1 → strong dependence.
  - Strength close to 0 → conditional independence.

Parameters:

data (pandas.DataFrame) – Dataset to compute edge strengths on.
edges (tuple, list, or None (default: None)) –
- None: Compute for all DAG edges.
- Tuple (X, Y): Compute for edge X → Y.
- List of tuples: Compute for selected edges.

Returns:

Dictionary mapping edges to their strength values.

Return type:

dict

Examples

>>> from pgmpy.base import DAG
>>> from pgmpy.factors.continuous import LinearGaussianCPD
>>> from pgmpy.models import LinearGaussianBayesianNetwork as LGBN
>>> # Create a linear Gaussian Bayesian network
>>> linear_model = LGBN([("X", "Y"), ("Z", "Y")])
>>> # Create CPDs with specific beta values
>>> x_cpd = LinearGaussianCPD(variable="X", beta=[0], std=1)
>>> y_cpd = LinearGaussianCPD(
...     variable="Y", beta=[0, 0.4, 0.6], std=1, evidence=["X", "Z"]
... )
>>> z_cpd = LinearGaussianCPD(variable="Z", beta=[0], std=1)
>>> # Add CPDs to the model
>>> linear_model.add_cpds(x_cpd, y_cpd, z_cpd)
>>> # Simulate data from the model
>>> data = linear_model.simulate(n_samples=int(1e4))
>>> # Create DAG and compute edge strengths
>>> dag = DAG([("X", "Y"), ("Z", "Y")])
>>> strengths = dag.edge_strength(data)
>>> strengths
{('X', 'Y'): np.float64(0.14587166611282304),
 ('Z', 'Y'): np.float64(0.25683780900125613)}

References

[1] Ankan, Ankur, and Johannes Textor. “A simple unified approach to testing high-dimensional conditional independences for categorical and ordinal data.” Proceedings of the AAAI Conference on Artificial Intelligence.

fit(data, estimator=None, state_names=[], n_jobs=1, **kwargs) → DAG[source]¶

Estimates the CPD for each variable based on a given data set.

Parameters:

data (pandas DataFrame object) – DataFrame object with column names identical to the variable names of the network. (If some values in the data are missing, the data cells should be set to numpy.nan. Note that pandas converts each column containing numpy.nan values to dtype float.)
estimator (Estimator class) – One of:
- MaximumLikelihoodEstimator (default)
- BayesianEstimator: In this case, pass ‘prior_type’ and either ‘pseudo_counts’ or ‘equivalent_sample_size’ as additional keyword arguments. See BayesianEstimator.get_parameters() for usage.
- ExpectationMaximization
state_names (dict (optional)) – A dict indicating, for each variable, the discrete set of states that the variable can take. If unspecified, the observed values in the data set are taken to be the only possible states.
n_jobs (int (default: 1)) – Number of threads/processes to use for estimation. Using n_jobs > 1 for small models or datasets might be slower.

Returns:

Fitted Model – Returns a DiscreteBayesianNetwork object with learned CPDs. The DAG structure is preserved, and parameters (CPDs) are added. This allows the DAG to represent both the structure and the parameters of a Bayesian Network.

Return type:

DiscreteBayesianNetwork

Examples

>>> import pandas as pd
>>> from pgmpy.models import DiscreteBayesianNetwork
>>> from pgmpy.base import DAG
>>> data = pd.DataFrame(data={"A": [0, 0, 1], "B": [0, 1, 0], "C": [1, 1, 0]})
>>> model = DAG([("A", "C"), ("B", "C")])
>>> fitted_model = model.fit(data)
>>> fitted_model.get_cpds()
[<TabularCPD representing P(A:2) at 0x17945372c30>,
<TabularCPD representing P(B:2) at 0x17945a19760>,
<TabularCPD representing P(C:2 | A:2, B:2) at 0x17944f42690>]

classmethod from_dagitty(string=None, filename=None) → DAG[source]¶

Initializes a DAG instance using DAGitty syntax.

Creates a DAG from the dagitty string. If parameter beta is specified in the DAGitty string, the method returns a LinearGaussianBayesianNetwork instead of a plain DAG.

Parameters:

string (str (default: None)) – A DAGitty style multiline set of regression equations representing the model. Refer to https://www.dagitty.net/manual-3.x.pdf#page=3.58 and https://github.com/jtextor/dagitty/blob/7a657776dc8f5e5ba4e323edb028e2c2aaf29327/gui/js/dagitty.js#L3417
filename (str (default: None)) – The filename of the file containing the model in DAGitty syntax.

Examples

>>> from pgmpy.base import DAG
>>> dag = DAG.from_dagitty(
...     "dag{'carry matches' [latent] cancer [outcome] smoking -> 'carry matches' [beta=0.2] smoking -> cancer [beta=0.5] 'carry matches' -> cancer }"
... )

Creating a Linear Gaussian Bayesian network from dagitty:

>>> from pgmpy.base import DAG
>>> from pgmpy.models import LinearGaussianBayesianNetwork as LGBN
>>> # Specifying beta creates a LinearGaussianBayesianNetwork instance
>>> dag = DAG.from_dagitty("dag{X -> Y [beta=0.3] Y -> Z [beta=0.1]}")
>>> data = dag.simulate(n_samples=int(1e4))

classmethod from_lavaan(string: str | None = None, filename: str | PathLike | None = None) → DAG[source]¶

Initializes a DAG instance using lavaan syntax.

Parameters:

string (str (default: None)) – A lavaan style multiline set of regression equations representing the model. Refer to http://lavaan.ugent.be/tutorial/syntax1.html for details.
filename (str (default: None)) – The filename of the file containing the model in lavaan syntax.

Examples

get_ancestral_graph(nodes: Iterable[Hashable])[source]¶

Returns the ancestral graph of the given nodes. The ancestral graph contains only the given nodes and their ancestors, together with the edges among them.

Parameters:: nodes (iterable) – List of nodes whose ancestral graph needs to be computed.
Returns:: Ancestral Graph
Return type:: pgmpy.base.DAG

Examples

>>> from pgmpy.base import DAG
>>> dag = DAG([("A", "C"), ("B", "C"), ("D", "A"), ("D", "B")])
>>> anc_dag = dag.get_ancestral_graph(nodes=["A", "B"])
>>> anc_dag.edges()
OutEdgeView([('D', 'A'), ('D', 'B')])
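
As a rough illustration of what the ancestral graph keeps, the sketch below walks parent links from the requested nodes and filters the edge list. It is a plain-Python approximation under the assumption that the graph is given as an edge list; `ancestral_edges` is a hypothetical helper, not a pgmpy function.

```python
def ancestral_edges(edges, nodes):
    """Edges of the subgraph induced by `nodes` and all of their ancestors."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)

    # Depth-first walk along parent links, starting from the requested nodes.
    keep, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # Keep only edges whose endpoints both survived.
    return [(u, v) for u, v in edges if u in keep and v in keep]


edges = [("A", "C"), ("B", "C"), ("D", "A"), ("D", "B")]
print(ancestral_edges(edges, ["A", "B"]))  # [('D', 'A'), ('D', 'B')]
```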

get_children(node: Hashable)[source]¶

Returns a list of children of node. Throws an error if the node is not present in the graph.

Parameters:: node (string, int or any hashable python object.) – The node whose children would be returned.

Examples

>>> from pgmpy.base import DAG
>>> g = DAG(
...     ebunch=[
...         ("A", "B"),
...         ("C", "B"),
...         ("B", "D"),
...         ("B", "E"),
...         ("B", "F"),
...         ("E", "G"),
...     ]
... )
>>> g.get_children(node="B")
['D', 'E', 'F']

get_immoralities() → dict[Hashable, list[tuple[Hashable, Hashable]]][source]¶

Finds all the immoralities in the model. A v-structure X -> Z <- Y is an immorality if there is no direct edge between X and Y.

Returns:: Immoralities – A set of all the immoralities in the model
Return type:: set

Examples

>>> from pgmpy.base import DAG
>>> student = DAG()
>>> student.add_edges_from(
...     [
...         ("diff", "grade"),
...         ("intel", "grade"),
...         ("intel", "SAT"),
...         ("grade", "letter"),
...     ]
... )
>>> student.get_immoralities()
{('diff', 'intel')}
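
The definition above translates almost directly into code. The following is a minimal sketch, assuming an edge-list input and sortable node names; `find_immoralities` is a hypothetical name and the return format is chosen for the example only.

```python
from itertools import combinations


def find_immoralities(edges):
    """Pairs of parents that share a child but are not adjacent themselves."""
    parents, adjacent = {}, set()
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
        adjacent.add(frozenset((u, v)))

    result = set()
    for child_parents in parents.values():
        # Check every unordered pair of parents of the same child.
        for x, y in combinations(sorted(child_parents), 2):
            if frozenset((x, y)) not in adjacent:
                result.add((x, y))
    return result


edges = [
    ("diff", "grade"),
    ("intel", "grade"),
    ("intel", "SAT"),
    ("grade", "letter"),
]
print(find_immoralities(edges))  # {('diff', 'intel')}
```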

get_independencies(latex=False, include_latents=False) → Independencies | list[str][source]¶

Computes independencies in the DAG by checking minimal d-separation.

Parameters:

latex (boolean) – If latex=True then latex string of the independence assertion would be created.
include_latents (boolean) – If True, includes latent variables in the independencies. Otherwise, only generates independencies on observed variables.

Examples

>>> from pgmpy.base import DAG
>>> chain = DAG([("X", "Y"), ("Y", "Z")])
>>> chain.get_independencies()
(X ⟂ Z | Y)

get_leaves()[source]¶

Returns a list of leaves of the graph.

Examples

>>> from pgmpy.base import DAG
>>> graph = DAG([("A", "B"), ("B", "C"), ("B", "D")])
>>> graph.get_leaves()
['C', 'D']

get_markov_blanket(node: Hashable) → list[Hashable][source]¶

Returns a markov blanket for a random variable. In the case of Bayesian Networks, the markov blanket is the set of node’s parents, its children and its children’s other parents.

Returns:: Markov Blanket – List of nodes in the markov blanket of node.
Return type:: list
Parameters:: node (string, int or any hashable python object.) – The node whose markov blanket would be returned.

Examples

>>> from pgmpy.base import DAG
>>> from pgmpy.factors.discrete import TabularCPD
>>> G = DAG(
...     [
...         ("x", "y"),
...         ("z", "y"),
...         ("y", "w"),
...         ("y", "v"),
...         ("u", "w"),
...         ("s", "v"),
...         ("w", "t"),
...         ("w", "m"),
...         ("v", "n"),
...         ("v", "q"),
...     ]
... )
>>> G.get_markov_blanket("y")
['s', 'w', 'x', 'u', 'z', 'v']
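
The three ingredients of the Markov blanket (parents, children, and the children's other parents) can be read off an edge list directly. This is an illustrative sketch; `markov_blanket` here is a standalone hypothetical function, not the pgmpy method.

```python
def markov_blanket(edges, node):
    """Parents, children, and co-parents of `node` in a DAG edge list."""
    parents = {u for u, v in edges if v == node}
    children = {v for u, v in edges if u == node}
    # Every parent of a child; this includes `node` itself, removed below.
    co_parents = {u for u, v in edges if v in children}
    return (parents | children | co_parents) - {node}


edges = [
    ("x", "y"), ("z", "y"), ("y", "w"), ("y", "v"), ("u", "w"),
    ("s", "v"), ("w", "t"), ("w", "m"), ("v", "n"), ("v", "q"),
]
print(sorted(markov_blanket(edges, "y")))  # ['s', 'u', 'v', 'w', 'x', 'z']
```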

get_parents(node: Hashable)[source]¶

Returns a list of parents of node.

Throws an error if the node is not present in the graph.

Parameters:: node (string, int or any hashable python object.) – The node whose parents would be returned.

Examples

>>> from pgmpy.base import DAG
>>> G = DAG(ebunch=[("diff", "grade"), ("intel", "grade")])
>>> G.get_parents(node="grade")
['diff', 'intel']

static get_random(n_nodes=5, edge_prob=0.5, node_names: list[Hashable] | None = None, latents=False, seed: int | None = None) → DAG[source]¶

Returns a randomly generated DAG with n_nodes number of nodes with edge probability being edge_prob.

Parameters:

n_nodes (int) – The number of nodes in the randomly generated DAG.
edge_prob (float) – The probability of edge between any two nodes in the topologically sorted DAG.
node_names (list (default: None)) – A list of variable names to use in the random graph. If None, the node names are integer values starting from 0.
latents (bool (default: False)) – If True, includes latent variables in the generated DAG.
seed (int (default: None)) – The seed for the random number generator.

Returns:

Random DAG – The randomly generated DAG.

Return type:

pgmpy.base.DAG

Examples

>>> from pgmpy.base import DAG
>>> random_dag = DAG.get_random(n_nodes=10, edge_prob=0.3)
>>> random_dag.nodes()
NodeView((0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
>>> random_dag.edges()
OutEdgeView([(0, 6), (1, 6), (1, 7), (7, 9), (2, 5), (2, 7), (2, 8), (5, 9), (3, 7)])
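
One common way to generate a random DAG, consistent with the edge_prob description above, is to fix a node order and include each forward edge independently. The sketch below assumes that scheme; pgmpy's actual sampling procedure may differ, and `random_dag_edges` is a hypothetical helper.

```python
import random
from itertools import combinations


def random_dag_edges(n_nodes=5, edge_prob=0.5, seed=None):
    """Sample the edges of a random DAG over nodes 0..n_nodes-1."""
    rng = random.Random(seed)
    # combinations() yields pairs (i, j) with i < j, so every edge points
    # forward in the order 0, 1, ..., n_nodes - 1 and no cycle can form.
    return [
        (i, j)
        for i, j in combinations(range(n_nodes), 2)
        if rng.random() < edge_prob
    ]


print(random_dag_edges(n_nodes=3, edge_prob=1.0))  # [(0, 1), (0, 2), (1, 2)]
edges = random_dag_edges(n_nodes=10, edge_prob=0.3, seed=42)
print(all(i < j for i, j in edges))  # True
```

Passing the same seed reproduces the same edge list, which is useful in tests.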

get_roots()[source]¶

Returns a list of roots of the graph.

Examples

>>> from pgmpy.base import DAG
>>> graph = DAG([("A", "B"), ("B", "C"), ("B", "D"), ("E", "B")])
>>> graph.get_roots()
['A', 'E']

in_degree_iter(nbunch=None, weight=None)[source]¶

is_dconnected(start: Hashable, end: Hashable, observed: Sequence[Hashable] | None = None, include_latents=False)[source]¶

Returns True if there is an active trail (i.e. d-connection) between start and end node given that observed is observed.

Parameters:

start (int, str, any hashable python object.) – The nodes in the DAG between which to check the d-connection/active trail.
end (int, str, any hashable python object.) – The nodes in the DAG between which to check the d-connection/active trail.
observed (list, array-like (optional)) – If given the active trail would be computed assuming these nodes to be observed.
include_latents (boolean (default: False)) – If True, latent variables are returned as part of the active trail.

Examples

>>> from pgmpy.base import DAG
>>> student = DAG()
>>> student.add_nodes_from(["diff", "intel", "grades", "letter", "sat"])
>>> student.add_edges_from(
...     [
...         ("diff", "grades"),
...         ("intel", "grades"),
...         ("grades", "letter"),
...         ("intel", "sat"),
...     ]
... )
>>> student.is_dconnected("diff", "intel")
False
>>> student.is_dconnected("grades", "sat")
True

is_iequivalent(model: DAG)[source]¶

Checks whether the given model is I-equivalent to this DAG.

Two graphs G1 and G2 are said to be I-equivalent if they have same skeleton and have same set of immoralities.

Parameters:: model (A DAG object, for which you want to check I-equivalence)
Returns:: I-equivalence – True if both are I-equivalent, False otherwise
Return type:: boolean

Examples

>>> from pgmpy.base import DAG
>>> G = DAG()
>>> G.add_edges_from([("V", "W"), ("W", "X"), ("X", "Y"), ("Z", "Y")])
>>> G1 = DAG()
>>> G1.add_edges_from([("W", "V"), ("X", "W"), ("X", "Y"), ("Z", "Y")])
>>> G.is_iequivalent(G1)
True
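
The skeleton-plus-immoralities criterion stated above can be checked on two edge lists as follows. This is a minimal sketch that ignores isolated nodes; all function names are hypothetical and it is not pgmpy's implementation.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of the edge set."""
    return {frozenset(edge) for edge in edges}


def immoralities(edges):
    """Set of (unordered parent pair, child) with non-adjacent parents."""
    skel = skeleton(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    return {
        (frozenset((x, y)), child)
        for child, child_parents in parents.items()
        for x, y in combinations(child_parents, 2)
        if frozenset((x, y)) not in skel
    }


def same_equivalence_class(edges_a, edges_b):
    """True if both DAGs share the same skeleton and immoralities."""
    return skeleton(edges_a) == skeleton(edges_b) and immoralities(
        edges_a
    ) == immoralities(edges_b)


g = [("V", "W"), ("W", "X"), ("X", "Y"), ("Z", "Y")]
g1 = [("W", "V"), ("X", "W"), ("X", "Y"), ("Z", "Y")]
print(same_equivalence_class(g, g1))  # True

chain = [("A", "B"), ("B", "C")]
collider = [("A", "B"), ("C", "B")]
print(same_equivalence_class(chain, collider))  # False
```

The chain and the collider share a skeleton but differ in immoralities, so they are not I-equivalent.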

local_independencies(variables: list[Hashable] | tuple[Hashable, ...] | str)[source]¶

Returns an instance of Independencies containing the local independencies of each of the variables.

Parameters:: variables (str or array like) – variables whose local independencies are to be found.

Examples

>>> from pgmpy.base import DAG
>>> student = DAG()
>>> student.add_edges_from(
...     [
...         ("diff", "grade"),
...         ("intel", "grade"),
...         ("grade", "letter"),
...         ("intel", "SAT"),
...     ]
... )
>>> ind = student.local_independencies("grade")
>>> ind
(grade ⟂ SAT | diff, intel)
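
The local independence shown above follows the local Markov property: a node is independent of its non-descendants given its parents. A small sketch under that assumption, using an edge list and the hypothetical helper `local_independence`:

```python
def local_independence(edges, node):
    """Return (independent_of, given) for `node` under the local Markov property."""
    nodes = {n for edge in edges for n in edge}
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)

    # Collect all descendants of `node` with a depth-first walk.
    descendants, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in descendants:
                descendants.add(child)
                stack.append(child)

    parents = {u for u, v in edges if v == node}
    # Non-descendants, excluding the node itself and its parents.
    independent_of = nodes - descendants - parents - {node}
    return independent_of, parents


edges = [
    ("diff", "grade"),
    ("intel", "grade"),
    ("grade", "letter"),
    ("intel", "SAT"),
]
independent_of, given = local_independence(edges, "grade")
print(sorted(independent_of), sorted(given))  # ['SAT'] ['diff', 'intel']
```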

minimal_dseparator(start: Hashable, end: Hashable, include_latents=False) → set[Hashable][source]¶

Finds the minimal d-separating set for start and end.

Parameters:

start (node) – The first node.
end (node) – The second node.
include_latents (boolean (default: False)) – If True, latent variables are considered for the minimal d-separator.

Examples

>>> from pgmpy.base import DAG
>>> dag = DAG([("A", "B"), ("B", "C")])
>>> dag.minimal_dseparator(start="A", end="C")
{'B'}

References

[1] Algorithm 4, Page 10: Tian, Jin, Azaria Paz, and Judea Pearl. Finding minimal d-separators. Computer Science Department, University of California, 1998.

moralize()[source]¶

Removes all the immoralities in the DAG and creates a moral graph (UndirectedGraph).

A v-structure X->Z<-Y is an immorality if there is no directed edge between X and Y.

Examples

>>> from pgmpy.base import DAG
>>> G = DAG(ebunch=[("diff", "grade"), ("intel", "grade")])
>>> moral_graph = G.moralize()
>>> moral_graph.edges()
EdgeView([('intel', 'grade'), ('intel', 'diff'), ('grade', 'diff')])
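
Moralization can be sketched in two steps: connect ("marry") every pair of parents that share a child, then drop edge directions. The snippet below is an illustration on an edge list; `moral_edges` is a hypothetical helper and the result is a set of unordered pairs rather than an UndirectedGraph.

```python
from itertools import combinations


def moral_edges(edges):
    """Undirected edges of the moral graph of a DAG given as an edge list."""
    # Start from the skeleton: every directed edge loses its direction.
    undirected = {frozenset(edge) for edge in edges}
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    # Marry the parents: connect every pair of parents of a common child.
    for child_parents in parents.values():
        undirected.update(frozenset(p) for p in combinations(child_parents, 2))
    return undirected


moral = moral_edges([("diff", "grade"), ("intel", "grade")])
print(sorted(sorted(edge) for edge in moral))
# [['diff', 'grade'], ['diff', 'intel'], ['grade', 'intel']]
```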

out_degree_iter(nbunch=None, weight=None)[source]¶

to_daft(node_pos: str | dict[Hashable, tuple[int, int]] = 'circular', latex=True, pgm_params={}, edge_params={}, node_params={}, plot_edge_strength=False)[source]¶

Returns a daft (https://docs.daft-pgm.org/en/latest/) object which can be rendered for publication quality plots. The returned object’s render method can be called to see the plots.

Parameters:

node_pos (str or dict (default: circular)) –

If str: Must be one of the following: circular, kamada_kawai, planar, random, shell, spring, spectral, spiral. Please refer to https://networkx.org/documentation/stable//reference/drawing.html#module-networkx.drawing.layout for details on these layouts.

If dict: Should be of the form {node: (x coordinate, y coordinate)}, describing the x and y coordinates of each node.

If no argument is provided, the circular layout is used.
latex (boolean) – Whether to use latex for rendering the node names.
pgm_params (dict (optional)) – Any additional parameters that need to be passed to daft.PGM initializer. Should be of the form: {param_name: param_value}
edge_params (dict (optional)) – Any additional edge parameters that need to be passed to daft.add_edge method. Should be of the form: {(u1, v1): {param_name: param_value}, (u2, v2): {…} }
node_params (dict (optional)) – Any additional node parameters that need to be passed to daft.add_node method. Should be of the form: {node1: {param_name: param_value}, node2: {…} }
plot_edge_strength (bool (default: False)) – If True, displays edge strength values as labels on edges. Requires edge strengths to be computed first using the edge_strength() method.

Returns:

Daft object – Daft object for plotting the DAG.

Return type:

daft.PGM object

Examples

>>> from pgmpy.base import DAG
>>> dag = DAG([("a", "b"), ("b", "c"), ("d", "c")])
>>> dag.to_daft(node_pos={"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1)})
<daft.PGM at 0x7fc756e936d0>
>>> dag.to_daft(node_pos="circular")
<daft.PGM at 0x7f9bb48c5eb0>
>>> dag.to_daft(node_pos="circular", pgm_params={"observed_style": "inner"})
<daft.PGM at 0x7f9bb48b0bb0>
>>> dag.to_daft(
...     node_pos="circular",
...     edge_params={("a", "b"): {"label": 2}},
...     node_params={"a": {"shape": "rectangle"}},
... )
<daft.PGM at 0x7f9bb48b0bb0>

to_graphviz(plot_edge_strength=False)[source]¶

Returns a pygraphviz object for the DAG. pygraphviz is useful for visualizing the network structure.

Parameters:: plot_edge_strength (bool (default: False)) – If True, displays edge strength values as labels on edges. Requires edge strengths to be computed first using the edge_strength() method.
Returns:: AGraph object – pygraphviz object for plotting the DAG.
Return type:: pygraphviz.AGraph

Examples

>>> from pgmpy.utils import get_example_model
>>> model = get_example_model("alarm")
>>> model.to_graphviz()
<AGraph <Swig Object of type 'Agraph_t *' at 0x7fdea4cde040>>
>>> model.to_graphviz().draw("model.png", prog="neato")

to_pdag()[source]¶

Returns the CPDAG (Completed Partially Directed Acyclic Graph) of the DAG, representing the equivalence class that the given DAG belongs to.

Returns:: CPDAG – An instance of pgmpy.base.PDAG representing the CPDAG of the given DAG.
Return type:: pgmpy.base.PDAG

Examples

>>> from pgmpy.base import DAG
>>> dag = DAG([("A", "B"), ("B", "C"), ("C", "D")])
>>> pdag = dag.to_pdag()
>>> pdag.directed_edges
{('A', 'B'), ('B', 'C'), ('C', 'D')}

References

[1] Chickering, David Maxwell. “Learning equivalence classes of Bayesian-network structures.” Journal of machine learning research 2.Feb (2002): 445-498. Figure 4 and 5.

class pgmpy.base.DAG.PDAG(directed_ebunch: list[tuple[Hashable, Hashable]] = [], undirected_ebunch: list[tuple[Hashable, Hashable]] = [], latents: Iterable[Hashable] = [])[source]¶

Class for representing PDAGs (also known as CPDAG). PDAGs are the equivalence classes of DAGs and contain both directed and undirected edges.

Note: In this class, an undirected edge is represented using two directed edges, one in each direction; i.e., an undirected edge X - Y is stored as X -> Y and X <- Y.
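
To make the two-arc convention concrete, the sketch below recovers directed and undirected edges from a flat list of arcs. It only illustrates the representation; `split_arcs` is a hypothetical helper and says nothing about how PDAG stores its edges internally.

```python
def split_arcs(arcs):
    """Split arcs into (directed edges, undirected edges)."""
    arcs = set(arcs)
    # An arc whose reverse is also present encodes an undirected edge.
    undirected = {frozenset((u, v)) for u, v in arcs if (v, u) in arcs}
    directed = {(u, v) for u, v in arcs if (v, u) not in arcs}
    return directed, undirected


arcs = [
    ("A", "C"), ("D", "C"),  # directed: A -> C, D -> C
    ("B", "A"), ("A", "B"),  # undirected: A - B
    ("B", "D"), ("D", "B"),  # undirected: B - D
]
directed, undirected = split_arcs(arcs)
print(sorted(directed))  # [('A', 'C'), ('D', 'C')]
print(sorted(sorted(edge) for edge in undirected))  # [['A', 'B'], ['B', 'D']]
```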

all_neighbors(node)[source]¶

Returns a set of all neighbors of a node in the PDAG. This includes both directed and undirected edges.

Parameters:: node (any hashable python object) – The node for which to get the neighboring nodes.
Returns:: A set of neighboring nodes.
Return type:: set

Examples

>>> from pgmpy.base import PDAG
>>> pdag = PDAG(
...     directed_ebunch=[("A", "C"), ("D", "C")],
...     undirected_ebunch=[("B", "A"), ("B", "D")],
... )
>>> pdag.all_neighbors("A")
{'B', 'C'}

apply_meeks_rules(apply_r4=False, inplace=False, debug=False)[source]¶

Applies Meek’s rules to orient the undirected edges of a PDAG and returns a CPDAG.

Parameters:

apply_r4 (boolean (default=False)) – If True, applies Rules 1 - 4 of Meek’s rules. If False, applies only Rules 1 - 3.
inplace (boolean (default=False)) – If True, the PDAG object is modified inplace, otherwise a new modified copy is returned.
debug (boolean (default=False)) – If True, prints the rules being applied to the PDAG.

Returns:

The modified PDAG object – If inplace=True, returns None and the object itself is modified. If inplace=False, returns a new PDAG object.

Return type:

None or pgmpy.base.PDAG

Examples

>>> from pgmpy.base import PDAG
>>> pdag = PDAG(
...     directed_ebunch=[("A", "B")], undirected_ebunch=[("B", "C"), ("C", "B")]
... )
>>> pdag.apply_meeks_rules(inplace=True)
>>> pdag.directed_edges
{('A', 'B'), ('B', 'C')}
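
The example above is an instance of Meek's Rule 1: if a -> b, b - c, and a and c are not adjacent, orient b - c as b -> c. The sketch below applies only this rule until nothing changes; it is a simplified illustration (Rules 2 to 4 are omitted) and `apply_meek_rule_1` is a hypothetical helper.

```python
def apply_meek_rule_1(directed, undirected):
    """Repeatedly apply Meek's Rule 1 to a PDAG given as two edge collections."""
    directed = set(directed)
    undirected = {frozenset(edge) for edge in undirected}

    def adjacent(x, y):
        return (
            (x, y) in directed
            or (y, x) in directed
            or frozenset((x, y)) in undirected
        )

    changed = True
    while changed:
        changed = False
        for a, b in list(directed):
            for edge in list(undirected):
                if b not in edge:
                    continue
                (c,) = edge - {b}
                # a -> b, b - c, and a, c non-adjacent: orient as b -> c.
                if c != a and not adjacent(a, c):
                    undirected.discard(edge)
                    directed.add((b, c))
                    changed = True
    return directed, undirected


directed, undirected = apply_meek_rule_1([("A", "B")], [("B", "C")])
print(sorted(directed))  # [('A', 'B'), ('B', 'C')]
print(undirected)  # set()
```

Orienting b -> c the other way would create a new v-structure a -> b <- c, which is why the rule forces this direction.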

copy()[source]¶

Returns a copy of the object instance.

Returns:: Copy of PDAG – Returns a copy of self.
Return type:: pgmpy.base.PDAG

directed_children(node)[source]¶: Returns a set of children of node such that there is a directed edge from node to child.

directed_parents(node)[source]¶: Returns a set of parents of node such that there is a directed edge from the parent to node.

has_directed_edge(u, v)[source]¶: Returns True if there is a directed edge u -> v in the PDAG.

has_undirected_edge(u, v)[source]¶: Returns True if there is an undirected edge u - v in the PDAG.

is_adjacent(u, v)[source]¶: Returns True if there is an edge between u and v. This can be either of u - v, u -> v, or u <- v.

orient_undirected_edge(u, v, inplace=False)[source]¶

Orients an undirected edge u - v as u -> v.

Parameters:

u (Any hashable python objects) – The node names.
v (Any hashable python objects) – The node names.
inplace (boolean (default=False)) – If True, the PDAG object is modified inplace, otherwise a new modified copy is returned.

Returns:

The modified PDAG object – If inplace=True, returns None and the object itself is modified. If inplace=False, returns a new PDAG object.

Return type:

None or pgmpy.base.PDAG

to_dag() → DAG[source]¶

Returns one possible DAG which is represented using the PDAG.

Returns:: An instance of DAG.
Return type:: pgmpy.base.DAG

Examples

>>> from pgmpy.base import PDAG
>>> pdag = PDAG(
...     directed_ebunch=[("A", "B"), ("C", "B")],
...     undirected_ebunch=[("C", "D"), ("D", "A")],
... )
>>> dag = pdag.to_dag()
>>> print(dag.edges())
OutEdgeView([('A', 'B'), ('C', 'B'), ('D', 'C'), ('A', 'D')])

References

[1] Dor, Dorit, and Michael Tarsi. “A simple algorithm to construct a consistent extension of a partially oriented graph.” Technical Report R-185, Cognitive Systems Laboratory, UCLA (1992): 45.

to_graphviz() → object[source]¶

Returns a pygraphviz object for the PDAG. pygraphviz is useful for visualizing the network structure.

Examples

>>> from pgmpy.utils import get_example_model
>>> model = get_example_model("alarm")
>>> model.to_graphviz()
<AGraph <Swig Object of type 'Agraph_t *' at 0x7fdea4cde040>>

undirected_neighbors(node)[source]¶

Returns a set of neighboring nodes such that all of them have an undirected edge with node.

Parameters:: node (any hashable python object) – The node for which to get the undirected neighboring nodes.
Returns:: A set of neighboring nodes.
Return type:: set

Examples

>>> from pgmpy.base import PDAG
>>> pdag = PDAG(
...     directed_ebunch=[("A", "C"), ("D", "C")],
...     undirected_ebunch=[("B", "A"), ("B", "D")],
... )
>>> pdag.undirected_neighbors("A")
{'B'}
