Bayesian Network¶
- class pgmpy.models.BayesianNetwork.BayesianNetwork(ebunch=None, latents={})[source]¶
Initializes a Bayesian Network. A model stores nodes and edges along with conditional probability distributions (CPDs) and other attributes.
Models hold directed edges. Self-loops and multiple (parallel) edges are not allowed.
Nodes can be any hashable python object.
Edges are represented as links between nodes.
- Parameters:
ebunch (input graph) – Data to initialize graph. If ebunch=None (default) an empty graph is created. The ebunch can be an edge list, or any NetworkX graph object.
latents (list, array-like) – List of variables which are latent (i.e. unobserved) in the model.
Examples
Create an empty Bayesian Network with no nodes and no edges.
>>> from pgmpy.models import BayesianNetwork
>>> G = BayesianNetwork()
G can be grown in several ways.
Nodes:
Add one node at a time:
>>> G.add_node('a')
Add the nodes from any container (a list, set or tuple or the nodes from another graph).
>>> G.add_nodes_from(['a', 'b'])
Edges:
G can also be grown by adding edges.
Add one edge,
>>> G.add_edge('a', 'b')
a list of edges,
>>> G.add_edges_from([('a', 'b'), ('b', 'c')])
If some edges connect nodes not yet in the model, the nodes are added automatically. There are no errors when adding nodes or edges that already exist.
Shortcuts:
Many common graph features allow Python syntax to speed up reporting.
>>> 'a' in G  # check if node in graph
True
>>> len(G)  # number of nodes in graph
3
- add_cpds(*cpds)[source]¶
Add CPD (Conditional Probability Distribution) to the Bayesian Model.
- Parameters:
cpds (list, set, tuple (array-like)) – List of CPDs which will be associated with the model
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete.CPD import TabularCPD
>>> student = BayesianNetwork([('diff', 'grades'), ('aptitude', 'grades')])
>>> grades_cpd = TabularCPD('grades', 3, [[0.1,0.1,0.1,0.1,0.1,0.1],
...                                       [0.1,0.1,0.1,0.1,0.1,0.1],
...                                       [0.8,0.8,0.8,0.8,0.8,0.8]],
...                         evidence=['diff', 'aptitude'], evidence_card=[2, 3],
...                         state_names={'grades': ['gradeA', 'gradeB', 'gradeC'],
...                                      'diff': ['easy', 'hard'],
...                                      'aptitude': ['low', 'medium', 'high']})
>>> student.add_cpds(grades_cpd)
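+-----------+------+--------+------+------+--------+------+
| diff:     | easy | easy   | easy | hard | hard   | hard |
+-----------+------+--------+------+------+--------+------+
| aptitude: | low  | medium | high | low  | medium | high |
+-----------+------+--------+------+------+--------+------+
| gradeA    | 0.1  | 0.1    | 0.1  | 0.1  | 0.1    | 0.1  |
+-----------+------+--------+------+------+--------+------+
| gradeB    | 0.1  | 0.1    | 0.1  | 0.1  | 0.1    | 0.1  |
+-----------+------+--------+------+------+--------+------+
| gradeC    | 0.8  | 0.8    | 0.8  | 0.8  | 0.8    | 0.8  |
+-----------+------+--------+------+------+--------+------+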
- add_edge(u, v, **kwargs)[source]¶
Add a directed edge from u to v.
The nodes u and v are added automatically if they are not already in the graph.
- Parameters:
u (nodes) – Nodes can be any hashable python object.
v (nodes) – Nodes can be any hashable python object.
Examples
>>> from pgmpy.models import BayesianNetwork
>>> G = BayesianNetwork()
>>> G.add_nodes_from(['grade', 'intel'])
>>> G.add_edge('grade', 'intel')
- check_model()[source]¶
Check the model for various errors. This method performs the following checks:
Checks that the probabilities in each CPD sum to 1 over the variable's states, for every configuration of its parents (tol=0.01).
Checks that the CPDs associated with nodes are consistent with their parents.
- Returns:
check – True if all the checks pass; otherwise an error is raised.
- Return type:
boolean
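The first check can be pictured with a short plain-Python sketch. It is not pgmpy's implementation: the helper name `columns_sum_to_one` and the nested-list layout (one row per state, one column per parent configuration) are assumptions made for illustration.

```python
def columns_sum_to_one(table, tol=0.01):
    """Return True if every column of `table` sums to 1 within `tol`.

    `table` is a hypothetical nested-list CPD: rows are states of the
    variable, columns are configurations of its parents.
    """
    n_cols = len(table[0])
    for col in range(n_cols):
        total = sum(row[col] for row in table)
        if abs(total - 1.0) > tol:
            return False
    return True


# Values of the 'grades' CPD used in the add_cpds example.
grade_values = [[0.1] * 6, [0.1] * 6, [0.8] * 6]
valid = columns_sum_to_one(grade_values)

# Second column sums to 1.2, so this table fails the check.
broken = columns_sum_to_one([[0.5, 0.9], [0.5, 0.3]])
```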
- copy()[source]¶
Returns a copy of the model.
- Returns:
Model’s copy – Copy of the model on which the method was called.
- Return type:
pgmpy.models.BayesianNetwork
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> model = BayesianNetwork([('A', 'B'), ('B', 'C')])
>>> cpd_a = TabularCPD('A', 2, [[0.2], [0.8]])
>>> cpd_b = TabularCPD('B', 2, [[0.3, 0.7], [0.7, 0.3]],
...                    evidence=['A'],
...                    evidence_card=[2])
>>> cpd_c = TabularCPD('C', 2, [[0.1, 0.9], [0.9, 0.1]],
...                    evidence=['B'],
...                    evidence_card=[2])
>>> model.add_cpds(cpd_a, cpd_b, cpd_c)
>>> copy_model = model.copy()
>>> copy_model.nodes()
NodeView(('A', 'B', 'C'))
>>> copy_model.edges()
OutEdgeView([('A', 'B'), ('B', 'C')])
>>> len(copy_model.get_cpds())
3
- do(nodes, inplace=False)[source]¶
Applies the do operation. The do operation removes all incoming edges to variables in nodes and marginalizes their CPDs to only contain the variable itself.
- Parameters:
nodes (list, array-like) – The names of the nodes to apply the do-operator for.
inplace (boolean (default: False)) – If inplace=True, makes the changes to the current object, otherwise returns a new instance.
- Returns:
Modified network – If inplace=True, modifies the object itself and returns None; otherwise returns a new instance of BayesianNetwork modified by the do operation.
- Return type:
pgmpy.models.BayesianNetwork or None
Examples
>>> from pgmpy.utils import get_example_model
>>> asia = get_example_model('asia')
>>> asia.edges()
OutEdgeView([('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'), ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp')])
>>> do_bronc = asia.do(['bronc'])
>>> do_bronc.edges()
OutEdgeView([('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp')])
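The graph-surgery half of the do operation can be sketched on a plain edge list. This is an illustration only, not pgmpy's code: `do_edges` is a hypothetical helper, and it ignores the CPD marginalization that the real method also performs.

```python
def do_edges(edges, nodes):
    """Drop every edge pointing into one of `nodes` (graph surgery only)."""
    targets = set(nodes)
    return [(u, v) for (u, v) in edges if v not in targets]


# Edge list of the 'asia' example network, as printed above.
asia_edges = [
    ('asia', 'tub'), ('tub', 'either'), ('smoke', 'lung'), ('smoke', 'bronc'),
    ('lung', 'either'), ('bronc', 'dysp'), ('either', 'xray'), ('either', 'dysp'),
]

# Intervening on 'bronc' removes its only incoming edge, ('smoke', 'bronc');
# outgoing edges such as ('bronc', 'dysp') are kept.
do_bronc_edges = do_edges(asia_edges, ['bronc'])
```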
- fit(data, estimator=None, state_names=[], n_jobs=1, **kwargs)[source]¶
Estimates the CPD for each variable based on a given data set.
- Parameters:
data (pandas DataFrame object) – DataFrame object with column names identical to the variable names of the network. (If some values in the data are missing, the data cells should be set to numpy.nan. Note that pandas converts each column containing numpy.nan values to dtype float.)
estimator (Estimator class) – One of:
- MaximumLikelihoodEstimator (default)
- BayesianEstimator: In this case, pass ‘prior_type’ and either ‘pseudo_counts’ or ‘equivalent_sample_size’ as additional keyword arguments. See BayesianEstimator.get_parameters() for usage.
- ExpectationMaximization
state_names (dict (optional)) – A dict indicating, for each variable, the discrete set of states that the variable can take. If unspecified, the observed values in the data set are taken to be the only possible states.
n_jobs (int (default: 1)) – Number of threads/processes to use for estimation. Using n_jobs > 1 for small models or datasets might be slower.
- Returns:
Fitted Model – Modifies the network inplace and adds the cpds property.
- Return type:
None
Examples
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.estimators import MaximumLikelihoodEstimator
>>> data = pd.DataFrame(data={'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]})
>>> model = BayesianNetwork([('A', 'C'), ('B', 'C')])
>>> model.fit(data)
>>> model.get_cpds()
[<TabularCPD representing P(A:2) at 0x7fb98a7d50f0>,
 <TabularCPD representing P(B:2) at 0x7fb98a7d5588>,
 <TabularCPD representing P(C:2 | A:2, B:2) at 0x7fb98a7b1f98>]
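As a rough picture of what maximum likelihood estimation computes, the sketch below derives relative frequencies from the same three rows using only the standard library. The helpers `mle_marginal` and `mle_conditional` are hypothetical and only cover parent configurations that actually occur in the data; how pgmpy treats unobserved configurations is not shown here.

```python
from collections import Counter


def mle_marginal(values):
    """Relative frequency of each observed state."""
    counts = Counter(values)
    n = len(values)
    return {state: counts[state] / n for state in sorted(counts)}


def mle_conditional(child, parents):
    """Relative frequency of each child state per observed parent configuration.

    `child` is a list of states, `parents` a list of tuples of parent states.
    Keys of the result are (parent_configuration, child_state).
    """
    joint = Counter(zip(parents, child))
    parent_counts = Counter(parents)
    return {(pa, c): n / parent_counts[pa] for (pa, c), n in joint.items()}


data = {'A': [0, 0, 1], 'B': [0, 1, 0], 'C': [1, 1, 0]}

# A takes the value 0 in two of three rows.
p_a = mle_marginal(data['A'])

# Each observed (A, B) configuration occurs once, so C is deterministic given it.
p_c_given_ab = mle_conditional(data['C'], list(zip(data['A'], data['B'])))
```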
- fit_update(data, n_prev_samples=None, n_jobs=1)[source]¶
Method to update the parameters of the BayesianNetwork with more data. Internally, uses BayesianEstimator with dirichlet prior, and uses the current CPDs (along with n_prev_samples) to compute the pseudo_counts.
- Parameters:
data (pandas.DataFrame) – The new dataset to use for updating the model.
n_prev_samples (int) – The number of samples/datapoints on which the model was trained before. This parameter determines how much weight the new data should be given relative to the current parameters. If None, n_prev_samples is set to the number of rows in data.
n_jobs (int (default: 1)) – Number of threads/processes to use for estimation. Using n_jobs > 1 for small models or datasets might be slower.
- Returns:
Updated model – Modifies the network inplace.
- Return type:
None
Examples
>>> from pgmpy.utils import get_example_model
>>> from pgmpy.sampling import BayesianModelSampling
>>> model = get_example_model('alarm')
>>> # Generate some new data.
>>> data = BayesianModelSampling(model).forward_sample(int(1e3))
>>> model.fit_update(data)
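The role of n_prev_samples can be illustrated with a Dirichlet-style update for a single variable without parents. This is a sketch under stated assumptions, not pgmpy's code: it assumes the pseudo counts are the current probabilities scaled by n_prev_samples, and `dirichlet_update` is a hypothetical helper.

```python
def dirichlet_update(current_probs, n_prev_samples, new_counts):
    """Combine current probabilities with new state counts.

    Assumption: pseudo counts are current_probs * n_prev_samples, and the
    updated estimate is the normalized sum of pseudo counts and new counts.
    """
    pseudo_counts = [p * n_prev_samples for p in current_probs]
    totals = [pc + c for pc, c in zip(pseudo_counts, new_counts)]
    norm = sum(totals)
    return [t / norm for t in totals]


# A model that currently believes P = [0.5, 0.5] after 100 samples sees
# 100 new samples split 80/20; both sources get equal weight.
updated = dirichlet_update([0.5, 0.5], n_prev_samples=100, new_counts=[80, 20])
```

A larger n_prev_samples keeps the result closer to the current CPD; a smaller one lets the new data dominate.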
- get_cardinality(node=None)[source]¶
Returns the cardinality of the node. Throws an error if the CPD for the queried node hasn’t been added to the network.
- Parameters:
node (any hashable python object (optional)) – The node whose cardinality we want. If node is not specified, returns a dictionary with the variables as keys and their respective cardinalities as values.
- Returns:
variable cardinalities – If node is specified returns the cardinality of the node else returns a dictionary with the cardinality of each variable in the network
- Return type:
int or dict
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> student = BayesianNetwork([('diff', 'grade'), ('intel', 'grade')])
>>> cpd_diff = TabularCPD('diff', 2, [[0.6], [0.4]])
>>> cpd_intel = TabularCPD('intel', 2, [[0.7], [0.3]])
>>> cpd_grade = TabularCPD('grade', 2, [[0.1, 0.9, 0.2, 0.7],
...                                     [0.9, 0.1, 0.8, 0.3]],
...                        ['intel', 'diff'], [2, 2])
>>> student.add_cpds(cpd_diff, cpd_intel, cpd_grade)
>>> student.get_cardinality()
defaultdict(<class 'int'>, {'diff': 2, 'intel': 2, 'grade': 2})
>>> student.get_cardinality('intel')
2
- get_cpds(node=None)[source]¶
Returns the CPD of the node. If node is not specified, returns all the CPDs that have been added to the model so far.
- Parameters:
node (any hashable python object (optional)) – The node whose CPD we want. If node is not specified, returns all the CPDs added to the model.
- Returns:
The TabularCPD of node if node is specified; otherwise a list of all TabularCPDs added to the model.
- Return type:
TabularCPD or list of TabularCPD
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> student = BayesianNetwork([('diff', 'grade'), ('intel', 'grade')])
>>> cpd = TabularCPD('grade', 2, [[0.1, 0.9, 0.2, 0.7],
...                               [0.9, 0.1, 0.8, 0.3]],
...                  ['intel', 'diff'], [2, 2])
>>> student.add_cpds(cpd)
>>> student.get_cpds()
- get_markov_blanket(node)[source]¶
Returns a markov blanket for a random variable. In the case of Bayesian Networks, the markov blanket is the set of node’s parents, its children and its children’s other parents.
- Returns:
Markov Blanket – List of nodes contained in Markov Blanket of node
- Return type:
list
- Parameters:
node (string, int or any hashable python object.) – The node whose markov blanket would be returned.
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> G = BayesianNetwork([('x', 'y'), ('z', 'y'), ('y', 'w'), ('y', 'v'), ('u', 'w'),
...                      ('s', 'v'), ('w', 't'), ('w', 'm'), ('v', 'n'), ('v', 'q')])
>>> G.get_markov_blanket('y')
['s', 'u', 'w', 'v', 'z', 'x']
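The definition above (parents, children, and the children's other parents) can be checked by hand on the same edge list. The function `markov_blanket` below is a hypothetical stdlib-only helper, not the library method, and it returns a set rather than a list.

```python
def markov_blanket(edges, node):
    """Parents, children, and children's other parents of `node`."""
    parents = {u for u, v in edges if v == node}
    children = {v for u, v in edges if u == node}
    # Every parent of a child; `node` itself is removed at the end.
    co_parents = {u for u, v in edges if v in children}
    return (parents | children | co_parents) - {node}


edges = [('x', 'y'), ('z', 'y'), ('y', 'w'), ('y', 'v'), ('u', 'w'),
         ('s', 'v'), ('w', 't'), ('w', 'm'), ('v', 'n'), ('v', 'q')]

# Parents: x, z. Children: w, v. Other parents of w and v: u, s.
blanket = markov_blanket(edges, 'y')
```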
- static get_random(n_nodes=5, edge_prob=0.5, node_names=None, n_states=None, latents=False)[source]¶
Returns a randomly generated Bayesian Network on n_nodes variables with an edge probability of edge_prob between variables.
- Parameters:
n_nodes (int) – The number of nodes in the randomly generated DAG.
edge_prob (float) – The probability of edge between any two nodes in the topologically sorted DAG.
node_names (list (default: None)) – A list of variables names to use in the random graph. If None, the node names are integer values starting from 0.
n_states (int or dict (default: None)) – The number of states of each variable in the form {variable: no_of_states}. If a single value is provided, all nodes will have the same number of states. When None randomly generates the number of states.
latents (bool (default: False)) – If True, also creates latent variables.
- Returns:
Random DAG – The randomly generated DAG.
- Return type:
pgmpy.models.BayesianNetwork
Examples
>>> from pgmpy.models import BayesianNetwork
>>> model = BayesianNetwork.get_random(n_nodes=5)
>>> model.nodes()
NodeView((0, 1, 3, 4, 2))
>>> model.edges()
OutEdgeView([(0, 1), (0, 3), (1, 3), (1, 4), (3, 4), (2, 3)])
>>> model.cpds
[<TabularCPD representing P(0:0) at 0x7f97e16eabe0>,
 <TabularCPD representing P(1:1 | 0:0) at 0x7f97e16ea670>,
 <TabularCPD representing P(3:3 | 0:0, 1:1, 2:2) at 0x7f97e16820d0>,
 <TabularCPD representing P(4:4 | 1:1, 3:3) at 0x7f97e16eae80>,
 <TabularCPD representing P(2:2) at 0x7f97e1682c40>]
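One common way to draw a random DAG with a given edge probability is to fix a node order and add each forward edge independently. The sketch below shows that idea with the standard library; `random_dag_edges` is a hypothetical helper, and whether pgmpy generates its graphs exactly this way is an assumption.

```python
import random


def random_dag_edges(n_nodes=5, edge_prob=0.5, seed=None):
    """Sample edges u -> v with u < v, each kept with probability edge_prob.

    Because edges only go from a lower to a higher index, the integer order
    is a topological order and the result can never contain a cycle.
    """
    rng = random.Random(seed)
    return [
        (u, v)
        for u in range(n_nodes)
        for v in range(u + 1, n_nodes)
        if rng.random() < edge_prob
    ]


edges = random_dag_edges(n_nodes=5, edge_prob=0.5, seed=0)
```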
- get_random_cpds(n_states=None, inplace=False)[source]¶
Given a model, generates and adds random TabularCPD for each node resulting in a fully parameterized network.
- get_state_probability(states)[source]¶
Given a fully specified Bayesian Network, returns the probability of the given set of states.
- Parameters:
states (dict) – dict of the form {variable: state}
- Returns:
The probability value
- Return type:
float
Examples
>>> from pgmpy.utils import get_example_model
>>> model = get_example_model('asia')
>>> model.get_state_probability({'either': 'no', 'tub': 'no', 'xray': 'yes', 'bronc': 'no'})
0.02605122
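For a partial assignment like the one above, the probability is the joint distribution summed over all variables left unspecified. The brute-force sketch below shows this on a hypothetical two-node network (rain → wet) with made-up numbers; it is not how pgmpy computes the value internally.

```python
from itertools import product

# Hypothetical CPDs for a toy network rain -> wet.
p_rain = {'yes': 0.2, 'no': 0.8}
# Keys are (wet, rain): P(wet | rain).
p_wet_given_rain = {('yes', 'yes'): 0.9, ('no', 'yes'): 0.1,
                    ('yes', 'no'): 0.3, ('no', 'no'): 0.7}


def joint(rain, wet):
    """Chain rule: P(rain, wet) = P(rain) * P(wet | rain)."""
    return p_rain[rain] * p_wet_given_rain[(wet, rain)]


def state_probability(states):
    """Sum the joint over every full assignment consistent with `states`."""
    total = 0.0
    for rain, wet in product(['yes', 'no'], repeat=2):
        full = {'rain': rain, 'wet': wet}
        if all(full[var] == val for var, val in states.items()):
            total += joint(rain, wet)
    return total


# P(wet=yes) = 0.2 * 0.9 + 0.8 * 0.3
p_wet = state_probability({'wet': 'yes'})
```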
- is_imap(JPD)[source]¶
Checks whether the Bayesian Network is an I-map of the given JointProbabilityDistribution.
- Parameters:
JPD (JointProbabilityDistribution) – The joint probability distribution for which to check the I-map property.
- Returns:
is IMAP – True if the Bayesian Network is an I-map of the given joint probability distribution, False otherwise.
- Return type:
bool
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> from pgmpy.factors.discrete import JointProbabilityDistribution
>>> G = BayesianNetwork([('diff', 'grade'), ('intel', 'grade')])
>>> diff_cpd = TabularCPD('diff', 2, [[0.2], [0.8]])
>>> intel_cpd = TabularCPD('intel', 3, [[0.5], [0.3], [0.2]])
>>> grade_cpd = TabularCPD('grade', 3,
...                        [[0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.8,0.8,0.8,0.8,0.8,0.8]],
...                        evidence=['diff', 'intel'],
...                        evidence_card=[2, 3])
>>> G.add_cpds(diff_cpd, intel_cpd, grade_cpd)
>>> val = [0.01, 0.01, 0.08, 0.006, 0.006, 0.048, 0.004, 0.004, 0.032,
...        0.04, 0.04, 0.32, 0.024, 0.024, 0.192, 0.016, 0.016, 0.128]
>>> JPD = JointProbabilityDistribution(['diff', 'intel', 'grade'], [2, 3, 3], val)
>>> G.is_imap(JPD)
True
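One way to see why this example returns True is to confirm that every entry of val equals the product of the three CPD entries, so the joint distribution factorizes over the graph. The check below is a stdlib-only sketch; it assumes val is laid out with 'diff' varying slowest and 'grade' fastest, and it is not presented as the library's internal procedure.

```python
from itertools import product
from math import isclose

p_diff = [0.2, 0.8]
p_intel = [0.5, 0.3, 0.2]
# In this example P(grade | diff, intel) is the same in every column.
p_grade = [0.1, 0.1, 0.8]

val = [0.01, 0.01, 0.08, 0.006, 0.006, 0.048, 0.004, 0.004, 0.032,
       0.04, 0.04, 0.32, 0.024, 0.024, 0.192, 0.016, 0.016, 0.128]

# Enumerate (diff, intel, grade) with the last index changing fastest.
expected = [
    p_diff[d] * p_intel[i] * p_grade[g]
    for d, i, g in product(range(2), range(3), range(3))
]

matches = all(isclose(a, b, abs_tol=1e-9) for a, b in zip(expected, val))
```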
- static load(filename, filetype='bif', **kwargs)[source]¶
Read the model from a file.
- Parameters:
filename (str) – The path, including the filename, of the file to read.
filetype (str (default: bif)) – The format of the model file. Can be one of the following: bif, uai, xmlbif.
kwargs (kwargs) – Any additional arguments for the reader class or get_model method. Please refer to the file format class for details.
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.utils import get_example_model
>>> alarm = get_example_model('alarm')
>>> alarm.save('alarm.bif', filetype='bif')
>>> alarm_model = BayesianNetwork.load('alarm.bif', filetype='bif')
- predict(data, stochastic=False, n_jobs=-1)[source]¶
Predicts states of all the missing variables.
- Parameters:
data (pandas DataFrame object) – A DataFrame object with column names same as the variables in the model.
stochastic (boolean) – If True, does prediction by sampling from the distribution of predicted variable(s). If False, returns the states with the highest probability value (i.e. MAP) for the predicted variable(s).
n_jobs (int (default: -1)) – The number of CPU cores to use. If -1, uses all available cores.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> train_data = values[:800]
>>> predict_data = values[800:]
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(train_data)
>>> predict_data = predict_data.copy()
>>> predict_data.drop('E', axis=1, inplace=True)
>>> y_pred = model.predict(predict_data)
>>> y_pred
     E
800  0
801  1
802  1
803  1
804  0
... ...
993  0
994  0
995  1
996  1
997  0
998  0
999  0
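The difference between the two settings of stochastic can be sketched for a single predicted variable. `pick_state` is a hypothetical helper and the probabilities are illustrative values; the real method first runs inference to obtain such a distribution for each row.

```python
import random


def pick_state(probs, stochastic=False, rng=None):
    """Choose a state from a {state: probability} mapping.

    stochastic=False returns the most probable state (MAP);
    stochastic=True samples a state in proportion to its probability.
    """
    if stochastic:
        rng = rng or random.Random()
        states = list(probs)
        weights = [probs[s] for s in states]
        return rng.choices(states, weights=weights, k=1)[0]
    return max(probs, key=probs.get)


row = {0: 0.439178, 1: 0.560822}

# MAP always picks state 1 for this row.
map_state = pick_state(row)

# Sampling returns 0 or 1; a seeded generator makes the draw repeatable.
sampled = pick_state(row, stochastic=True, rng=random.Random(42))
```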
- predict_probability(data)[source]¶
Predicts probabilities of all states of the missing variables.
- Parameters:
data (pandas DataFrame object) – A DataFrame object with column names same as the variables in the model.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import BayesianNetwork
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(100, 5)),
...                       columns=['A', 'B', 'C', 'D', 'E'])
>>> train_data = values[:80]
>>> predict_data = values[80:]
>>> model = BayesianNetwork([('A', 'B'), ('C', 'B'), ('C', 'D'), ('B', 'E')])
>>> model.fit(values)
>>> predict_data = predict_data.copy()
>>> predict_data.drop('B', axis=1, inplace=True)
>>> y_prob = model.predict_probability(predict_data)
>>> y_prob
         B_0       B_1
80  0.439178  0.560822
81  0.581970  0.418030
82  0.488275  0.511725
83  0.581970  0.418030
84  0.510794  0.489206
85  0.439178  0.560822
86  0.439178  0.560822
87  0.417124  0.582876
88  0.407978  0.592022
89  0.429905  0.570095
90  0.581970  0.418030
91  0.407978  0.592022
92  0.429905  0.570095
93  0.429905  0.570095
94  0.439178  0.560822
95  0.407978  0.592022
96  0.559904  0.440096
97  0.417124  0.582876
98  0.488275  0.511725
99  0.407978  0.592022
- remove_cpds(*cpds)[source]¶
Removes the cpds that are provided in the argument.
- Parameters:
*cpds (TabularCPD object) – One or more CPD objects, each defined on a subset of the variables of the model, which are to be removed from the model.
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> student = BayesianNetwork([('diff', 'grade'), ('intel', 'grade')])
>>> cpd = TabularCPD('grade', 2, [[0.1, 0.9, 0.2, 0.7],
...                               [0.9, 0.1, 0.8, 0.3]],
...                  ['intel', 'diff'], [2, 2])
>>> student.add_cpds(cpd)
>>> student.remove_cpds(cpd)
- remove_node(node)[source]¶
Remove node from the model.
Removing a node also removes all the associated edges, removes the CPD of the node and marginalizes the CPDs of its children.
- Parameters:
node (node) – Node which is to be removed from the model.
- Return type:
None
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.models import BayesianNetwork
>>> model = BayesianNetwork([('A', 'B'), ('B', 'C'),
...                          ('A', 'D'), ('D', 'C')])
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model.fit(values)
>>> model.get_cpds()
[<TabularCPD representing P(A:2) at 0x7f28248e2438>,
 <TabularCPD representing P(B:2 | A:2) at 0x7f28248e23c8>,
 <TabularCPD representing P(C:2 | B:2, D:2) at 0x7f28248e2748>,
 <TabularCPD representing P(D:2 | A:2) at 0x7f28248e26a0>]
>>> model.remove_node('A')
>>> model.get_cpds()
[<TabularCPD representing P(B:2) at 0x7f28248e23c8>,
 <TabularCPD representing P(C:2 | B:2, D:2) at 0x7f28248e2748>,
 <TabularCPD representing P(D:2) at 0x7f28248e26a0>]
- remove_nodes_from(nodes)[source]¶
Remove multiple nodes from the model.
Removing a node also removes all the associated edges, removes the CPD of the node and marginalizes the CPDs of its children.
- Parameters:
nodes (list, set (iterable)) – Nodes which are to be removed from the model.
- Return type:
None
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.models import BayesianNetwork
>>> model = BayesianNetwork([('A', 'B'), ('B', 'C'),
...                          ('A', 'D'), ('D', 'C')])
>>> values = pd.DataFrame(np.random.randint(low=0, high=2, size=(1000, 4)),
...                       columns=['A', 'B', 'C', 'D'])
>>> model.fit(values)
>>> model.get_cpds()
[<TabularCPD representing P(A:2) at 0x7f28248e2438>,
 <TabularCPD representing P(B:2 | A:2) at 0x7f28248e23c8>,
 <TabularCPD representing P(C:2 | B:2, D:2) at 0x7f28248e2748>,
 <TabularCPD representing P(D:2 | A:2) at 0x7f28248e26a0>]
>>> model.remove_nodes_from(['A', 'B'])
>>> model.get_cpds()
[<TabularCPD representing P(C:2 | D:2) at 0x7f28248e2a58>,
 <TabularCPD representing P(D:2) at 0x7f28248e26d8>]
- save(filename, filetype='bif')[source]¶
Writes the model to a file. Please avoid using any special characters or spaces in variable or state names.
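- Parameters:
filename (str) – The path, including the filename, of the file to write.
filetype (str (default: bif)) – The format in which to write the model. Can be one of the following: bif, uai, xmlbif.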
Examples
>>> from pgmpy.utils import get_example_model
>>> alarm = get_example_model('alarm')
>>> alarm.save('alarm.bif', filetype='bif')
- simulate(n_samples=10, do=None, evidence=None, virtual_evidence=None, virtual_intervention=None, include_latents=False, partial_samples=None, seed=None, show_progress=True)[source]¶
Simulates data from the given model. Internally uses methods from pgmpy.sampling.BayesianModelSampling to generate the data.
- Parameters:
n_samples (int) – The number of data samples to simulate from the model.
do (dict) – The interventions to apply to the model. dict should be of the form {variable_name: state}
evidence (dict) – Observed evidence to apply to the model. dict should be of the form {variable_name: state}
virtual_evidence (list) – Probabilistically apply evidence to the model. virtual_evidence should be a list of pgmpy.factors.discrete.TabularCPD objects specifying the virtual probabilities.
virtual_intervention (list) – Also known as soft intervention. virtual_intervention should be a list of pgmpy.factors.discrete.TabularCPD objects specifying the virtual/soft intervention probabilities.
include_latents (boolean) – Whether to include the latent variable values in the generated samples.
partial_samples (pandas.DataFrame) – A pandas dataframe specifying samples on some of the variables in the model. If specified, the sampling procedure uses these sample values, instead of generating them. partial_samples.shape[0] must be equal to n_samples.
seed (int (default: None)) – If a value is provided, sets the seed for numpy.random.
show_progress (bool) – If True, shows a progress bar when generating samples.
- Returns:
A dataframe with the simulated data
- Return type:
pd.DataFrame
Examples
>>> from pgmpy.utils import get_example_model
Simulation without any evidence or intervention:
>>> model = get_example_model('alarm')
>>> model.simulate(n_samples=10)
Simulation with the hard evidence: MINVOLSET = HIGH:
>>> model.simulate(n_samples=10, evidence={"MINVOLSET": "HIGH"})
Simulation with hard intervention: CVP = LOW:
>>> model.simulate(n_samples=10, do={"CVP": "LOW"})
Simulation with virtual/soft evidence: p(MINVOLSET=LOW) = 0.8, p(MINVOLSET=HIGH) = 0.2, p(MINVOLSET=NORMAL) = 0:
>>> from pgmpy.factors.discrete import TabularCPD
>>> virt_evidence = [TabularCPD("MINVOLSET", 3, [[0.8], [0.0], [0.2]],
...                             state_names={"MINVOLSET": ["LOW", "NORMAL", "HIGH"]})]
>>> model.simulate(n_samples=10, virtual_evidence=virt_evidence)
Simulation with virtual/soft intervention: p(CVP=LOW) = 0.2, p(CVP=NORMAL) = 0.5, p(CVP=HIGH) = 0.3:
>>> virt_intervention = [TabularCPD("CVP", 3, [[0.2], [0.5], [0.3]],
...                                 state_names={"CVP": ["LOW", "NORMAL", "HIGH"]})]
>>> model.simulate(n_samples=10, virtual_intervention=virt_intervention)
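The idea behind forward sampling with a hard intervention can be sketched on a hypothetical two-node network (rain → wet). The CPD numbers and the helper `simulate_toy` are made up for illustration; this is not the sampler pgmpy uses internally, and it does not cover evidence or virtual evidence.

```python
import random

# Hypothetical CPDs for a toy network rain -> wet.
p_rain = {'yes': 0.2, 'no': 0.8}
p_wet_given_rain = {'yes': {'yes': 0.9, 'no': 0.1},
                    'no': {'yes': 0.3, 'no': 0.7}}


def draw(dist, rng):
    """Sample one state from a {state: probability} mapping."""
    states = list(dist)
    return rng.choices(states, weights=[dist[s] for s in states], k=1)[0]


def simulate_toy(n_samples=10, do=None, seed=None):
    """Sample variables in topological order; `do` fixes a variable's value."""
    rng = random.Random(seed)
    do = do or {}
    samples = []
    for _ in range(n_samples):
        # An intervened variable is set directly instead of being sampled.
        rain = do['rain'] if 'rain' in do else draw(p_rain, rng)
        wet = do['wet'] if 'wet' in do else draw(p_wet_given_rain[rain], rng)
        samples.append({'rain': rain, 'wet': wet})
    return samples


plain = simulate_toy(n_samples=10, seed=1)
forced = simulate_toy(n_samples=20, do={'wet': 'yes'}, seed=1)
```

Under do={'wet': 'yes'} every sample has wet fixed, while rain is still drawn from its own distribution because the intervention cuts the influence of parents on the intervened variable only.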
- property states¶
Returns a dictionary mapping each node to its list of possible states.
- Returns:
state_dict – Dictionary of nodes to possible states
- Return type:
dict
- to_junction_tree()[source]¶
Creates a junction tree (or clique tree) for a given Bayesian Network.
To convert a Bayesian Network into a clique tree, it is first converted into a Markov model.
For a given Markov model (H), a junction tree (G) is a graph where:
1. each node in G corresponds to a maximal clique in H;
2. each sepset in G separates the variables strictly on one side of the edge from those on the other.
Examples
>>> from pgmpy.models import BayesianNetwork
>>> from pgmpy.factors.discrete import TabularCPD
>>> G = BayesianNetwork([('diff', 'grade'), ('intel', 'grade'),
...                      ('intel', 'SAT'), ('grade', 'letter')])
>>> diff_cpd = TabularCPD('diff', 2, [[0.2], [0.8]])
>>> intel_cpd = TabularCPD('intel', 3, [[0.5], [0.3], [0.2]])
>>> grade_cpd = TabularCPD('grade', 3,
...                        [[0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.1,0.1,0.1,0.1,0.1,0.1],
...                         [0.8,0.8,0.8,0.8,0.8,0.8]],
...                        evidence=['diff', 'intel'],
...                        evidence_card=[2, 3])
>>> sat_cpd = TabularCPD('SAT', 2,
...                      [[0.1, 0.2, 0.7],
...                       [0.9, 0.8, 0.3]],
...                      evidence=['intel'], evidence_card=[3])
>>> letter_cpd = TabularCPD('letter', 2,
...                         [[0.1, 0.4, 0.8],
...                          [0.9, 0.6, 0.2]],
...                         evidence=['grade'], evidence_card=[3])
>>> G.add_cpds(diff_cpd, intel_cpd, grade_cpd, sat_cpd, letter_cpd)
>>> jt = G.to_junction_tree()
- to_markov_model()[source]¶
Converts Bayesian Network to Markov Model. The Markov Model created would be the moral graph of the Bayesian Network.
Examples
>>> from pgmpy.models import BayesianNetwork
>>> G = BayesianNetwork([('diff', 'grade'), ('intel', 'grade'),
...                      ('intel', 'SAT'), ('grade', 'letter')])
>>> mm = G.to_markov_model()
>>> mm.nodes()
NodeView(('diff', 'grade', 'intel', 'letter', 'SAT'))
>>> mm.edges()
EdgeView([('diff', 'grade'), ('diff', 'intel'), ('grade', 'letter'), ('grade', 'intel'), ('intel', 'SAT')])
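The extra ('diff', 'intel') edge in the output comes from moralization: every pair of parents sharing a child is connected, and edge directions are dropped. A stdlib-only sketch of that construction follows; `moral_edges` is a hypothetical helper, not part of pgmpy.

```python
from itertools import combinations


def moral_edges(directed_edges):
    """Return the undirected edge set of the moral graph."""
    # Start with every directed edge, ignoring direction.
    undirected = {frozenset(edge) for edge in directed_edges}

    # Collect the parents of each node.
    parents = {}
    for u, v in directed_edges:
        parents.setdefault(v, set()).add(u)

    # "Marry" every pair of parents that share a child.
    for pa in parents.values():
        for a, b in combinations(sorted(pa), 2):
            undirected.add(frozenset((a, b)))
    return undirected


edges = [('diff', 'grade'), ('intel', 'grade'),
         ('intel', 'SAT'), ('grade', 'letter')]

# 'diff' and 'intel' are both parents of 'grade', so they get connected.
moral = moral_edges(edges)
```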