NaiveBayes
- class pgmpy.models.NaiveBayes(*args, backend=None, **kwargs)
Bases: DiscreteBayesianNetwork
Class to represent Naive Bayes. Naive Bayes is a special case of a Bayesian model in which the only edges run from the dependent variable to the feature variables.
- active_trail_nodes(start, observed=None)
Returns all the nodes reachable from start via an active trail.
- Parameters:
- start: Graph node
- observed: List of nodes (optional)
If given, the active trail is computed assuming these nodes are observed.
Examples
>>> from pgmpy.models import NaiveBayes
>>> model = NaiveBayes()
>>> model.add_nodes_from(["a", "b", "c", "d"])
>>> model.add_edges_from([("a", "b"), ("a", "c"), ("a", "d")])
>>> sorted(model.active_trail_nodes("a"))
['a', 'b', 'c', 'd']
>>> sorted(model.active_trail_nodes("a", ["b", "c"]))
['a', 'd']
>>> model.active_trail_nodes("b", ["a"])
{'b'}
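The outputs in this example follow from d-separation in a star-shaped graph. The sketch below reproduces that reasoning in plain Python, without pgmpy. The function name `active_trail_sketch` and its arguments are hypothetical, and it is not presented as the library's implementation.

```python
def active_trail_sketch(dependent, features, start, observed=None):
    """Hypothetical helper: nodes reachable from `start` in a naive Bayes graph."""
    observed = set(observed or [])
    nodes = {dependent, *features}
    if dependent in observed:
        # Observing the dependent variable blocks every feature-to-feature trail.
        return {start}
    # Otherwise every unobserved node is reachable through the dependent variable.
    return nodes - observed


features = ["b", "c", "d"]
print(sorted(active_trail_sketch("a", features, "a")))
print(sorted(active_trail_sketch("a", features, "a", ["b", "c"])))
print(active_trail_sketch("a", features, "b", ["a"]))
```

Under these assumptions the three calls give the same sets as the doctest above.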
- add_edge(u, v, *kwargs)
Add an edge between u and v.
The nodes u and v will be automatically added if they are not already in the graph. u will be the dependent variable (i.e. variable to be predicted) and v will be one of the features (i.e. predictors) in the model.
- Parameters:
- u, v: nodes
Nodes can be any hashable python object.
- Returns:
- None
Examples
>>> from pgmpy.models import NaiveBayes
>>> G = NaiveBayes()
>>> G.add_nodes_from(["a", "b", "c"])
>>> G.add_edge("a", "b")
>>> G.add_edge("a", "c")
>>> G.edges()
OutEdgeView([('a', 'b'), ('a', 'c')])
- add_edges_from(ebunch)
Adds edges to the model.
Each tuple of the form (u, v) in ebunch adds a new edge in the model. Since there can only be one dependent variable in a Naive Bayes model, u should be the same for each tuple in ebunch.
- Parameters:
- ebunch: list (array-like)
A list of tuples of the form (u, v) representing an edge from u to v.
- Returns:
- None
Examples
>>> from pgmpy.models import NaiveBayes
>>> G = NaiveBayes()
>>> G.add_nodes_from(["a", "b", "c"])
>>> G.add_edges_from([("a", "b"), ("a", "c")])
>>> G.edges()
OutEdgeView([('a', 'b'), ('a', 'c')])
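The requirement that u be the same in every tuple can be checked before any edges are added. The following is a minimal plain-Python sketch of such a check. `shared_dependent` is a hypothetical helper, not a pgmpy function, and the error pgmpy itself raises for mismatched edges may differ.

```python
def shared_dependent(ebunch):
    """Hypothetical validator: return the single source node u shared by all edges."""
    sources = {u for u, _v in ebunch}
    if len(sources) > 1:
        names = sorted(map(str, sources))
        raise ValueError(f"expected one dependent variable, got {names}")
    # An empty ebunch has no dependent variable at all.
    return next(iter(sources), None)


print(shared_dependent([("a", "b"), ("a", "c")]))  # prints: a
```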
- fit(data, parent_node=None, estimator=None)
Computes the CPD for each node from the given data, supplied as a pandas DataFrame. If a variable in the data is not present in the model, it is added to the model as a node.
- Parameters:
- data: pandas DataFrame object
A DataFrame whose column names match the variable names of the network.
- parent_node: any hashable python object (optional)
Parent node of the model. If not specified, the previously specified parent node is used.
- estimator: Estimator class
Any pgmpy estimator. If nothing is specified, the default MaximumLikelihoodEstimator is used.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from pgmpy.models import NaiveBayes
>>> model = NaiveBayes()
>>> values = pd.DataFrame(
...     np.random.randint(low=0, high=2, size=(1000, 5)),
...     columns=["A", "B", "C", "D", "E"],
... )
>>> model.fit(values, "A")
>>> model.get_cpds()
[<TabularCPD representing P(A:2) at 0x...>,
 <TabularCPD representing P(B:2 | A:2) at 0x...>,
 <TabularCPD representing P(C:2 | A:2) at 0x...>,
 <TabularCPD representing P(D:2 | A:2) at 0x...>,
 <TabularCPD representing P(E:2 | A:2) at 0x...>]
>>> sorted(model.edges())
[('A', 'B'), ('A', 'C'), ('A', 'D'), ('A', 'E')]
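For discrete data, a maximum-likelihood estimate of a CPD such as P(B | A) is a table of relative frequencies. The sketch below shows that computation in plain Python on a tiny hand-made dataset. The helper `mle_cpd` and the `rows` data are illustrative assumptions and are independent of pgmpy's estimator classes.

```python
from collections import Counter

# Hypothetical toy data: each row assigns a state to parent "A" and feature "B".
rows = [
    {"A": 0, "B": 0},
    {"A": 0, "B": 1},
    {"A": 0, "B": 1},
    {"A": 1, "B": 0},
]


def mle_cpd(rows, child, parent):
    """Estimate P(child | parent) as count(parent, child) / count(parent)."""
    joint = Counter((r[parent], r[child]) for r in rows)
    parent_counts = Counter(r[parent] for r in rows)
    return {(p, c): n / parent_counts[p] for (p, c), n in joint.items()}


cpd = mle_cpd(rows, child="B", parent="A")
print(cpd[(1, 0)])  # prints: 1.0
```

Combinations that never occur in the data are simply absent from the returned dictionary, so a caller would need to treat a missing key as probability zero.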
- local_independencies(variables)
Returns an instance of Independencies containing the local independencies of each of the variables.
- Parameters:
- variables: str or array like
Variables whose local independencies are to be found.
Examples
>>> from pgmpy.models import NaiveBayes
>>> model = NaiveBayes()
>>> model.add_edges_from([("a", "b"), ("a", "c"), ("a", "d")])
>>> ind = model.local_independencies("b")
>>> assertion = ind.get_assertions()[0]
>>> sorted(assertion.event1), sorted(assertion.event2), sorted(assertion.event3)
(['b'], ['c', 'd'], ['a'])
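The assertion in this example says that each feature is independent of the remaining features given the dependent variable. The sketch below derives the same triple in plain Python. `local_independency_sketch` is a hypothetical name, and the function does not use or mirror pgmpy's Independencies class.

```python
def local_independency_sketch(dependent, features, variable):
    """Hypothetical helper: return (event1, event2, event3) for one variable."""
    if variable == dependent:
        # The dependent variable has no parents and no non-descendants,
        # so there is nothing to assert for it.
        return None
    others = sorted(f for f in features if f != variable)
    # variable is independent of the other features given the dependent variable.
    return ([variable], others, [dependent])


print(local_independency_sketch("a", ["b", "c", "d"], "b"))
```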