Exhaustive Search
- class pgmpy.estimators.ExhaustiveSearch(data, scoring_method=None, use_cache=True, **kwargs)
- all_dags(nodes=None)
Computes all possible directed acyclic graphs over a given set of nodes, sparse ones first. For n nodes, 2**(n*(n-1)) graphs need to be searched, so this is unlikely to be feasible for n > 6. This method is a generator.
- Parameters:
nodes (list, optional) – The node names that the generated DAGs should have. If not provided, the nodes are taken from data.
- Returns:
dags – Generator that yields all acyclic nx.DiGraphs, ordered by number of edges. Empty DAG first.
- Return type:
Generator object for nx.DiGraphs
Examples
>>> import pandas as pd
>>> from pgmpy.estimators import ExhaustiveSearch
>>> s = ExhaustiveSearch(pd.DataFrame(data={'Temperature': [23, 19],
...                                         'Weather': ['sunny', 'cloudy'],
...                                         'Humidity': [65, 75]}))
>>> list(s.all_dags())
[<networkx.classes.digraph.DiGraph object at 0x7f6955216438>,
 <networkx.classes.digraph.DiGraph object at 0x7f6955216518>,
....
>>> [dag.edges() for dag in s.all_dags()]
[[], [('Humidity', 'Temperature')], [('Humidity', 'Weather')],
 [('Temperature', 'Weather')], [('Temperature', 'Humidity')],
....
 [('Weather', 'Humidity'), ('Weather', 'Temperature'), ('Temperature', 'Humidity')]]
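To make the size of the search space concrete, the following is a minimal standard-library sketch of the enumeration described above: every subset of the n*(n-1) possible directed edges is generated in order of size, and only the acyclic ones are kept. It is an illustration under stated assumptions, not pgmpy's implementation; the helper names `is_acyclic` and `enumerate_dags` are hypothetical, and DAGs are represented as plain tuples of edges rather than nx.DiGraph objects.

```python
from itertools import combinations, permutations


def is_acyclic(nodes, edges):
    """Kahn-style check: repeatedly strip nodes that have no incoming edge."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {n for n in remaining if all(child != n for _, child in edges)}
        if not sources:  # every remaining node has a parent, so there is a cycle
            return False
        remaining -= sources
        edges = {(u, v) for u, v in edges if u not in sources}
    return True


def enumerate_dags(nodes):
    """Yield each DAG over `nodes` as a tuple of (parent, child) edges, sparse first."""
    possible_edges = list(permutations(nodes, 2))  # the n*(n-1) ordered pairs
    for n_edges in range(len(possible_edges) + 1):
        for edge_set in combinations(possible_edges, n_edges):
            if is_acyclic(nodes, edge_set):
                yield edge_set


nodes = ["Humidity", "Temperature", "Weather"]
n = len(nodes)
dags = list(enumerate_dags(nodes))
print(2 ** (n * (n - 1)), "candidate graphs,", len(dags), "of them acyclic")
# prints: 64 candidate graphs, 25 of them acyclic
```

With three nodes only 64 candidates have to be checked; the exponent n*(n-1) grows quadratically, which is why the search quickly becomes impractical as nodes are added.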
- all_scores()
Computes a list of DAGs and their structure scores, ordered by score.
- Returns:
A list of (score, dag) pairs – A list of (score, dag) tuples, where score is a float and dag is an acyclic nx.DiGraph. The list is sorted by score in ascending order, so the highest-scoring DAG comes last.
- Return type:
list
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import ExhaustiveSearch, K2Score
>>> # create random data sample with 3 variables, where B and C are identical:
>>> data = pd.DataFrame(np.random.randint(0, 5, size=(5000, 2)), columns=list('AB'))
>>> data['C'] = data['B']
>>> searcher = ExhaustiveSearch(data, scoring_method=K2Score(data))
>>> for score, model in searcher.all_scores():
...     print("{0} {1}".format(score, model.edges()))
-24234.44977974726 [('A', 'B'), ('A', 'C')]
-24234.449760691063 [('A', 'B'), ('C', 'A')]
-24234.449760691063 [('A', 'C'), ('B', 'A')]
-24203.700955937973 [('A', 'B')]
-24203.700955937973 [('A', 'C')]
-24203.700936881774 [('B', 'A')]
-24203.700936881774 [('C', 'A')]
-24203.700936881774 [('B', 'A'), ('C', 'A')]
-24172.952132128685 []
-16597.30920265254 [('A', 'B'), ('A', 'C'), ('B', 'C')]
-16597.30920265254 [('A', 'B'), ('A', 'C'), ('C', 'B')]
-16597.309183596342 [('A', 'B'), ('C', 'A'), ('C', 'B')]
-16597.309183596342 [('A', 'C'), ('B', 'A'), ('B', 'C')]
-16566.560378843253 [('A', 'B'), ('C', 'B')]
-16566.560378843253 [('A', 'C'), ('B', 'C')]
-16268.324549347722 [('A', 'B'), ('B', 'C')]
-16268.324549347722 [('A', 'C'), ('C', 'B')]
-16268.324530291524 [('B', 'A'), ('B', 'C')]
-16268.324530291524 [('B', 'C'), ('C', 'A')]
-16268.324530291524 [('B', 'A'), ('C', 'B')]
-16268.324530291524 [('C', 'A'), ('C', 'B')]
-16268.324530291524 [('B', 'A'), ('B', 'C'), ('C', 'A')]
-16268.324530291524 [('B', 'A'), ('C', 'A'), ('C', 'B')]
-16237.575725538434 [('B', 'C')]
-16237.575725538434 [('C', 'B')]
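The score-and-sort idea can be sketched without pgmpy. The example below is an illustration only: it swaps in a simple BIC-style score (log-likelihood minus a parameter penalty) instead of K2Score, all function names are hypothetical, and the data are a small deterministic table in which A is independent of B and C is a copy of B. The result list is sorted in ascending order, matching the example output above, so the best structure is the last entry.

```python
from collections import Counter
from itertools import combinations, permutations
from math import log


def is_acyclic(nodes, edges):
    """Repeatedly strip nodes without incoming edges; a leftover means a cycle."""
    remaining, edges = set(nodes), set(edges)
    while remaining:
        sources = {n for n in remaining if all(v != n for _, v in edges)}
        if not sources:
            return False
        remaining -= sources
        edges = {(u, v) for u, v in edges if u not in sources}
    return True


def bic_like_score(rows, columns, edges):
    """Decomposable score: per-node log-likelihood minus a BIC-style penalty."""
    col = {name: i for i, name in enumerate(columns)}
    total = 0.0
    for child in columns:
        parents = sorted(u for u, v in edges if v == child)
        parent_cfg = [tuple(r[col[p]] for p in parents) for r in rows]
        joint = Counter(zip(parent_cfg, (r[col[child]] for r in rows)))
        marginal = Counter(parent_cfg)
        loglik = sum(c * log(c / marginal[cfg]) for (cfg, _), c in joint.items())
        # Simplification: only parent configurations seen in the data are counted.
        n_params = len(marginal) * (len({r[col[child]] for r in rows}) - 1)
        total += loglik - 0.5 * log(len(rows)) * n_params
    return total


def all_scores_sketch(rows, columns):
    """Score every DAG over `columns` and return (score, edges) pairs, ascending."""
    possible_edges = list(permutations(columns, 2))
    scored = [
        (bic_like_score(rows, columns, edges), edges)
        for k in range(len(possible_edges) + 1)
        for edges in combinations(possible_edges, k)
        if is_acyclic(columns, edges)
    ]
    return sorted(scored, key=lambda pair: pair[0])  # best DAG ends up last


columns = ["A", "B", "C"]
rows = [(a, b, b) for a in (0, 1) for b in (0, 1)] * 25  # C duplicates B
scored = all_scores_sketch(rows, columns)
best_score, best_edges = scored[-1]
print(len(scored), "DAGs scored; best structure:", best_edges)
```

On this data the top of the ranking is a single edge between B and C. Both orientations describe the data equally well, which mirrors the tie between [('B', 'C')] and [('C', 'B')] in the output above.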
- estimate()
Estimates the DAG structure that best fits the given data set, according to the scoring method supplied in the constructor. Exhaustively searches through all models. Only the network structure is estimated; no parameters are learned.
- Returns:
Estimated Model – A DAG with maximal score.
- Return type:
pgmpy.base.DAG
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from pgmpy.estimators import ExhaustiveSearch
>>> # create random data sample with 3 variables, where B and C are identical:
>>> data = pd.DataFrame(np.random.randint(0, 5, size=(5000, 2)), columns=list('AB'))
>>> data['C'] = data['B']
>>> est = ExhaustiveSearch(data)
>>> best_model = est.estimate()
>>> best_model
<pgmpy.base.DAG.DAG object at 0x7f695c535470>
>>> best_model.edges()
[('B', 'C')]
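The maximal score is not necessarily unique. In the all_scores() example output, [('B', 'C')] and [('C', 'B')] share the top score, and estimate() returns a single DAG with maximal score. A minimal sketch of how such ties could be picked out of a list of (score, edges) pairs follows; the three pairs are copied from the tail of that output (computed there with K2Score), and the variable names are illustrative.

```python
# (score, edges) pairs copied from the end of the all_scores() example output,
# kept in the same ascending order.
scored = [
    (-16268.324530291524, [("B", "A"), ("C", "A"), ("C", "B")]),
    (-16237.575725538434, [("B", "C")]),
    (-16237.575725538434, [("C", "B")]),
]

best_score = max(score for score, _ in scored)
tied = [edges for score, edges in scored if score == best_score]
print(tied)  # [[('B', 'C')], [('C', 'B')]]
```

When the direction of an edge matters for later analysis, inspecting the full all_scores() list in this way shows whether the structure returned by estimate() is the only one with that score.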