skmultiflow.meta.MonteCarloClassifierChain
Monte Carlo Sampling Classifier Chains for multi-label learning.
PCC using Monte Carlo sampling, published as ‘MCC’.
M samples are taken from the posterior distribution. This therefore requires a probabilistic interpretation of the output, which makes this a particular variety of ProbabilisticClassifierChain.
N.B. Multi-label (binary) only at this moment.
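The chain factorizes the posterior as P(y|x) = P(y_1|x) · ∏ P(y_j|x, y_1, …, y_{j-1}); Monte Carlo inference draws M label vectors from this factorization and keeps the most probable one. A minimal sketch of that idea using hypothetical fixed conditional probabilities (toy stand-ins for the trained per-label classifiers, not skmultiflow's internals):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical conditionals P(y_j = 1 | x, y_1..y_{j-1}) for a 3-label chain
# on a single instance x. In the real model each one is a trained
# probabilistic classifier; here they are fixed toy functions.
conditionals = [
    lambda prev: 0.9,                        # P(y_1=1 | x)
    lambda prev: 0.8 if prev[0] else 0.2,    # P(y_2=1 | x, y_1)
    lambda prev: 0.1 if prev[1] else 0.7,    # P(y_3=1 | x, y_1, y_2)
]

def sample_chain(rng):
    """Draw one label vector y ~ P(y|x); return it with its joint probability."""
    y, w = [], 1.0
    for cond in conditionals:
        p1 = cond(y)
        bit = int(rng.random() < p1)
        y.append(bit)
        w *= p1 if bit else 1.0 - p1
    return tuple(y), w

# Monte Carlo inference: draw M samples, keep the most probable label vector.
M = 1000
y_best, w_best = max((sample_chain(rng) for _ in range(M)), key=lambda t: t[1])
print(y_best)  # (1, 1, 0): probability 0.9 * 0.8 * 0.9 = 0.648
```

With 1000 draws, the mode (1, 1, 0) is sampled with near-certainty, so the argmax over sampled vectors recovers it; more labels or flatter conditionals would need a larger M.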
Parameters
base_estimator: The ensemble classifier type; each classifier in the ensemble is a copy of base_estimator.
M: Number of samples to take from the posterior distribution.
random_state: If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
Examples
>>> from skmultiflow.data import make_logical
>>> from skmultiflow.meta import MonteCarloClassifierChain
>>>
>>> X, Y = make_logical(random_state=1)
>>>
>>> print("TRUE: ")
>>> print(Y)
>>> print("vs")
>>> print("MCC")
>>> mcc = MonteCarloClassifierChain()
>>> mcc.fit(X, Y)
>>> Yp = mcc.predict(X, M=50)
>>> print("with 50 iterations ...")
>>> print(Yp)
>>> Yp = mcc.predict(X, 'default')
>>> print("with default (%d) iterations ..." % 1000)
>>> print(Yp)
TRUE: 
[[1. 0. 1.]
 [1. 1. 0.]
 [0. 0. 0.]
 [1. 1. 0.]]
vs
MCC
with 50 iterations ...
[[1. 0. 1.]
 [1. 1. 0.]
 [0. 0. 0.]
 [1. 1. 0.]]
with default (1000) iterations ...
[[1. 0. 1.]
 [1. 1. 0.]
 [0. 0. 0.]
 [1. 1. 0.]]
Methods
fit(self, X, y[, classes, sample_weight])
    Fit the model.
get_info(self)
    Collects and returns the information about the configuration of the estimator.
get_params(self[, deep])
    Get parameters for this estimator.
partial_fit(self, X, y[, classes, sample_weight])
    Partially (incrementally) fit the model.
predict(self, X[, M])
    Predict classes for the passed data.
predict_proba(self, X)
    Estimates the probability of each sample in X belonging to each of the class-labels.
reset(self)
    Resets the estimator to its initial state.
sample(self, x)
    Sample y ~ P(y|x).
score(self, X, y[, sample_weight])
    Returns the mean accuracy on the given test data and labels.
set_params(self, **params)
    Set the parameters of this estimator.
fit / partial_fit parameters
    X: The features to train the model.
    y: An array-like with the labels of all samples in X.
get_info returns
    Configuration of the estimator.
get_params parameters
    deep: If True, will return the parameters for this estimator and contained subobjects that are estimators.
get_params returns
    Parameter names mapped to their values.
predict parameters
    X: The set of data samples to predict the labels for.
    M: Number of sampling iterations. If None, M is set equal to the M value used for initialization.
Notes
Quite similar to the ProbabilisticClassifierChain.predict() function.
Depending on the implementation, y_max and w_max may be initially set to 0 if we wish to rely solely on the sampling. Setting w_max based on a naive CC prediction gives a good baseline to work from.
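The baseline-then-sample scheme described above can be sketched as follows, with toy conditional probabilities standing in for the trained chain (not skmultiflow's actual internals). The probabilities are chosen so that the greedy CC baseline is not the most probable label vector, which is exactly when the sampling pays off:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conditionals P(y_j = 1 | x, y_1..y_{j-1}) for a 2-label chain,
# chosen so that the greedy CC prediction is NOT the most probable vector.
conds = [
    lambda prev: 0.6,                        # P(y_1=1 | x)
    lambda prev: 0.4 if prev[0] else 0.95,   # P(y_2=1 | x, y_1)
]

def joint(y):
    """Joint probability P(y|x) of a full label vector under the chain."""
    w = 1.0
    for j, cond in enumerate(conds):
        p1 = cond(y[:j])
        w *= p1 if y[j] else 1.0 - p1
    return w

# Baseline: greedy CC prediction, thresholding each conditional at 0.5.
y_max = []
for cond in conds:
    y_max.append(int(cond(y_max) > 0.5))
y_max = tuple(y_max)        # greedy picks (1, 0): P = 0.6 * 0.6 = 0.36
w_max = joint(y_max)

# Monte Carlo refinement: keep any sampled vector that beats the baseline.
for _ in range(500):
    y = []
    for cond in conds:
        y.append(int(rng.random() < cond(y)))
    y = tuple(y)
    if joint(y) > w_max:
        y_max, w_max = y, joint(y)

print(y_max)  # (0, 1): sampling finds P = 0.4 * 0.95 = 0.38 > 0.36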
predict_proba parameters
    X: The matrix of samples one wants to predict the class probabilities for.
predict_proba returns
    Marginals [P(y_1=1|x), …, P(y_L=1|x, y_1, …, y_{L-1})], i.e., confidence predictions given inputs, for each instance.
This function is suitable for multi-label (binary) data only at the moment (it may give an index-out-of-bounds error if uni-target or multi-target data of more than 2 values is used in training).
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires the entire label set of each sample to be predicted correctly.
score parameters
    X: Test samples.
    y: True labels for X.
    sample_weight: Sample weights.
score returns
    Mean accuracy of self.predict(X) with respect to y.
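Subset accuracy counts a sample as correct only when every one of its labels matches. For illustration, it can be computed directly with NumPy on hypothetical label matrices:

```python
import numpy as np

Y_true = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 0]])
Y_pred = np.array([[1, 0, 1],
                   [1, 0, 0],   # one wrong label makes the whole row wrong
                   [0, 0, 0]])

# Subset accuracy: fraction of rows whose entire label set matches exactly.
subset_acc = np.mean(np.all(Y_true == Y_pred, axis=1))
print(subset_acc)  # 2 of 3 rows match exactly -> 0.666...
```

Per-label (Hamming) accuracy on the same arrays would be 8/9, which shows how much harsher the subset criterion is.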
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so it is possible to update each component of a nested object.
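A minimal stand-in showing how the <component>__<parameter> convention routes updates to nested estimators (a hypothetical Estimator class illustrating the convention, not skmultiflow's actual implementation):

```python
class Estimator:
    """Minimal stand-in for the shared get/set-params convention (hypothetical)."""

    def __init__(self, **params):
        for key, value in params.items():
            setattr(self, key, value)

    def set_params(self, **params):
        for key, value in params.items():
            if "__" in key:
                # <component>__<parameter>: forward to the nested estimator.
                component, _, sub_key = key.partition("__")
                getattr(self, component).set_params(**{sub_key: value})
            else:
                setattr(self, key, value)
        return self

base = Estimator(C=1.0)
chain = Estimator(base_estimator=base, M=1000)

# One call updates both the outer object and the nested base estimator.
chain.set_params(base_estimator__C=0.5, M=50)
print(chain.base_estimator.C, chain.M)  # 0.5 50
```

For MonteCarloClassifierChain this means, e.g., a hyperparameter of base_estimator can be tuned through the chain object itself without rebuilding the ensemble.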