skmultiflow.meta.MonteCarloClassifierChain
Monte Carlo Sampling Classifier Chains for multi-label learning.
PCC using Monte Carlo sampling, published as ‘MCC’.
M samples are taken from the posterior distribution. This therefore requires a probabilistic interpretation of the output, which makes this a particular variety of ProbabilisticClassifierChain.
N.B. Multi-label (binary) only at this moment.
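The chain factorizes the posterior as P(y|x) = P(y_1|x) · ∏ P(y_j|x, y_1, …, y_{j-1}); Monte Carlo inference draws M label vectors from this factorization and keeps the most probable one. A minimal sketch of that idea using hypothetical fixed conditional probabilities (toy stand-ins for the trained per-label classifiers, not skmultiflow's internals):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical conditionals P(y_j = 1 | x, y_1..y_{j-1}) for a 3-label chain
# on a single instance x. In the real model each one is a trained
# probabilistic classifier; here they are fixed toy functions.
conditionals = [
    lambda prev: 0.9,                        # P(y_1=1 | x)
    lambda prev: 0.8 if prev[0] else 0.2,    # P(y_2=1 | x, y_1)
    lambda prev: 0.1 if prev[1] else 0.7,    # P(y_3=1 | x, y_1, y_2)
]

def sample_chain(rng):
    """Draw one label vector y ~ P(y|x); return it with its joint probability."""
    y, w = [], 1.0
    for cond in conditionals:
        p1 = cond(y)
        bit = int(rng.random() < p1)
        y.append(bit)
        w *= p1 if bit else 1.0 - p1
    return tuple(y), w

# Monte Carlo inference: draw M samples, keep the most probable label vector.
M = 1000
y_best, w_best = max((sample_chain(rng) for _ in range(M)), key=lambda t: t[1])
print(y_best)  # (1, 1, 0): probability 0.9 * 0.8 * 0.9 = 0.648
```

With 1000 draws, the mode (1, 1, 0) is sampled with near-certainty, so the argmax over sampled vectors recovers it; more labels or flatter conditionals would need a larger M.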
Parameters
base_estimator: The ensemble classifier type; each classifier in the ensemble is a copy of base_estimator.
M: Number of samples to take from the posterior distribution.
random_state: If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
Examples
>>> from skmultiflow.data import make_logical
>>> from skmultiflow.meta import MonteCarloClassifierChain
>>>
>>> X, Y = make_logical(random_state=1)
>>>
>>> print("TRUE: ")
>>> print(Y)
>>> print("vs")
>>> print("MCC")
>>> mcc = MonteCarloClassifierChain()
>>> mcc.fit(X, Y)
>>> Yp = mcc.predict(X, M=50)
>>> print("with 50 iterations ...")
>>> print(Yp)
>>> Yp = mcc.predict(X, 'default')
>>> print("with default (%d) iterations ..." % 1000)
>>> print(Yp)
TRUE: 
[[1. 0. 1.]
 [1. 1. 0.]
 [0. 0. 0.]
 [1. 1. 0.]]
vs
MCC
with 50 iterations ...
[[1. 0. 1.]
 [1. 1. 0.]
 [0. 0. 0.]
 [1. 1. 0.]]
with default (1000) iterations ...
[[1. 0. 1.]
 [1. 1. 0.]
 [0. 0. 0.]
 [1. 1. 0.]]
Methods
fit(self, X, y[, classes, sample_weight])
    Fit the model.
get_info(self)
    Collects and returns the information about the configuration of the estimator.
get_params(self[, deep])
    Get parameters for this estimator.
partial_fit(self, X, y[, classes, sample_weight])
    Partially (incrementally) fit the model.
predict(self, X[, M])
    Predict classes for the passed data.
predict_proba(self, X)
    Estimates the probability of each sample in X belonging to each of the class-labels.
reset(self)
    Resets the estimator to its initial state.
sample(self, x)
    Sample y ~ P(y|x).
score(self, X, y[, sample_weight])
    Returns the mean accuracy on the given test data and labels.
set_params(self, **params)
    Set the parameters of this estimator.
fit / partial_fit parameters
    X: The features to train the model.
    y: An array-like with the labels of all samples in X.
get_info returns
    Configuration of the estimator.
get_params parameters
    deep: If True, will return the parameters for this estimator and contained subobjects that are estimators.
get_params returns
    Parameter names mapped to their values.
predict parameters
    X: The set of data samples to predict the labels for.
    M: Number of sampling iterations. If None, M is set equal to the M value used for initialization.
Notes
Quite similar to the ProbabilisticClassifierChain.predict() function.
Depending on the implementation, y_max and w_max may be initially set to 0 if we wish to rely solely on the sampling. Setting w_max based on a naive CC prediction gives a good baseline to work from.
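The baseline-then-sample scheme described above can be sketched as follows, with toy conditional probabilities standing in for the trained chain (not skmultiflow's actual internals). The probabilities are chosen so that the greedy CC baseline is not the most probable label vector, which is exactly when the sampling pays off:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conditionals P(y_j = 1 | x, y_1..y_{j-1}) for a 2-label chain,
# chosen so that the greedy CC prediction is NOT the most probable vector.
conds = [
    lambda prev: 0.6,                        # P(y_1=1 | x)
    lambda prev: 0.4 if prev[0] else 0.95,   # P(y_2=1 | x, y_1)
]

def joint(y):
    """Joint probability P(y|x) of a full label vector under the chain."""
    w = 1.0
    for j, cond in enumerate(conds):
        p1 = cond(y[:j])
        w *= p1 if y[j] else 1.0 - p1
    return w

# Baseline: greedy CC prediction, thresholding each conditional at 0.5.
y_max = []
for cond in conds:
    y_max.append(int(cond(y_max) > 0.5))
y_max = tuple(y_max)        # greedy picks (1, 0): P = 0.6 * 0.6 = 0.36
w_max = joint(y_max)

# Monte Carlo refinement: keep any sampled vector that beats the baseline.
for _ in range(500):
    y = []
    for cond in conds:
        y.append(int(rng.random() < cond(y)))
    y = tuple(y)
    if joint(y) > w_max:
        y_max, w_max = y, joint(y)

print(y_max)  # (0, 1): sampling finds P = 0.4 * 0.95 = 0.38 > 0.36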
predict_proba parameters
    X: The matrix of samples one wants to predict the class probabilities for.
predict_proba returns
    Marginals [P(y_1=1|x), …, P(y_L=1|x, y_1, …, y_{L-1})], i.e., confidence predictions given inputs, for each instance.
This function is suitable for multi-label (binary) data only at the moment (it may give an index-out-of-bounds error if uni-target or multi-target data of more than 2 values is used in training).
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires the entire label set of each sample to be predicted correctly.
score parameters
    X: Test samples.
    y: True labels for X.
    sample_weight: Sample weights.
score returns
    Mean accuracy of self.predict(X) with respect to y.
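Subset accuracy counts a sample as correct only when every one of its labels matches. For illustration, it can be computed directly with NumPy on hypothetical label matrices:

```python
import numpy as np

Y_true = np.array([[1, 0, 1],
                   [1, 1, 0],
                   [0, 0, 0]])
Y_pred = np.array([[1, 0, 1],
                   [1, 0, 0],   # one wrong label makes the whole row wrong
                   [0, 0, 0]])

# Subset accuracy: fraction of rows whose entire label set matches exactly.
subset_acc = np.mean(np.all(Y_true == Y_pred, axis=1))
print(subset_acc)  # 2 of 3 rows match exactly -> 0.666...
```

Per-label (Hamming) accuracy on the same arrays would be 8/9, which shows how much harsher the subset criterion is.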
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so it is possible to update each component of a nested object.
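A minimal stand-in showing how the <component>__<parameter> convention routes updates to nested estimators (a hypothetical Estimator class illustrating the convention, not skmultiflow's actual implementation):

```python
class Estimator:
    """Minimal stand-in for the shared get/set-params convention (hypothetical)."""

    def __init__(self, **params):
        for key, value in params.items():
            setattr(self, key, value)

    def set_params(self, **params):
        for key, value in params.items():
            if "__" in key:
                # <component>__<parameter>: forward to the nested estimator.
                component, _, sub_key = key.partition("__")
                getattr(self, component).set_params(**{sub_key: value})
            else:
                setattr(self, key, value)
        return self

base = Estimator(C=1.0)
chain = Estimator(base_estimator=base, M=1000)

# One call updates both the outer object and the nested base estimator.
chain.set_params(base_estimator__C=0.5, M=50)
print(chain.base_estimator.C, chain.M)  # 0.5 50
```

For MonteCarloClassifierChain this means, e.g., a hyperparameter of base_estimator can be tuned through the chain object itself without rebuilding the ensemble.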