skmultiflow.meta.AdditiveExpertEnsembleClassifier
Additive Expert ensemble classifier.
Parameters

n_estimators
    Maximum number of estimators to hold.
base_estimator
    Each member of the ensemble is an instance of the base estimator.
beta
    Factor by which an expert's weight is decreased when it makes a mistake.
gamma
    Weight of new experts, as a ratio of the total ensemble weight.
pruning
    Pruning strategy to use ('oldest' or 'weakest').
Notes
The Additive Expert Ensemble (AddExp) [1] is a general method for using any online learner on drifting concepts. Using the 'oldest' pruning strategy leads to known mistake and error bounds, but the 'weakest' strategy generally performs better.

Bound on mistakes when using the 'oldest' pruning strategy (Theorem 3.1 in the paper): let \(W_i\) denote the total weight of the ensemble at time step \(i\), and \(M_i\) the number of mistakes made by the ensemble at all time steps up to \(i-1\); then for any time steps \(t_1 < t_2\), provided that \(\beta + 2\gamma < 1\),

\(M_{t_2} - M_{t_1} \leq \log(W_{t_1} / W_{t_2}) / \log(2 / (1 + \beta + 2\gamma))\)
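To make the mechanism behind these bounds concrete, the following is a minimal, illustrative sketch of one AddExp update step in plain Python. The expert objects, the `make_expert` factory, and the exact bookkeeping are assumptions for illustration; the real skmultiflow implementation differs in its details.

```python
def addexp_step(experts, weights, x, y, beta=0.8, gamma=0.1,
                max_experts=5, make_expert=None, pruning='weakest'):
    """One AddExp step: predict, reweight, maybe add/prune experts, train."""
    preds = [expert.predict(x) for expert in experts]
    # Ensemble prediction: weighted vote over the experts' predicted labels.
    votes = {}
    for p, w in zip(preds, weights):
        votes[p] = votes.get(p, 0.0) + w
    y_hat = max(votes, key=votes.get)
    # Multiply the weight of every mistaken expert by beta.
    weights = [w * beta if p != y else w for p, w in zip(preds, weights)]
    if y_hat != y:
        # Ensemble mistake: add a fresh expert whose weight is
        # gamma times the current total ensemble weight.
        experts.append(make_expert())
        weights.append(gamma * sum(weights))
    if len(experts) > max_experts:
        # Prune either the oldest expert or the one with the smallest weight.
        idx = 0 if pruning == 'oldest' else min(range(len(weights)),
                                                key=weights.__getitem__)
        experts.pop(idx)
        weights.pop(idx)
    # Finally, train every expert on the new sample.
    for expert in experts:
        expert.partial_fit(x, y)
    return experts, weights, y_hat
```

The \(\beta + 2\gamma < 1\) condition above guarantees that, on an ensemble mistake, the weight lost via `beta` outweighs the weight injected via `gamma`, so the total weight shrinks and the bound follows.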
References
[1] Kolter and Maloof. Using additive expert ensembles to cope with concept drift. Proc. 22nd International Conference on Machine Learning, 2005.
Examples
>>> # Imports
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.meta import AdditiveExpertEnsembleClassifier
>>>
>>> # Setup a data stream
>>> stream = SEAGenerator(random_state=1)
>>>
>>> # Setup Additive Expert Ensemble Classifier
>>> add_exp = AdditiveExpertEnsembleClassifier()
>>>
>>> # Setup variables to control loop and track performance
>>> n_samples = 0
>>> correct_cnt = 0
>>> max_samples = 200
>>>
>>> # Train the classifier with the samples provided by the data stream
>>> while n_samples < max_samples and stream.has_more_samples():
>>>     X, y = stream.next_sample()
>>>     y_pred = add_exp.predict(X)
>>>     if y[0] == y_pred[0]:
>>>         correct_cnt += 1
>>>     add_exp.partial_fit(X, y)
>>>     n_samples += 1
>>>
>>> # Display results
>>> print('{} samples analyzed'.format(n_samples))
>>> print('Additive Expert Ensemble accuracy: {}'.format(correct_cnt / n_samples))
Methods
fit(self, X, y[, classes, sample_weight])
    Fit the model.

fit_single_sample(self, X, y[, classes, …])
    Predict + update weights + modify experts + train on new sample.

get_expert_predictions(self, X)
    Returns the prediction of each expert for the given sample(s).

get_info(self)
    Collects and returns the information about the configuration of the estimator.

get_params(self[, deep])
    Get parameters for this estimator.

partial_fit(self, X, y[, classes, sample_weight])
    Partially fits the model on the supplied X and y matrices.

predict(self, X)
    Predicts the class labels of X in a general classification setting.

predict_proba(self, X)
    Not implemented for this method.

reset(self)
    Resets the estimator to its initial state.

score(self, X, y[, sample_weight])
    Returns the mean accuracy on the given test data and labels.

set_params(self, **params)
    Set the parameters of this estimator.
WeightedExpert
Wrapper that includes an estimator and its weight.
estimator
    The estimator to wrap.
weight
    The estimator's weight.
X
    The features to train the model.
y
    An array-like with the class labels of all samples in X.
classes
    Contains all possible/known class labels. Usage varies depending on the learning method.
sample_weight
    Sample weights. If not provided, uniform weights are assumed. Usage varies depending on the learning method.
Predict + update weights + modify experts + train on new sample. (As described in the original paper.)
Returns the prediction of each expert for the given sample(s), in shape: (n_experts,)
Configuration of the estimator.
deep
    If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns: Parameter names mapped to their values.
Since it is an ensemble learner, if X and y matrices with more than one sample are passed, the algorithm will partial-fit the model one sample at a time.

X
    Features matrix used for partially updating the model.
y
    An array-like of all the class labels for the samples in X.
classes
    Array with all possible/known class labels.

Returns: self
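The sample-by-sample dispatch described above can be sketched as follows; `fit_single_sample` is the single-sample update (used here as a hypothetical stand-in for the library's internal routine):

```python
import numpy as np

def partial_fit_batch(model, X, y):
    """Feed a batch to an incremental model one sample at a time."""
    X, y = np.asarray(X), np.asarray(y)
    for i in range(X.shape[0]):
        # Each row is forwarded as a 1-sample batch.
        model.fit_single_sample(X[i:i + 1], y[i:i + 1])
    return model
```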
The predict function will take an average of the predictions of its learners, weighted by their respective weights, and return the most likely class.
X
    A matrix of the samples we want to predict.

Returns: A numpy.ndarray with the label prediction for all the samples in X.
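The weighted vote can be sketched as below (names are illustrative, not the library's internal code): each expert's weight is added to the score of the class it predicts, and the highest-scoring class wins per sample.

```python
import numpy as np

def weighted_vote(expert_preds, weights, n_classes):
    """expert_preds: shape (n_experts, n_samples) of predicted labels."""
    expert_preds = np.asarray(expert_preds)
    scores = np.zeros((expert_preds.shape[1], n_classes))
    for preds, w in zip(expert_preds, weights):
        # Add this expert's weight to the class it predicted, per sample.
        scores[np.arange(len(preds)), preds] += w
    return scores.argmax(axis=1)
```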
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
X
    Test samples.
y
    True labels for X.
sample_weight
    Sample weights.

Returns: Mean accuracy of self.predict(X) w.r.t. y.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that it is possible to update each component of a nested object.
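An illustrative helper (not library code) showing how keys of the form <component>__<parameter> split into a component name and the parameter to set on that nested component; the example key names are hypothetical:

```python
def split_nested_params(params):
    """Separate top-level parameters from '<component>__<parameter>' keys."""
    flat, nested = {}, {}
    for key, value in params.items():
        if '__' in key:
            component, sub_key = key.split('__', 1)
            nested.setdefault(component, {})[sub_key] = value
        else:
            flat[key] = value
    return flat, nested
```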