skmultiflow.meta.BatchIncrementalClassifier
Batch Incremental ensemble classifier.
This is a wrapper that allows the application of any batch model to a stream by incrementally building an ensemble of instances of the batch model. A window of examples is collected, then used to train a new model, which is added to the ensemble. A maximum number of models ensures memory use is finite (the oldest model is deleted when this number is exceeded).
Each member of the ensemble is an instance of the base estimator.
The size of the training window (batch), that is, how many instances are collected before a new model is trained.
Number of estimators in the ensemble.
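The windowing behaviour described above can be sketched in plain Python. This is an illustration only, not the skmultiflow implementation: the class names BatchIncrementalSketch and MajorityClassModel are hypothetical, the argument names window_size and n_estimators are chosen for this sketch, and combining members by majority vote is an assumption.

```python
from collections import Counter, deque


class MajorityClassModel:
    """Hypothetical stand-in for a batch model: predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class BatchIncrementalSketch:
    """Illustrative wrapper: one new batch model per full window, bounded ensemble."""

    def __init__(self, window_size, n_estimators, make_model=MajorityClassModel):
        self.window_size = window_size
        self.make_model = make_model
        # deque(maxlen=...) drops the oldest model once the ensemble is full.
        self.ensemble = deque(maxlen=n_estimators)
        self.X_batch, self.y_batch = [], []

    def partial_fit(self, X, y):
        for x_i, y_i in zip(X, y):
            self.X_batch.append(x_i)
            self.y_batch.append(y_i)
            if len(self.X_batch) == self.window_size:
                # Window is full: train a fresh model on it, then start a new window.
                model = self.make_model().fit(self.X_batch, self.y_batch)
                self.ensemble.append(model)
                self.X_batch, self.y_batch = [], []
        return self

    def predict(self, X):
        if not self.ensemble:
            return [None for _ in X]  # no model has been trained yet
        member_preds = [m.predict(X) for m in self.ensemble]
        # Assumed combination rule: majority vote across members, per sample.
        return [Counter(votes).most_common(1)[0][0] for votes in zip(*member_preds)]
```

With window_size=3 and n_estimators=2, feeding nine samples trains three models in turn, and only the two most recent remain in the ensemble.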
Notes
Not yet multi-label capable.
Examples
>>> # Imports
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.meta import BatchIncrementalClassifier
>>>
>>> # Setup a data stream
>>> stream = SEAGenerator(random_state=1)
>>>
>>> # Pre-training the classifier with 200 samples
>>> X, y = stream.next_sample(200)
>>> batch_incremental_cfier = BatchIncrementalClassifier()
>>> batch_incremental_cfier.partial_fit(X, y)
>>>
>>> # Preparing the processing of 5000 samples and correct prediction count
>>> n_samples = 0
>>> correct_cnt = 0
>>> while n_samples < 5000 and stream.has_more_samples():
...     X, y = stream.next_sample()
...     y_pred = batch_incremental_cfier.predict(X)
...     if y[0] == y_pred[0]:
...         correct_cnt += 1
...     batch_incremental_cfier.partial_fit(X, y)
...     n_samples += 1
>>>
>>> # Display results
>>> print('Batch Incremental ensemble classifier example')
>>> print('{} samples analyzed'.format(n_samples))
>>> print('Performance: {}'.format(correct_cnt / n_samples))
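The loop in the example follows a test-then-train pattern: each sample is predicted first and only then used for training. A minimal sketch of that pattern as a reusable helper is given here. The names prequential_accuracy and LastLabelModel are hypothetical, and the helper assumes only that the model exposes predict and partial_fit.

```python
def prequential_accuracy(model, samples):
    """Test-then-train: predict each sample before the model learns from it."""
    correct = 0
    total = 0
    for x, y in samples:
        if model.predict([x])[0] == y:
            correct += 1
        model.partial_fit([x], [y])
        total += 1
    # Guard against an empty stream.
    return correct / total if total else 0.0


class LastLabelModel:
    """Hypothetical toy learner: always predicts the most recently seen label."""

    def __init__(self):
        self.last = None

    def partial_fit(self, X, y):
        self.last = y[-1]
        return self

    def predict(self, X):
        return [self.last for _ in X]
```

Because every prediction is made before the corresponding label is seen, the resulting accuracy is not inflated by training on the evaluated samples.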
Methods
fit(self, X, y[, classes, sample_weight]): Fit the model.
get_info(self): Collects and returns the information about the configuration of the estimator.
get_params(self[, deep]): Get parameters for this estimator.
partial_fit(self, X[, y, classes, sample_weight]): Partially (incrementally) fit the model.
predict(self, X): Predict classes for the passed data.
predict_proba(self, X): Estimates the probability of each sample in X belonging to each of the class-labels.
reset(self): Resets the estimator to its initial state.
score(self, X, y[, sample_weight]): Returns the mean accuracy on the given test data and labels.
set_params(self, **params): Set the parameters of this estimator.
The features to train the model.
An array-like with the class labels of all samples in X.
Contains all possible/known class labels. Usage varies depending on the learning method.
Sample weights. If not provided, uniform weights are assumed. Usage varies depending on the learning method.
Configuration of the estimator.
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Parameter names mapped to their values.
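What deep=True adds to the returned mapping can be sketched with two toy classes. This follows the scikit-learn convention that this API appears to mirror; the classes Leaf and Ensemble and their parameters are hypothetical and exist only for illustration.

```python
class Leaf:
    """Hypothetical inner estimator with a single parameter."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth

    def get_params(self, deep=True):
        return {"max_depth": self.max_depth}


class Ensemble:
    """Hypothetical outer estimator that contains another estimator."""

    def __init__(self, base_estimator, window_size=10):
        self.base_estimator = base_estimator
        self.window_size = window_size

    def get_params(self, deep=True):
        params = {
            "base_estimator": self.base_estimator,
            "window_size": self.window_size,
        }
        if deep:
            # Also expose parameters of contained estimators, keyed as
            # <component>__<parameter>.
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        params[name + "__" + sub_name] = sub_value
        return params
```

In this sketch, Ensemble(Leaf()).get_params(deep=False) has the keys base_estimator and window_size, while deep=True additionally includes base_estimator__max_depth.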
An array-like with the labels of all samples in X.
Sample weights. If not provided, uniform weights are assumed.
The set of data samples to predict the labels for.
The matrix of samples one wants to predict the class probabilities for.
In multi-label classification, this is the subset accuracy, which is a harsh metric because it requires the entire label set of each sample to be predicted correctly.
Test samples.
True labels for X.
Sample weights.
Mean accuracy of self.predict(X) with respect to y.
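To see why subset accuracy is described as harsh, the following sketch compares it with a per-label accuracy on the same predictions. Both function names are hypothetical helpers written for this illustration, not part of the library.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label set is predicted exactly."""
    matches = [tuple(t) == tuple(p) for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)


def per_label_accuracy(y_true, y_pred):
    """Fraction of individual labels predicted correctly, for comparison."""
    pairs = [
        (t, p)
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    ]
    return sum(t == p for t, p in pairs) / len(pairs)


# Two samples with three labels each; a single label is wrong in the second sample.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 1], [0, 1, 1]]
```

Here five of the six individual labels are correct, yet subset_accuracy(y_true, y_pred) is 0.5, because one wrong label makes the whole second sample count as a miss.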
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
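A minimal sketch of how a name of the form <component>__<parameter> can be routed to a nested object is shown here. The function set_nested_params and the classes Tree and Wrapper are hypothetical; this illustrates the naming convention and is not the library's own code.

```python
class Tree:
    """Hypothetical inner estimator."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth


class Wrapper:
    """Hypothetical outer estimator holding a nested one."""

    def __init__(self, base_estimator, window_size=100):
        self.base_estimator = base_estimator
        self.window_size = window_size


def set_nested_params(obj, **params):
    """Set plain parameters directly and route '<component>__<parameter>' inward."""
    for name, value in params.items():
        component, sep, sub_name = name.partition("__")
        if sep:
            # Delegate everything after the first '__' to the named component.
            set_nested_params(getattr(obj, component), **{sub_name: value})
        else:
            if not hasattr(obj, name):
                raise ValueError("Invalid parameter: " + name)
            setattr(obj, name, value)
    return obj
```

For example, set_nested_params(wrapper, window_size=10, base_estimator__max_depth=5) updates the wrapper's own window_size and the max_depth of the estimator it contains.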