skmultiflow.meta.MultiOutputLearner

class skmultiflow.meta.MultiOutputLearner(base_estimator=SGDClassifier(max_iter=100))

Multi-Output Learner for multi-target classification or regression.

Parameters

base_estimator: skmultiflow.core.BaseSKMObject or sklearn.BaseEstimator (default=SGDClassifier(max_iter=100)): Each member of the ensemble is an instance of the base estimator.

Notes

Use this meta learner to make single-output predictors capable of learning a multi-output problem by applying them individually to each output. In the classification context, this is the “binary relevance” estimator.

A Multi-Output Learner model learns to predict multiple outputs for each instance. The outputs may be either discrete (i.e., classification) or continuous (i.e., regression). This estimator takes any base learner (by default SGDClassifier(max_iter=100), as shown in the class signature), builds a separate model for each output, and passes each instance to every model for individual learning and prediction.
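
The per-output decomposition described above can be sketched in a few lines. This is a minimal illustration and not the library's implementation: `BinaryRelevanceSketch` and `MajorityClassLearner` are hypothetical names introduced here, and the toy base learner simply predicts the most frequent label it has seen.

```python
import copy
from collections import Counter

import numpy as np


class MajorityClassLearner:
    """Hypothetical toy base learner: predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def partial_fit(self, X, y):
        self.counts.update(y.tolist())
        return self

    def predict(self, X):
        label = self.counts.most_common(1)[0][0]
        return np.full(len(X), label)


class BinaryRelevanceSketch:
    """Hypothetical sketch: one independent copy of the base learner per output."""

    def __init__(self, base_estimator):
        self.base_estimator = base_estimator
        self.ensemble = None

    def partial_fit(self, X, y):
        if self.ensemble is None:
            # Create one model per output column on the first call.
            self.ensemble = [copy.deepcopy(self.base_estimator)
                             for _ in range(y.shape[1])]
        for j, model in enumerate(self.ensemble):
            # Each model is trained only on its own output column.
            model.partial_fit(X, y[:, j])
        return self

    def predict(self, X):
        # Stack per-output predictions into shape (n_samples, n_targets).
        return np.column_stack([m.predict(X) for m in self.ensemble])


X = np.zeros((4, 2))
y = np.array([[1, 0], [1, 0], [0, 0], [1, 1]])
model = BinaryRelevanceSketch(MajorityClassLearner()).partial_fit(X, y)
pred = model.predict(X[:2])
```

Because every output gets its own model, any dependence between outputs is ignored; that is the usual trade-off of the binary relevance approach.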

Examples

>>> from skmultiflow.meta.multi_output_learner import MultiOutputLearner
>>> from skmultiflow.data.file_stream import FileStream
>>> from skmultiflow.metrics import hamming_score
>>> from sklearn.linear_model import Perceptron
>>> # Setup the file stream
>>> stream = FileStream("https://raw.githubusercontent.com/scikit-multiflow/"
...                     "streaming-datasets/master/moving_squares.csv", 0, 6)
>>> # Setup the MultiOutputLearner using sklearn Perceptron
>>> classifier = MultiOutputLearner(base_estimator=Perceptron())
>>> # Pre-train the classifier with 150 samples
>>> X, y = stream.next_sample(150)
>>> classifier.partial_fit(X, y, classes=stream.target_values)
>>> # Keeping track of sample count, true labels and predictions to later 
>>> # compute the classifier's hamming score
>>> count = 0
>>> true_labels = []
>>> predicts = []
>>> while stream.has_more_samples():
...     X, y = stream.next_sample()
...     p = classifier.predict(X)
...     classifier.partial_fit(X, y)
...     predicts.extend(p)
...     true_labels.extend(y)
...     count += 1
>>>
>>> perf = hamming_score(true_labels, predicts)
>>> print('Total samples analyzed: ' + str(count))
>>> print("The classifier's static Hamming score    : " + str(perf))
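
For reference, the Hamming score as commonly defined is the fraction of individual (sample, output) positions that are predicted correctly. The following NumPy sketch illustrates that definition; `hamming_score_sketch` is a hypothetical helper written for this page and is not the library's own function.

```python
import numpy as np


def hamming_score_sketch(true_labels, predicts):
    """Fraction of (sample, output) positions where prediction equals truth."""
    true_labels = np.asarray(true_labels)
    predicts = np.asarray(predicts)
    return float(np.mean(true_labels == predicts))


# Hand-made labels: 2 samples, 3 outputs; one position differs.
y_true = [[1, 0, 1], [0, 1, 1]]
y_pred = [[1, 0, 0], [0, 1, 1]]
score = hamming_score_sketch(y_true, y_pred)  # 5 of 6 positions match
```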

Methods

`fit`(self, X, y[, classes, sample_weight])	Fit the model.
`get_info`(self)	Collects and returns the information about the configuration of the estimator
`get_params`(self[, deep])	Get parameters for this estimator.
`partial_fit`(self, X, y[, classes, sample_weight])	Partially (incrementally) fit the model.
`predict`(self, X)	Predict target values for the passed data.
`predict_proba`(self, X)	Estimates the probability of each sample in X belonging to each of the existing labels for each of the classification tasks.
`reset`(self)	Resets the estimator to its initial state.
`score`(self, X, y[, sample_weight])	Returns the mean accuracy on the given test data and labels.
`set_params`(self, **params)	Set the parameters of this estimator.

fit(self, X, y, classes=None, sample_weight=None)

Fit the model.

Fits n estimators, one for each learning task (output).

Parameters

X: numpy.ndarray of shape (n_samples, n_features): The features to train the model.
y: numpy.ndarray of shape (n_samples, n_targets): An array-like with the target values of all samples in X.
classes: numpy.ndarray, optional (default=None): Array with all possible/known class labels. Usage varies depending on the base estimator. Not used for regression.
sample_weight: numpy.ndarray of shape (n_samples), optional (default=None): Sample weights. If not provided, uniform weights are assumed. Usage varies depending on the base estimator.

Returns

MultiOutputLearner: self

get_info(self)

Collects and returns information about the configuration of the estimator.

Returns

string: Configuration of the estimator.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters

deep: boolean, optional (default=True): If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params: mapping of string to any: Parameter names mapped to their values.

partial_fit(self, X, y, classes=None, sample_weight=None)

Partially (incrementally) fit the model.

Partially fit each of the estimators on the X matrix and the corresponding y matrix.

Parameters

X: numpy.ndarray of shape (n_samples, n_features): The features to train the model.
y: numpy.ndarray of shape (n_samples, n_targets): An array-like with the target values of all samples in X.
classes: numpy.ndarray, optional (default=None): Array with all possible/known class labels. This is an optional parameter, except for the first partial_fit call where it is compulsory. Not used for regression.
sample_weight: numpy.ndarray of shape (n_samples), optional (default=None): Sample weights. If not provided, uniform weights are assumed. Usage varies depending on the base estimator.

Returns

MultiOutputLearner: self

predict(self, X)

Predict target values for the passed data.

Iterates over all the estimators, predicting with each one, to obtain the multi-output prediction.

Parameters

X: numpy.ndarray of shape (n_samples, n_features): The set of data samples to predict the target values for.

Returns

numpy.ndarray of shape (n_samples, n_targets): All the predictions for the samples in X.

predict_proba(self, X)

Estimates the probability of each sample in X belonging to each of the existing labels for each of the classification tasks.

This method calls predict_proba on each of the underlying classifiers and returns the probabilities for all the classification problems.

Not applicable for regression tasks.

Parameters

X: numpy.ndarray of shape (n_samples, n_features): The set of data samples to predict the class labels for.

Returns

numpy.ndarray: An array of shape (n_samples, n_classification_tasks, n_labels), in which we store the probability that each sample in X belongs to each of the labels, in each of the classification tasks.
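
To make the documented return shape concrete, the sketch below stacks per-task probability arrays along a new middle axis. The per-task arrays are made-up values standing in for each base classifier's output, and the sketch assumes every task has the same number of labels.

```python
import numpy as np

# Made-up per-task probabilities: 3 samples, 2 labels per task.
proba_task_0 = np.array([[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]])
proba_task_1 = np.array([[0.2, 0.8], [0.7, 0.3], [0.1, 0.9]])

# Stack along axis 1 to get (n_samples, n_classification_tasks, n_labels).
proba = np.stack([proba_task_0, proba_task_1], axis=1)

# Index of the most probable label per sample and task,
# shape (n_samples, n_classification_tasks). Ties resolve to the first index.
label_idx = proba.argmax(axis=2)
```

Indexing `proba[i, t]` then gives the label distribution of sample `i` for task `t`.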

reset(self)

Resets the estimator to its initial state.

Returns

self

score(self, X, y, sample_weight=None)

Returns the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, a harsh metric because a sample counts as correct only if its entire label set is predicted correctly.

Parameters

X: array-like, shape = (n_samples, n_features): Test samples.
y: array-like, shape = (n_samples) or (n_samples, n_outputs): True labels for X.
sample_weight: array-like, shape = (n_samples), optional: Sample weights.

Returns

score: float: Mean accuracy of self.predict(X) with respect to y.
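
The difference between subset accuracy and a per-label measure can be seen on a small hand-made example. This NumPy sketch is illustrative only and does not call the estimator; the label arrays are invented for the purpose.

```python
import numpy as np

y_true = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
y_pred = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0]])

# Subset accuracy: a sample counts only if every output is correct.
subset_accuracy = np.all(y_true == y_pred, axis=1).mean()

# Per-label accuracy for comparison: every correct position counts.
per_label_accuracy = (y_true == y_pred).mean()
```

Here only the first sample is fully correct, so the subset accuracy is 1/3, while 7 of the 9 individual labels match.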

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self
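
The `<component>__<parameter>` naming convention can be illustrated with a small pure-Python sketch. `split_nested_params` is a hypothetical helper, not part of the library, and the parameter names in the example call (including `some_flag`) are placeholders chosen for illustration.

```python
def split_nested_params(params):
    """Group 'component__parameter' keys by component.

    Keys without a double underscore are collected under the empty string.
    """
    grouped = {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


# Placeholder parameter names, used only to show the grouping.
grouped = split_nested_params({
    "base_estimator__max_iter": 50,
    "base_estimator__tol": 1e-3,
    "some_flag": True,
})
```

In this sketch, keys prefixed with `base_estimator__` would be forwarded to the nested base estimator, while the remaining key applies to the outer object.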