skmultiflow.core.Pipeline
[Experimental] Holds a sequence of operations (transforms), followed by a single estimator.
It allows for easy manipulation of datasets that require several transformation steps before being used by a learner. It also allows several steps to be cross-validated together.
Each intermediate step should extend the BaseTransform class, or at least implement either both transform and partial_fit, or partial_fit_transform.
The last step should be an estimator (learner), so it should implement at least partial_fit and predict.
Because its last step is an estimator, the Pipeline itself behaves like an estimator: it can be passed directly to evaluation objects as if it were a learner.
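The step contract described above can be illustrated without the library. The following is a minimal pure-Python sketch, not scikit-multiflow's implementation: RunningMaxScaler, NearestCentroid1D, and TinyPipeline are hypothetical names invented for this example, and the data is made up. It only shows how a transform exposing partial_fit and transform can be chained with a learner exposing partial_fit and predict, so that the chain itself looks like a learner.

```python
# Hypothetical stand-ins for the duck-typed step contract; none of these
# classes come from scikit-multiflow.

class RunningMaxScaler:
    """Intermediate step: implements partial_fit and transform."""

    def __init__(self):
        self.max_abs = 0.0

    def partial_fit(self, X, y=None):
        # Track the largest absolute value seen so far in the single feature.
        for row in X:
            self.max_abs = max(self.max_abs, abs(row[0]))
        return self

    def transform(self, X):
        scale = self.max_abs or 1.0
        return [[row[0] / scale] for row in X]


class NearestCentroid1D:
    """Final step: implements partial_fit and predict."""

    def __init__(self):
        self.sums = {}
        self.counts = {}

    def partial_fit(self, X, y, classes=None):
        # Accumulate per-class sums so class means can be updated incrementally.
        for row, label in zip(X, y):
            self.sums[label] = self.sums.get(label, 0.0) + row[0]
            self.counts[label] = self.counts.get(label, 0) + 1
        return self

    def predict(self, X):
        def nearest(value):
            return min(self.sums,
                       key=lambda c: abs(value - self.sums[c] / self.counts[c]))
        return [nearest(row[0]) for row in X]


class TinyPipeline:
    """Chains (name, step) tuples and exposes the learner interface itself."""

    def __init__(self, steps):
        self.steps = steps

    def partial_fit(self, X, y, classes=None):
        Xt = X
        for _, step in self.steps[:-1]:
            Xt = step.partial_fit(Xt, y).transform(Xt)
        self.steps[-1][1].partial_fit(Xt, y, classes=classes)
        return self

    def predict(self, X):
        Xt = X
        for _, step in self.steps[:-1]:
            Xt = step.transform(Xt)
        return self.steps[-1][1].predict(Xt)


pipe = TinyPipeline([("scale", RunningMaxScaler()),
                     ("centroid", NearestCentroid1D())])
pipe.partial_fit([[1.0], [2.0], [9.0], [10.0]], [0, 0, 1, 1], classes=[0, 1])
predictions = pipe.predict([[1.5], [8.0]])
print(predictions)  # [0, 1]
```

Because TinyPipeline exposes partial_fit and predict, any code written against a learner can use it without knowing that a transform runs first.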
steps: List of tuples containing the transforms and the final estimator. The list does not need to contain any transform, but the final estimator is required. Each tuple should have the form ('name', estimator).
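As an illustration of the expected list format, the sketch below checks a list of ('name', estimator) tuples against the requirements stated above. The check_steps helper and the Passthrough and ConstantLearner classes are hypothetical and written only for this example. Whether and how the library validates its input is not stated here, so this should not be read as its actual behavior.

```python
def check_steps(steps):
    """Return a name -> step dict after checking the duck-typed contract."""
    if not steps:
        raise ValueError("the final estimator is required")
    transforms, (final_name, final) = steps[:-1], steps[-1]
    for name, step in transforms:
        # A transform needs transform + partial_fit, or partial_fit_transform.
        has_pair = hasattr(step, "transform") and hasattr(step, "partial_fit")
        if not (has_pair or hasattr(step, "partial_fit_transform")):
            raise TypeError(f"step {name!r} is not a usable transform")
    for method in ("partial_fit", "predict"):
        if not hasattr(final, method):
            raise TypeError(f"final step {final_name!r} lacks {method}()")
    return dict(steps)


class Passthrough:
    """Hypothetical transform that leaves the data unchanged."""

    def partial_fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class ConstantLearner:
    """Hypothetical learner that always predicts class 0."""

    def partial_fit(self, X, y, classes=None):
        return self

    def predict(self, X):
        return [0 for _ in X]


# A transform followed by an estimator, and an estimator on its own.
named = check_steps([("identity", Passthrough()),
                     ("constant", ConstantLearner())])
estimator_only = check_steps([("constant", ConstantLearner())])
```

The returned dictionary gives access to each step by name, which is the kind of lookup named_steps is described as providing.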
Notes
This code is an experimental feature. Use with caution.
Examples
>>> # Imports
>>> from skmultiflow.lazy import KNNADWINClassifier
>>> from skmultiflow.core import Pipeline
>>> from skmultiflow.data import FileStream
>>> from skmultiflow.evaluation import EvaluatePrequential
>>> from skmultiflow.transform import OneHotToCategorical
>>> # Set up the stream
>>> stream = FileStream("https://raw.githubusercontent.com/scikit-multiflow/"
...                     "streaming-datasets/master/covtype.csv")
>>> # Columns 10-13 and 14-53 each hold one one-hot encoded attribute
>>> transform = OneHotToCategorical([list(range(10, 14)),
...                                  list(range(14, 54))])
>>> # Set up the classifier
>>> classifier = KNNADWINClassifier(n_neighbors=8, max_window_size=2000, leaf_size=40)
>>> # Set up the pipeline
>>> pipe = Pipeline([('transform', transform), ('knn_adwin', classifier)])
>>> # Set up the evaluator
>>> evaluator = EvaluatePrequential(show_plot=True, pretrain_size=1000, max_samples=500000)
>>> # Evaluate
>>> evaluator.evaluate(stream=stream, model=pipe)
Methods
fit(self, X, y)
Sequentially fits and transforms data in all but the last step, then fits the model in the last step.
get_info(self)
Collects and returns the information about the configuration of the estimator.
get_params(self[, deep])
Gets the parameters for this estimator.
named_steps(self)
Generates a dictionary to access all the steps' properties.
partial_fit(self, X, y[, classes])
Sequentially partial fits and transforms data in all but the last step, then partial fits data in the last step.
partial_fit_predict(self, X, y)
Partial fits and transforms data in all but the last step, then partial fits and predicts in the last step.
partial_fit_transform(self, X[, y])
Partial fits and transforms data in all but the last step, then calls partial_fit in the last step.
predict(self, X)
Sequentially applies all transforms, then predicts with the last step.
reset(self)
Resets the estimator to its initial state.
set_params(self, **params)
Sets the parameters of this estimator.
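Of the methods listed above, predict and partial_fit are the two that a test-then-train (prequential) loop relies on. The sketch below shows such a loop in plain Python. MajorityClassLearner and prequential_accuracy are hypothetical names for this example, the batches are made up, and the loop is a simplification, not the code of EvaluatePrequential. It also passes classes only on the first partial_fit call.

```python
class MajorityClassLearner:
    """Hypothetical learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def partial_fit(self, X, y, classes=None):
        # Register the known labels when they are provided (first call).
        if classes is not None:
            for label in classes:
                self.counts.setdefault(label, 0)
        for label in y:
            self.counts[label] = self.counts.get(label, 0) + 1
        return self

    def predict(self, X):
        best = max(self.counts, key=self.counts.get)
        return [best for _ in X]


def prequential_accuracy(model, batches, classes, pretrain=1):
    """Test on each batch first, then train on it (after `pretrain` batches)."""
    correct = total = 0
    for i, (X, y) in enumerate(batches):
        if i >= pretrain:
            predictions = model.predict(X)
            correct += sum(p == t for p, t in zip(predictions, y))
            total += len(y)
        # `classes` is passed only on the first call.
        model.partial_fit(X, y, classes=classes if i == 0 else None)
    return correct / total if total else 0.0


batches = [
    ([[0], [1]], [1, 1]),   # used for pretraining only
    ([[2], [3]], [1, 0]),
    ([[4], [5]], [0, 0]),
]
accuracy = prequential_accuracy(MajorityClassLearner(), batches, classes=[0, 1])
print(accuracy)  # 0.25
```

Any object with the same two methods, including a pipeline whose last step is a learner, could be passed as model.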
fit(self, X, y)
X: The data upon which the transforms/estimator will create their model.
y: Contains the true class labels for all the samples in X.
Returns: self
get_info(self)
Returns: Configuration of the estimator.
get_params(self[, deep])
deep: If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: Parameter names mapped to their values.
named_steps(self)
Returns: A steps dictionary, so that each step can be accessed by name.
partial_fit(self, X, y[, classes])
X: The features to train the model.
y: An array-like with the class labels of all samples in X.
classes: Array with all possible/known class labels. This is an optional parameter, except for the first partial_fit call where it is compulsory.
partial_fit_predict(self, X, y)
X: All the samples we want to predict the label for.
y: Contains the true class labels for all the samples in X.
Returns: The predicted class label for all the samples in X.
set_params(self, **params)
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that each component of a nested object can be updated.
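The <component>__<parameter> naming convention can be sketched as follows. This is an illustrative, simplified parser under stated assumptions, not the library's set_params: set_nested_params and the Knn class are hypothetical names, and only one level of nesting is handled.

```python
class Knn:
    """Hypothetical component with a single tunable parameter."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors


def set_nested_params(named_steps, **params):
    """Route each '<component>__<parameter>' key to the matching step."""
    for key, value in params.items():
        # Split on the first double underscore only.
        component, separator, parameter = key.partition("__")
        if not separator:
            raise ValueError(f"{key!r} is not of the form <component>__<parameter>")
        if component not in named_steps:
            raise KeyError(f"unknown component {component!r}")
        setattr(named_steps[component], parameter, value)
    return named_steps


steps_by_name = {"knn": Knn()}
set_nested_params(steps_by_name, knn__n_neighbors=8)
print(steps_by_name["knn"].n_neighbors)  # 8
```

Splitting on the first double underscore keeps the component name separate from the parameter name, which may itself contain single underscores such as n_neighbors.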