skmultiflow.bayes.NaiveBayes
Naive Bayes classifier.
Performs classic Bayesian prediction under the naive assumption that all input attributes are independent of one another given the class. Naive Bayes is a classification algorithm known for its simplicity and low computational cost. Given n different classes, the trained Naive Bayes classifier predicts, for every unlabelled instance, the class to which it most probably belongs.
nominal_attributes: List of nominal attributes. If empty, all attributes are assumed to be numerical.
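The independence assumption described above means the posterior for a class is proportional to the class prior multiplied by one likelihood term per attribute. The following is a minimal pure-Python sketch of that idea for nominal attributes only. It is not scikit-multiflow's implementation: the helper names `train_counts` and `posterior`, the toy data, and the use of raw counts without smoothing are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_counts(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def posterior(row, class_counts, value_counts):
    """Return normalized P(class | row) under the independence assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for i, value in enumerate(row):
            # Likelihood P(x_i | class) from raw counts (no smoothing here).
            score *= value_counts[(i, label)][value] / n_label
        scores[label] = score
    norm = sum(scores.values())
    if norm == 0:
        return scores  # every class scored zero; nothing to normalize
    return {label: s / norm for label, s in scores.items()}


# Toy data set (hypothetical): two nominal attributes, two classes.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "hot")]
labels = ["no", "no", "yes", "no"]

class_counts, value_counts = train_counts(rows, labels)
probs = posterior(("rainy", "mild"), class_counts, value_counts)
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

Because the counts can be updated one instance at a time, this style of model lends itself to incremental training, which is why the per-instance cost stays low.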
Notes
The scikit-learn implementations of NaiveBayes are compatible with scikit-multiflow with the caveat that they must be partially fitted before use. In the scikit-multiflow evaluators this is done by setting pretrain_size>0.
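As a rough illustration of the caveat above, the sketch below partially fits a scikit-learn GaussianNB on a small batch before asking it for predictions. It assumes scikit-learn and NumPy are installed; the synthetic data and variable names are made up for the example. In the scikit-multiflow evaluators, the pretraining step is what pretrain_size>0 takes care of.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.RandomState(1)

# Synthetic pretraining batch (an assumption for illustration only).
X_pretrain = rng.normal(size=(50, 3))
y_pretrain = (X_pretrain[:, 0] > 0).astype(int)

clf = GaussianNB()
# The first partial_fit call must be told every class that can appear.
clf.partial_fit(X_pretrain, y_pretrain, classes=np.array([0, 1]))

# Only now is it safe to predict on incoming samples.
X_new = rng.normal(size=(1, 3))
y_pred = clf.predict(X_new)
print(y_pred)
```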
Examples
>>> # Imports
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.bayes import NaiveBayes
>>>
>>> # Setup a data stream
>>> stream = SEAGenerator(random_state=1)
>>>
>>> # Setup Naive Bayes estimator
>>> naive_bayes = NaiveBayes()
>>>
>>> # Setup variables to control loop and track performance
>>> n_samples = 0
>>> correct_cnt = 0
>>> max_samples = 200
>>>
>>> # Test, then train, the estimator on each sample provided by the data stream
>>> while n_samples < max_samples and stream.has_more_samples():
...     X, y = stream.next_sample()
...     y_pred = naive_bayes.predict(X)
...     if y[0] == y_pred[0]:
...         correct_cnt += 1
...     naive_bayes.partial_fit(X, y)
...     n_samples += 1
>>>
>>> # Display results
>>> print('{} samples analyzed.'.format(n_samples))
>>> print('Naive Bayes accuracy: {}'.format(correct_cnt / n_samples))
Methods
fit(self, X, y[, classes, sample_weight]): Fit the model.
get_info(self): Collects and returns information about the configuration of the estimator.
get_params(self[, deep]): Get parameters for this estimator.
partial_fit(self, X, y[, classes, sample_weight]): Partially (incrementally) fit the model.
predict(self, X): Predict classes for the passed data.
predict_proba(self, X): Estimates the probability of each sample in X belonging to each of the class labels.
reset(self): Resets the estimator to its initial state.
score(self, X, y[, sample_weight]): Returns the mean accuracy on the given test data and labels.
set_params(self, **params): Set the parameters of this estimator.
fit(self, X, y[, classes, sample_weight])
Fit the model.
Parameters
X: The features to train the model.
y: An array-like with the class labels of all samples in X.
classes: Contains all possible/known class labels. Usage varies depending on the learning method.
sample_weight: Sample weights. If not provided, uniform weights are assumed. Usage varies depending on the learning method.
get_info(self)
Collects and returns information about the configuration of the estimator.
Returns
Configuration of the estimator.
get_params(self[, deep])
Get parameters for this estimator.
Parameters
deep: If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns
Parameter names mapped to their values.
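partial_fit(self, X, y[, classes, sample_weight])
Partially (incrementally) fit the model.
Parameters
X: The features to train the model.
y: An array-like with the labels of all samples in X.
classes: Array with all possible/known classes. Usage varies depending on the learning method.
sample_weight: Sample weights. If not provided, uniform weights are assumed. Usage varies depending on the learning method.
Returns
self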
predict(self, X)
Predict classes for the passed data.
Parameters
X: The set of data samples to predict the labels for.
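predict_proba(self, X)
Estimates the probability of each sample in X belonging to each of the class labels.
Parameters
X: The matrix of samples for which to predict the class probabilities.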
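score(self, X, y[, sample_weight])
Returns the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric because each sample's entire label set must be predicted correctly.
Parameters
X: Test samples.
y: True labels for X.
sample_weight: Sample weights.
Returns
Mean accuracy of self.predict(X) with respect to y.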
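To see why subset accuracy is described as harsh, the sketch below compares it with a per-label accuracy on a tiny multi-label example. The helper functions and the label matrices are hypothetical and written only for this illustration; they are not part of scikit-multiflow.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label set is predicted exactly."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Fraction of individual labels predicted correctly (for contrast)."""
    correct = sum(a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    total = sum(len(t) for t in y_true)
    return correct / total


# Three samples with three binary labels each; one label is wrong in sample 2.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]

subset_acc = subset_accuracy(y_true, y_pred)
label_acc = per_label_accuracy(y_true, y_pred)
print(subset_acc, label_acc)
```

A single wrong label makes the whole second sample count as a miss, so the subset accuracy drops to 2/3 while 8 of the 9 individual labels are still correct.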
set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter>, so that each component of a nested object can be updated.
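The <component>__<parameter> naming convention can be pictured as splitting the key at the first double underscore and forwarding the remainder to the named component. The class below is a hypothetical stand-in written for this sketch, not the library's estimator base class, and it skips the validation a real implementation would perform.

```python
class SimpleEstimator:
    """Hypothetical estimator that stores its parameters as attributes."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "component__parameter" at the first double underscore.
            name, delim, sub_key = key.partition("__")
            if not hasattr(self, name):
                raise ValueError("Unknown parameter: {}".format(name))
            if delim:
                # Nested form: forward the parameter to the component.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                # Simple form: set the parameter on this object.
                setattr(self, name, value)
        return self


inner = SimpleEstimator(alpha=1.0)
outer = SimpleEstimator(model=inner, verbose=False)

# Update a parameter of the nested component and one of the outer object.
outer.set_params(model__alpha=0.5, verbose=True)
print(inner.alpha, outer.verbose)
```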