skmultiflow.drift_detection.EDDM

class skmultiflow.drift_detection.EDDM

Early Drift Detection Method.

Notes

EDDM (Early Drift Detection Method) [1] aims to improve the detection rate of gradual concept drift in DDM, while keeping a good performance against abrupt concept drift.

This method tracks the average distance between two consecutive errors instead of only the error rate. To do so, it maintains the running average distance and the running standard deviation, as well as the maximum distance and the maximum standard deviation.
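As a rough illustration of these statistics, the sketch below extracts the distances between consecutive errors from a 0/1 stream and maintains their running mean and standard deviation with Welford's online update. The helper names `error_distances` and `running_mean_std` are hypothetical, and measuring distance as a difference of indices is an assumption; scikit-multiflow's internal bookkeeping may differ.

```python
import math


def error_distances(stream):
    """Return the gaps (in samples) between consecutive errors in a 0/1 stream.

    A value of 1 marks a misclassification, 0 a correct prediction.
    """
    distances = []
    last_error = None  # index of the most recent error, if any
    for i, value in enumerate(stream):
        if value == 1:
            if last_error is not None:
                distances.append(i - last_error)
            last_error = i
    return distances


def running_mean_std(values):
    """Yield (mean, std) after each value, using Welford's online update."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        yield mean, math.sqrt(m2 / n)  # population standard deviation


stream = [0, 1, 0, 0, 1, 1, 0, 0, 0, 1]  # errors at indices 1, 4, 5, 9
gaps = error_distances(stream)
stats = list(running_mean_std(gaps))
```

Longer gaps between errors mean the learner is doing better, so a shrinking average distance is the signal EDDM watches for.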

The algorithm works similarly to the DDM algorithm, by keeping track of statistics only. It works with the running average distance (\(p_i'\)) and the running standard deviation (\(s_i'\)), as well as \(p'_{max}\) and \(s'_{max}\), which are the values of \(p_i'\) and \(s_i'\) when \(p_i' + 2 \cdot s_i'\) reaches its maximum.

As in DDM, two threshold values define the boundaries between no change, the warning zone, and detected drift:

if \((p_i' + 2 \cdot s_i')/(p'_{max} + 2 \cdot s'_{max}) < \alpha\) -> Warning zone
if \((p_i' + 2 \cdot s_i')/(p'_{max} + 2 \cdot s'_{max}) < \beta\) -> Change detected

\(\alpha\) and \(\beta\) are set to 0.95 and 0.9, respectively.
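The two threshold rules above can be sketched in a few lines. This is a minimal illustration, not scikit-multiflow's implementation: the function names `update_max` and `eddm_zone` are hypothetical, the statistics are assumed to be computed elsewhere, and the denominator is assumed to be positive.

```python
ALPHA = 0.95  # warning threshold, as stated in the text
BETA = 0.90   # drift threshold, as stated in the text


def update_max(p, s, p_max, s_max):
    """Keep the (p, s) pair for which p + 2*s has been largest so far."""
    if p + 2 * s > p_max + 2 * s_max:
        return p, s
    return p_max, s_max


def eddm_zone(p, s, p_max, s_max, alpha=ALPHA, beta=BETA):
    """Classify the current state from the running statistics.

    p, s         -- running mean and std of the distance between errors
    p_max, s_max -- values of p and s recorded when p + 2*s was largest
    """
    ratio = (p + 2 * s) / (p_max + 2 * s_max)
    # beta < alpha, so the stricter drift test has to come first.
    if ratio < beta:
        return "drift"
    if ratio < alpha:
        return "warning"
    return "stable"


zones = [
    eddm_zone(10.0, 2.0, 10.0, 2.0),  # ratio 14/14
    eddm_zone(9.2, 2.0, 10.0, 2.0),   # ratio 13.2/14
    eddm_zone(8.0, 2.0, 10.0, 2.0),   # ratio 12/14
]
```

Because every drift state also satisfies the warning condition, the drift check is done first so that each state maps to exactly one label.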

References

[1] Early Drift Detection Method. Manuel Baena-Garcia, Jose Del Campo-Avila, Raúl Fidalgo, Albert Bifet, Ricard Gavalda, Rafael Morales-Bueno. In Fourth International Workshop on Knowledge Discovery from Data Streams, 2006.

Examples

>>> # Imports
>>> import numpy as np
>>> from skmultiflow.drift_detection.eddm import EDDM
>>> eddm = EDDM()
>>> # Simulating a data stream of uniformly random 0's and 1's
>>> data_stream = np.random.randint(2, size=2000)
>>> # Changing the data concept from index 999 to 1499 by forcing every
>>> # value to 0 (no errors, since 1 marks a misclassification)
>>> for i in range(999, 1500):
...     data_stream[i] = 0
>>> # Adding stream elements to EDDM and verifying if drift occurred
>>> for i in range(2000):
...     eddm.add_element(data_stream[i])
...     if eddm.detected_warning_zone():
...         print('Warning zone has been detected in data: ' + str(data_stream[i]) + ' - of index: ' + str(i))
...     if eddm.detected_change():
...         print('Change has been detected in data: ' + str(data_stream[i]) + ' - of index: ' + str(i))

Methods

`add_element`(self, prediction)	Add a new element to the statistics
`detected_change`(self)	This function returns whether concept drift was detected or not.
`detected_warning_zone`(self)	If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.
`get_info`(self)	Collects and returns the information about the configuration of the estimator
`get_length_estimation`(self)	Returns the length estimation.
`get_params`(self[, deep])	Get parameters for this estimator.
`reset`(self)	Resets the change detector parameters.
`set_params`(self, **params)	Set the parameters of this estimator.

Attributes

`FDDM_MIN_NUM_INSTANCES`
`FDDM_OUTCONTROL`
`FDDM_WARNING`
`estimator_type`

add_element(self, prediction)

Add a new element to the statistics.

Parameters

prediction: int (either 0 or 1): Indicates whether the last sample analyzed was classified correctly. 1 indicates an error (misclassification).

Returns

EDDM: self

Notes

After calling this method, call the inherited method detected_change to check whether concept drift was detected (it returns True if so and False otherwise), or detected_warning_zone to check whether the learner is in the warning zone.
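The call pattern described in these notes can be sketched as follows. The helpers `to_error_indicator` and `feed` are hypothetical, and `_StubDetector` is an invented stand-in so the example runs without scikit-multiflow; its warning and change rules are made up for the demonstration and are not how EDDM decides. Any object exposing `add_element`, `detected_change`, and `detected_warning_zone` could be passed to `feed` instead.

```python
def to_error_indicator(y_true, y_pred):
    """Map a (label, prediction) pair to the 0/1 value add_element expects."""
    return int(y_true != y_pred)  # 1 = misclassified, 0 = correct


def feed(detector, y_true, y_pred):
    """Push one outcome into the detector and report its state as a string."""
    detector.add_element(to_error_indicator(y_true, y_pred))
    if detector.detected_change():
        return "drift"
    if detector.detected_warning_zone():
        return "warning"
    return "stable"


class _StubDetector:
    """Hypothetical stand-in: warns at 2 errors, signals change at 3."""

    def __init__(self):
        self.errors = 0

    def add_element(self, prediction):
        self.errors += prediction
        return self

    def detected_warning_zone(self):
        return self.errors == 2

    def detected_change(self):
        return self.errors >= 3


detector = _StubDetector()
labels = ["a", "b", "a", "b", "a"]
predictions = ["a", "a", "b", "b", "b"]
states = [feed(detector, t, p) for t, p in zip(labels, predictions)]
```

Checking for change before checking the warning zone keeps the reported state unambiguous when both conditions hold.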

detected_change(self)

This function returns whether concept drift was detected or not.

Returns

bool: Whether concept drift was detected or not.

detected_warning_zone(self)

If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.

Returns

bool: Whether the change detector is in the warning zone or not.

get_info(self)

Collects and returns the information about the configuration of the estimator.

Returns

string: Configuration of the estimator.

get_length_estimation(self)

Returns the length estimation.

Returns

int: The length estimation.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters

deep: boolean, optional: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params: mapping of string to any: Parameter names mapped to their values.

reset(self)

Resets the change detector parameters.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self
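To make the `<component>__<parameter>` convention concrete, the sketch below separates top-level parameters from nested ones by splitting each key on its first double underscore. The function `split_nested_params` and the parameter names in the usage line are hypothetical and serve only to illustrate the naming scheme; this is not the library's own code.

```python
from collections import defaultdict


def split_nested_params(params):
    """Separate top-level parameters from <component>__<parameter> entries."""
    own = {}
    nested = defaultdict(dict)
    for key, value in params.items():
        # partition splits on the first "__" only, so deeper levels stay
        # encoded in sub_key and can be handed on to the component.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested[component][sub_key] = value
        else:
            own[key] = value
    return own, dict(nested)


own, nested = split_nested_params(
    {"window": 30, "clf__depth": 5, "clf__base__alpha": 0.1}
)
```

Splitting on the first separator only lets each component resolve its own nested names recursively.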