skmultiflow.drift_detection.ADWIN

class skmultiflow.drift_detection.ADWIN(delta=0.002)[source]

Adaptive Windowing method for concept drift detection.

Parameters
delta: float (default=0.002)

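The confidence parameter (delta) of the ADWIN algorithm. Smaller values make the change test stricter, which lowers the false-alarm rate but can delay detection.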

Notes

ADWIN [1] (ADaptive WINdowing) is an adaptive sliding window algorithm for detecting change and keeping updated statistics about a data stream. It allows algorithms that were not designed for drifting data to become resistant to this phenomenon.

The general idea is to keep statistics from a window of variable size while detecting concept drift.

The algorithm decides the size of the window by cutting the statistics’ window at different points and comparing the averages of some statistic over the two resulting sub-windows. If the absolute difference between the two averages exceeds a pre-defined threshold, change is detected at that point and all data before it is discarded.
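The cut test described above can be sketched in plain Python. This is a deliberately naive version that rescans the whole window on every call; it is not the bucket-based implementation used by this class. The threshold is the Hoeffding-style bound given in [1] and assumes values in [0, 1]. The function name find_cut is hypothetical.

```python
import math


def find_cut(window, delta=0.002):
    """Return the first split index whose two sub-window means differ
    by more than the threshold, or None if no split qualifies.

    Naive illustration only: assumes values lie in [0, 1].
    """
    n = len(window)
    total = sum(window)
    head_sum = 0.0
    for split in range(1, n):
        head_sum += window[split - 1]
        n0, n1 = split, n - split          # sizes of the older and newer parts
        mean0 = head_sum / n0
        mean1 = (total - head_sum) / n1
        m = 1.0 / (1.0 / n0 + 1.0 / n1)    # combined size term used in the bound
        # Hoeffding-style threshold with delta' = delta / n
        eps_cut = math.sqrt(math.log(4.0 * n / delta) / (2.0 * m))
        if abs(mean0 - mean1) > eps_cut:
            return split
    return None


stable = [1] * 400
drifting = [0] * 200 + [1] * 200
print(find_cut(stable))                # None: the means never differ
print(find_cut(drifting) is not None)  # True: some split separates the concepts
```

The returned index is the first split that passes the test, not necessarily the exact position of the change. In [1], the oldest elements are dropped and the test is repeated until no split qualifies.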

References

[1] Bifet, Albert, and Ricard Gavaldà. “Learning from time-changing data with adaptive windowing.” In Proceedings of the 2007 SIAM international conference on data mining, pp. 443-448. Society for Industrial and Applied Mathematics, 2007.

Examples

>>> # Imports
>>> import numpy as np
>>> from skmultiflow.drift_detection.adwin import ADWIN
>>> adwin = ADWIN()
>>> # Simulating a data stream of uniformly random 0's and 1's
>>> data_stream = np.random.randint(2, size=2000)
>>> # Changing the data concept from index 999 to 1999 (random integers from 4 to 7)
>>> for i in range(999, 2000):
...     data_stream[i] = np.random.randint(4, high=8)
>>> # Adding stream elements to ADWIN and verifying if drift occurred
>>> for i in range(2000):
...     adwin.add_element(data_stream[i])
...     if adwin.detected_change():
...         print('Change detected in data: ' + str(data_stream[i]) + ' - at index: ' + str(i))

Methods

add_element(self, value)

Add a new element to the sample window.

bucket_size(row)

delete_element(self)

Delete an Item from the bucket list.

detected_change(self)

Detects concept change in a drifting data stream.

detected_warning_zone(self)

If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.

get_change(self)

Get drift

get_info(self)

Collects and returns the information about the configuration of the estimator

get_length_estimation(self)

Returns the length estimation.

get_params(self[, deep])

Get parameters for this estimator.

reset(self)

Reset detectors

reset_change(self)

set_clock(self, clock)

set_params(self, **params)

Set the parameters of this estimator.

Attributes

MAX_BUCKETS

estimation

estimator_type

n_detections

total

variance

width

width_t

add_element(self, value)[source]

Add a new element to the sample window.

Apart from adding the element value to the window, by inserting it in the correct bucket, it will also update the relevant statistics, in this case the total sum of all values, the window width and the total variance.
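As a rough illustration of how a total, a width and a variance can be kept up to date in constant time per element, here is a minimal sketch. It ignores the bucket structure entirely and is not this class's code; the RunningStats class and its attribute names are hypothetical, and the variance shown is the population variance.

```python
class RunningStats:
    """Hypothetical helper: running total, width and variance of a stream."""

    def __init__(self):
        self.total = 0.0   # sum of all values seen so far
        self.width = 0     # number of values seen so far
        self._m2 = 0.0     # sum of squared deviations from the mean

    def add(self, value):
        if self.width > 0:
            mean = self.total / self.width
            # Incremental update of the sum of squared deviations
            self._m2 += self.width * (value - mean) ** 2 / (self.width + 1)
        self.total += value
        self.width += 1

    @property
    def variance(self):
        # Population variance; 0.0 for an empty stream
        return self._m2 / self.width if self.width else 0.0


stats = RunningStats()
for x in [1, 2, 3, 4]:
    stats.add(x)
print(stats.total, stats.width, stats.variance)  # 10.0 4 1.25
```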

Parameters
value: int or float (a numeric value)

Notes

The value parameter can be any numeric value relevant to the analysis of concept change. The learners in this framework use either 0’s or 1’s, which are interpreted as follows:

0: the learner’s prediction was wrong

1: the learner’s prediction was correct

This function should be used at every new sample analysed.
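The 0/1 encoding from the notes above could be produced as follows. This is a minimal sketch; the helper name correctness_stream and the sample labels are made up for illustration.

```python
def correctness_stream(y_true, y_pred):
    """Yield 1 when a prediction matches its label and 0 otherwise."""
    for truth, pred in zip(y_true, y_pred):
        yield 1 if truth == pred else 0


y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 0]
print(list(correctness_stream(y_true, y_pred)))  # [1, 1, 0, 1, 0]
```

Each yielded value would then be passed to add_element, one call per sample.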

delete_element(self)[source]

Delete an Item from the bucket list.

Deletes the last Item and updates relevant statistics kept by ADWIN.

Returns
int

The bucket size from the updated bucket

detected_change(self)[source]

Detects concept change in a drifting data stream.

The ADWIN algorithm is described in Bifet and Gavaldà’s ‘Learning from Time-Changing Data with Adaptive Windowing’. The general idea is to keep statistics from a window of variable size while detecting concept drift.

This function is responsible for analysing different cutting points in the sliding window, to verify if there is a significant change in concept.

Returns
bln_change: bool

Whether change was detected or not

Notes

If change was detected, the new window size can be checked by reading the width property.
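One possible way to follow this advice is to record the window width each time a change is reported. The sketch below works with any object that exposes add_element, detected_change and width as documented on this page; the function name log_window_widths is hypothetical.

```python
def log_window_widths(detector, values):
    """Feed values one by one and return a list of (index, width) pairs,
    one pair for every sample at which the detector reports a change."""
    log = []
    for i, value in enumerate(values):
        detector.add_element(value)
        if detector.detected_change():
            # After a detection the window has been cut, so width
            # reflects the size of the window that was kept.
            log.append((i, detector.width))
    return log
```

With this class, the call would look like log_window_widths(ADWIN(), data_stream).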

detected_warning_zone(self)[source]

If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.

Returns
bool

Whether the change detector is in the warning zone or not.

get_change(self)[source]

Get drift

Returns
bool

Whether or not a drift occurred

get_info(self)[source]

Collects and returns the information about the configuration of the estimator

Returns
string

Configuration of the estimator.

get_length_estimation(self)[source]

Returns the length estimation.

Returns
int

The length estimation

get_params(self, deep=True)[source]

Get parameters for this estimator.

Parameters
deep: boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params: mapping of string to any

Parameter names mapped to their values.

reset(self)[source]

Reset detectors

Resets the statistics and ADWIN’s window.

Returns
ADWIN

self

set_params(self, **params)[source]

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
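To make the <component>__<parameter> naming concrete, the sketch below shows how such a key can be split into its two parts. It illustrates the naming convention only and is not the library's own code; the helper split_param_key and the component name "detector" are hypothetical.

```python
def split_param_key(key):
    """Split '<component>__<parameter>' into (component, parameter).

    Keys without a double underscore address the estimator itself,
    so the component part is returned as None.
    """
    component, delim, parameter = key.partition("__")
    if not delim:
        return None, key
    return component, parameter


print(split_param_key("delta"))            # (None, 'delta')
print(split_param_key("detector__delta"))  # ('detector', 'delta')
```

Only the first double underscore is used for the split, so a deeper key such as a__b__c yields the component a and leaves b__c to be resolved by that component.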

Returns
self