skmultiflow.data.TemporalDataStream

class skmultiflow.data.TemporalDataStream(data, y=None, time=None, sample_weight=None, sample_delay=0, target_idx=1, n_targets=1, cat_features=None, name=None, allow_nan=False, ordered=True)
Create a temporal stream from a data source.
TemporalDataStream takes the whole data set containing the X (features), time (timestamps) and Y (targets).
Parameters
 data: numpy.ndarray or pandas.DataFrame
The features and targets, or only the features if the targets are passed in the y parameter.
 time: numpy.ndarray(dtype=datetime64) or pandas.Series (Default=None)
The timestamp of each instance. If it is a pandas.Series, it is converted into a numpy.ndarray. If None, the delay is counted in number of samples and sample_delay must be an int.
 sample_weight: numpy.ndarray or pandas.Series, optional (Default=None)
Sample weights.
 sample_delay: numpy.ndarray, pandas.Series, numpy.timedelta64 or int, optional (Default=0)
 Options per data type used:
numpy.timedelta64: the delay in time, that is, the time offset between the event time and the moment the label is available (e.g., numpy.timedelta64(1, "D") for a 1-day delay).
numpy.ndarray[numpy.datetime64]: array with the timestamps when each sample will be available.
pandas.Series: series with the timestamps when each sample will be available.
int: the delay in number of samples.
 y: numpy.ndarray or pandas.DataFrame, optional (Default=None)
The targets.
 target_idx: int, optional (default=1)
The column index from which the targets start.
 n_targets: int, optional (default=1)
The number of targets.
 cat_features: list, optional (default=None)
A list of indices corresponding to the location of categorical features.
 name: str, optional (default=None)
A string used to identify the data.
 ordered: bool, optional (default=True)
If True, data, y, and time are assumed to be already ordered by timestamp. Otherwise, the data is sorted by the timestamps in time (time cannot be None).
 allow_nan: bool, optional (default=False)
If True, allows NaN values in the data. Otherwise, an error is raised.
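The sample_delay forms listed above can be prepared with plain NumPy. The following is a minimal sketch with made-up toy values; build_stream is a hypothetical helper introduced only for illustration, and it assumes scikit-multiflow is installed and that the constructor accepts the keyword names shown in the signature.

```python
import numpy as np

# Toy data: 5 instances, 2 features, binary targets (all values are made up).
X = np.arange(10, dtype=float).reshape(5, 2)
y = np.array([0, 1, 0, 1, 1])

# One timestamp per instance, one day apart (2020-01-01 .. 2020-01-05).
time = np.arange("2020-01-01", "2020-01-06", dtype="datetime64[D]")

# Form 1: a fixed time offset between the event and the label availability.
delay_offset = np.timedelta64(1, "D")

# Form 2: an explicit availability timestamp for each instance.
delay_times = time + delay_offset

# Form 3: a delay counted in samples (the form to use when time is None).
delay_samples = 2


def build_stream(sample_delay, use_time=True):
    """Hypothetical helper: build the stream from the toy arrays above."""
    from skmultiflow.data import TemporalDataStream  # needs scikit-multiflow

    return TemporalDataStream(
        X, y=y, time=time if use_time else None, sample_delay=sample_delay
    )
```

Under these assumptions, build_stream(delay_offset) and build_stream(delay_times) would describe the same one-day delay, while build_stream(delay_samples, use_time=False) would express the delay as a sample count.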
Notes
The stream object provides upon request a number of samples, in a way such that old samples cannot be accessed at a later time. This is done to correctly simulate the stream context.
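The forward-only behaviour described in this note can be sketched with a small stand-in class. ForwardOnlyStream is hypothetical and is not the scikit-multiflow implementation; it only mirrors the documented contract of next_sample, has_more_samples, n_remaining_samples and restart on two arrays.

```python
import numpy as np


class ForwardOnlyStream:
    """Hypothetical stand-in for forward-only access; not skmultiflow code."""

    def __init__(self, X, y):
        self._X = np.asarray(X)
        self._y = np.asarray(y)
        self._pos = 0  # index of the next unread sample

    def has_more_samples(self):
        return self._pos < len(self._X)

    def n_remaining_samples(self):
        return len(self._X) - self._pos

    def next_sample(self, batch_size=1):
        # Documented contract: (None, None) when too few samples remain.
        if self.n_remaining_samples() < batch_size:
            return None, None
        start, self._pos = self._pos, self._pos + batch_size
        return self._X[start:self._pos], self._y[start:self._pos]

    def restart(self):
        # Going back to the beginning is the only way to see old samples again.
        self._pos = 0


stream = ForwardOnlyStream(np.arange(10).reshape(5, 2), np.array([0, 1, 0, 1, 1]))
first_X, first_y = stream.next_sample(batch_size=2)
second_X, second_y = stream.next_sample(batch_size=2)
leftover = stream.next_sample(batch_size=2)  # only one sample is left
```

The cursor only moves forward, so the second call never returns rows already handed out by the first one.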
Methods
get_data_info(self): Retrieves minimum information from the stream.
get_info(self): Collects and returns the information about the configuration of the estimator.
get_params(self[, deep]): Get parameters for this estimator.
has_more_samples(self): Checks if stream has more samples.
is_restartable(self): Determine if the stream is restartable.
last_sample(self): Retrieves last batch_size samples in the stream.
n_remaining_samples(self): Returns the estimated number of remaining samples.
next_sample(self[, batch_size]): Get next sample.
prepare_for_use(): Prepare the stream for use.
print_df(self): Prints all the samples in the stream.
reset(self): Resets the estimator to its initial state.
restart(self): Restarts the stream.
set_params(self, **params): Set the parameters of this estimator.
Attributes
X: Return the features’ columns.
cat_features_idx: Get the indices of the categorical features.
data: Return the data set used to generate the stream.
feature_names: Retrieve the names of the features.
n_cat_features: Retrieve the number of categorical features.
n_features: Retrieve the number of features.
n_num_features: Retrieve the number of numerical features.
n_targets: Get the number of targets.
target_idx: Get the number of the column where Y begins.
target_names: Retrieve the names of the targets.
target_values: Retrieve all target_values in the stream for each target.
y: Return the targets’ columns.

property X
Return the features’ columns.
 Returns
 np.ndarray:
the features’ columns

property cat_features_idx
Get the indices of the categorical features.
 Returns
 list:
List of categorical feature indices.

property data
Return the data set used to generate the stream.
 Returns
 pd.DataFrame:
Data set.

property feature_names
Retrieve the names of the features.
 Returns
 list
names of the features

get_data_info(self)
Retrieves minimum information from the stream.
Used by evaluator methods to identify the stream.
The default format is: ‘Stream name  n_targets, n_classes, n_features’.
 Returns
 string
Stream data information

get_info(self)
Collects and returns the information about the configuration of the estimator.
 Returns
 string
Configuration of the estimator.

get_params(self, deep=True)
Get parameters for this estimator.
 Parameters
 deep: boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
 Returns
 params: mapping of string to any
Parameter names mapped to their values.

has_more_samples(self)
Checks if stream has more samples.
 Returns
 bool
True if stream has more samples.

is_restartable(self)
Determine if the stream is restartable.
 Returns
 bool
True if stream is restartable.

last_sample(self)
Retrieves the last batch_size samples in the stream.
 Returns
 tuple or tuple list
A numpy.ndarray of shape (batch_size, n_features) and an array-like of shape (batch_size, n_targets), representing the last batch_size samples.

property n_cat_features
Retrieve the number of categorical features.
 Returns
 int
The number of categorical features in the stream.

property n_features
Retrieve the number of features.
 Returns
 int
The total number of features.

property n_num_features
Retrieve the number of numerical features.
 Returns
 int
The number of numerical features in the stream.

n_remaining_samples(self)
Returns the estimated number of remaining samples.
 Returns
 int
Remaining number of samples.

property n_targets
Get the number of targets.
 Returns
 int:
The number of targets.

next_sample(self, batch_size=1)
Get next sample.
If there are enough instances to supply at least batch_size samples, those are returned. Otherwise, a tuple of (None, None) is returned.
 Parameters
 batch_size: int
The number of instances to return.
 Returns
 tuple or tuple list
Returns the next batch_size instances (sample_x, sample_y, sample_time, sample_delay (if available), sample_weight (if available)). For general purposes the return can be treated as a numpy.ndarray.
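One way to use the sample_time and sample_delay parts of a returned batch is to decide which labels may already be revealed to a learner. The sketch below uses made-up arrays in place of a real next_sample result and assumes that sample_delay holds the timestamp at which each label becomes available; now is a hypothetical current stream time.

```python
import numpy as np

# Stand-ins for the sample_time, sample_delay and sample_y parts of one batch
# (made-up values; a real batch would come from next_sample).
sample_time = np.array(
    ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"],
    dtype="datetime64[D]",
)
sample_delay = sample_time + np.timedelta64(2, "D")  # label available 2 days later
sample_y = np.array([0, 1, 1, 0])

# Hypothetical current time of the stream.
now = np.datetime64("2020-01-04")

# A label can be used once its availability timestamp has been reached.
available = sample_delay <= now
labels_ready = sample_y[available]  # labels of the first two instances
```

With these values only the first two labels are usable on 2020-01-04; the other two stay hidden until their availability timestamps pass.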

static prepare_for_use()
Prepare the stream for use.
Deprecated in v0.5.0 and will be removed in v0.7.0.

restart(self)
Restarts the stream.
It serves the purpose of reinitializing the stream to its initial state.

set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
 Returns
 self
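The <component>__<parameter> naming can be illustrated by splitting a key at its first double underscore. split_nested_param is a hypothetical helper written for this illustration, not a scikit-multiflow function, and the component name "stream" in the example is likewise made up.

```python
def split_nested_param(key):
    """Split 'component__parameter' into its two parts.

    Hypothetical helper for illustration; not part of scikit-multiflow.
    """
    component, sep, parameter = key.partition("__")
    if not sep:
        # No double underscore: the key names a parameter of the object itself.
        return None, key
    return component, parameter


top_level = split_nested_param("name")
nested = split_nested_param("stream__allow_nan")
```

Here top_level is (None, "name") and nested is ("stream", "allow_nan"): the part before the double underscore selects the component, and the rest names the parameter to set on it.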

property target_idx
Get the number of the column where Y begins.
 Returns
 int:
The number of the column where Y begins.

property target_names
Retrieve the names of the targets.
 Returns
 list
the names of the targets in the stream.

property target_values
Retrieve all target_values in the stream for each target.
 Returns
 list
list of lists of all target_values for each target

property y
Return the targets’ columns.
 Returns
 np.ndarray:
the targets’ columns