skmultiflow.data.
TemporalDataStream
Create a temporal stream from a data source.
TemporalDataStream takes the whole data set containing the X (features), time (timestamps) and Y (targets).
X
time
Y
The features and targets, or only the features if the targets are passed in the y parameter.
y
The timestamp column of each instance. If it is a pandas.Series, it will be converted into a numpy.ndarray. If None, the delay is counted in number of samples and sample_delay must be an int.
Sample weights.
numpy.timedelta64
numpy.ndarray[numpy.datetime64]
pandas.Series
int
The targets.
The column index from which the targets start.
The number of targets.
A list of indices corresponding to the location of categorical features.
A string to id the data.
If True, data, y, and time are assumed to be already ordered by timestamp. Otherwise, the data is sorted by the timestamps in time (time cannot be None).
If True, allows NaN values in the data. Otherwise, an error is raised.
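The ordering and delay parameters described above can be illustrated with a minimal, standard-library sketch. The helper `order_and_delay` is hypothetical and not part of skmultiflow; it assumes that each target becomes available at its timestamp plus `sample_delay`, which is one plausible reading of the parameter descriptions rather than a confirmed implementation detail.

```python
from datetime import datetime, timedelta


def order_and_delay(features, targets, times, sample_delay, ordered=False):
    """Hypothetical helper: sort samples by timestamp (unless already ordered)
    and compute when each target is assumed to become available."""
    rows = list(zip(times, features, targets))
    if not ordered:
        # list.sort is stable, so samples with equal timestamps keep their order.
        rows.sort(key=lambda row: row[0])
    # Assumption: a target becomes available sample_delay after its timestamp.
    return [(t, x, y, t + sample_delay) for t, x, y in rows]


times = [datetime(2020, 1, 3), datetime(2020, 1, 1), datetime(2020, 1, 2)]
features = [[0.3], [0.1], [0.2]]
targets = [1, 0, 1]
result = order_and_delay(features, targets, times, timedelta(days=1))

# Without timestamps, sample positions and an integer delay play the same role.
by_position = order_and_delay([[1], [2]], [0, 1], [0, 1], 2, ordered=True)
```

In the first call the samples are reordered by date before the delay is applied; in the second, the positions 0 and 1 stand in for timestamps and the delay is counted in samples.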
Notes
The stream object provides upon request a number of samples, in a way such that old samples cannot be accessed at a later time. This is done to correctly simulate the stream context.
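The forward-only access pattern described in this note can be sketched with a small stand-in class. `ForwardOnlyStream` is a hypothetical illustration, not the skmultiflow implementation; it borrows the method names listed on this page but, for simplicity, returns a shorter batch when fewer samples remain.

```python
class ForwardOnlyStream:
    """Hypothetical minimal stream: samples are handed out once, in order."""

    def __init__(self, samples):
        self._samples = list(samples)
        self._position = 0  # index of the next sample to hand out

    def has_more_samples(self):
        return self._position < len(self._samples)

    def n_remaining_samples(self):
        return len(self._samples) - self._position

    def next_sample(self, batch_size=1):
        # Advance the cursor; earlier samples are never served again
        # unless the stream is explicitly restarted.
        batch = self._samples[self._position:self._position + batch_size]
        self._position += len(batch)
        return batch

    def restart(self):
        self._position = 0


stream = ForwardOnlyStream([10, 20, 30, 40, 50])
first = stream.next_sample(2)
second = stream.next_sample(2)
remaining = stream.n_remaining_samples()
```

Because the only way to read data is through `next_sample`, a consumer cannot go back to `first` after requesting `second`, which mirrors the streaming setting the note describes.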
Methods
get_data_info(self)
get_data_info
Retrieves minimum information from the stream.
get_info(self)
get_info
Collects and returns the information about the configuration of the estimator.
get_params(self[, deep])
get_params
Get parameters for this estimator.
has_more_samples(self)
has_more_samples
Checks if stream has more samples.
is_restartable(self)
is_restartable
Determine if the stream is restartable.
last_sample(self)
last_sample
Retrieves last batch_size samples in the stream.
n_remaining_samples(self)
n_remaining_samples
Returns the estimated number of remaining samples.
next_sample(self[, batch_size])
next_sample
Get next sample.
prepare_for_use()
prepare_for_use
Prepare the stream for use.
print_df(self)
print_df
Prints all the samples in the stream.
reset(self)
reset
Resets the estimator to its initial state.
restart(self)
restart
Restarts the stream.
set_params(self, **params)
set_params
Set the parameters of this estimator.
Attributes
Return the features’ columns.
cat_features_idx
Get the list of the categorical features index.
data
Return the data set used to generate the stream.
feature_names
Retrieve the names of the features.
n_cat_features
Retrieve the number of categorical features.
n_features
Retrieve the number of features.
n_num_features
Retrieve the number of numerical features.
n_targets
Get the number of targets.
target_idx
Get the index of the column where Y begins.
target_names
Retrieve the names of the targets.
target_values
Retrieve all target_values in the stream for each target.
Return the targets’ columns.
The features’ columns.
List of categorical features index.
Data set.
Names of the features.
Used by evaluator methods to id the stream.
The default format is: ‘Stream name - n_targets, n_classes, n_features’.
Stream data information
Configuration of the estimator.
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Parameter names mapped to their values.
True if stream has more samples.
True if stream is restartable.
A numpy.ndarray of shape (batch_size, n_features) and an array-like of shape (batch_size, n_targets), representing the next batch_size samples.
The number of categorical features in the stream.
The total number of features.
The number of numerical features in the stream.
Remaining number of samples.
If there are enough instances to supply at least batch_size samples, those are returned. If there aren’t, a tuple of (None, None) is returned.
The number of instances to return.
Returns the next batch_size instances (sample_x, sample_y, sample_time, sample_delay (if available), sample_weight (if available)). For general purposes the return can be treated as a numpy.ndarray.
batch_size
sample_x
sample_y
sample_time
sample_delay
sample_weight
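The return layout described above can be sketched in plain Python. The function `next_batch` is a hypothetical stand-in that works on lists rather than numpy arrays; it only illustrates the tuple order (sample_x, sample_y, sample_time, then the optional sample_delay and sample_weight) and the (None, None) fallback, and is not the library's code.

```python
def next_batch(x, y, time, position, batch_size=1,
               sample_delay=None, sample_weight=None):
    """Hypothetical sketch: return (batch, new_position) for a forward-only read."""
    end = position + batch_size
    if end > len(x):
        # Not enough instances left to fill a whole batch.
        return (None, None), position
    batch = [x[position:end], y[position:end], time[position:end]]
    # Optional columns are appended only when they were provided.
    if sample_delay is not None:
        batch.append(sample_delay[position:end])
    if sample_weight is not None:
        batch.append(sample_weight[position:end])
    return tuple(batch), end


x = [[1.0], [2.0], [3.0]]
y = [0, 1, 0]
t = [0, 1, 2]
batch, position = next_batch(x, y, t, position=0, batch_size=2)
exhausted, position_after = next_batch(x, y, t, position=position, batch_size=2)
```

After the first call `position` is 2, so the second request for two samples cannot be satisfied and falls back to the (None, None) tuple without moving the cursor.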
Deprecated in v0.5.0 and will be removed in v0.7.0
It serves the purpose of reinitializing the stream to its initial state.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
<component>__<parameter>
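The `<component>__<parameter>` naming convention can be illustrated with a small sketch. The `Component` class and the `set_nested_params` helper are hypothetical and only mimic how a double-underscore key could be routed to a nested object; they are not the estimator base class used by the library.

```python
class Component:
    """Hypothetical object that stores its parameters in a dictionary."""

    def __init__(self, **params):
        self.params = dict(params)


def set_nested_params(target, **params):
    """Route 'component__parameter' keys to the named nested component."""
    for name, value in params.items():
        component, separator, sub_name = name.partition("__")
        if separator:
            # Nested key: delegate the remainder to the named component.
            set_nested_params(target.params[component], **{sub_name: value})
        else:
            # Plain key: set the parameter on this object directly.
            target.params[name] = value
    return target


pipeline = Component(scaler=Component(with_mean=True), n_steps=1)
set_nested_params(pipeline, scaler__with_mean=False, n_steps=3)
```

Here `scaler__with_mean` updates the `with_mean` parameter of the nested `scaler` component, while `n_steps` is set on the outer object.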
The index of the column where Y begins.
The names of the targets in the stream.
List of lists of all target values for each target.
The targets’ columns.
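The "list of lists" shape described for target_values can be sketched as follows. `collect_target_values` is a hypothetical helper; deduplicating and sorting the values per target is an assumption made for the illustration, not a statement about what the library returns.

```python
def collect_target_values(targets):
    """Hypothetical helper: one list of distinct values per target column."""
    n_targets = len(targets[0]) if targets else 0
    # Assumption: values are deduplicated and sorted within each target.
    return [sorted({row[i] for row in targets}) for i in range(n_targets)]


y = [[0, "a"], [1, "b"], [0, "a"]]
values = collect_target_values(y)
```

With two target columns, the result holds two inner lists, one per target.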