Returns a function that will partition a single series into windows and compute statistics over each window.
The series is partitioned into windows using the window argument. Each window contains the statistics listed in the DataFrame schema below. The statistics are calculated over rolling windows created either from a periodicity (the width of each window) or from a fixed number of windows, in which case the width is calculated as round(total number of points / number of buckets). Which of the two behaviors is used depends on whether the window or the window_count argument is passed.
start (Union[int, datetime, str], optional) – Timestamp (inclusive) at which to start partitioning windows from the provided series. (default is the entire series)
end (Union[int, datetime, str], optional) – Timestamp (inclusive) at which to end partitioning windows from the provided series. (default is the entire series)
window (Union[int, datetime, str], optional) – The timedelta that is the width of each window; this width is used to divide the series into a number of windows. (default is the entire series)
**kwargs – Flags for determining the window behavior and the output type.
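For example, the start and end parameters can limit the portion of the series that gets partitioned into windows. A minimal sketch, assuming they are accepted as keyword arguments alongside window and given as integer nanosecond timestamps (per the type hints above), with series constructed as in the examples below:

>>> # Partition only the part of the series between 100 ns and 400 ns (both inclusive).
>>> bounded_stats = F.statistics(start=100, end=400, window="100ns")(series)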
Column name | Type | Description |
---|---|---|
count | int | Number of data points in the window of the input series. |
earliest_point.timestamp | datetime | Timestamp of the first data point in the window of the input series. |
earliest_point.value | float | Value of the first data point in the window of the input series. |
end_timestamp | datetime | End timestamp of the window. |
largest_point.timestamp | datetime | Timestamp of the data point with the largest value in the window of the input series. |
largest_point.value | float | Largest value in the window of the input series. |
latest_point.timestamp | datetime | Timestamp of the most recent data point in the window of the input series. |
latest_point.value | float | Value of the most recent data point in the window of the input series. |
mean | float | Average value of all data points in the window of the input series. |
smallest_point.timestamp | datetime | Timestamp of the data point with the smallest value in the window of the input series. |
smallest_point.value | float | Smallest value in the window of the input series. |
start_timestamp | datetime | Start timestamp of the window. |
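The nested statistics are flattened into dotted column labels (for example largest_point.value) once the result is converted with to_pandas(), as the examples below show, so individual columns can be selected by those names. A minimal sketch, assuming stats is a result produced as in the examples:

>>> df = stats.to_pandas()
>>> per_window = df[["start_timestamp", "end_timestamp", "count", "mean"]]  # one row per window
>>> peaks = df["largest_point.value"]  # largest value observed in each window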
This function is only applicable to numeric series.
In the future, the include_std_dev kwarg will be deprecated, as its behavior will become the default.
window_count can only be used together with include_std_dev, in which case it overrides window. If passed without include_std_dev, window_count is ignored.
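A minimal sketch of the kwarg interplay described in the notes above, using the series constructed in the examples below:

>>> # window_count only takes effect together with include_std_dev, and then overrides window:
>>> by_count = F.statistics(include_std_dev=True, window_count=3)(series)
>>> # Without include_std_dev, window_count is ignored and the 100ns window is used:
>>> by_width = F.statistics(window="100ns", window_count=3)(series)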
>>> series = F.points(
...     (1, 8.0),
...     (101, 4.0),
...     (200, 2.0),
...     (201, 1.0),
...     (299, 35.0),
...     (300, 16.0),
...     (350, 32.0),
...     (1000, 64.0),
... )
>>> series.to_pandas()
                      timestamp  value
0 1970-01-01 00:00:00.000000001    8.0
1 1970-01-01 00:00:00.000000101    4.0
2 1970-01-01 00:00:00.000000200    2.0
3 1970-01-01 00:00:00.000000201    1.0
4 1970-01-01 00:00:00.000000299   35.0
5 1970-01-01 00:00:00.000000300   16.0
6 1970-01-01 00:00:00.000000350   32.0
7 1970-01-01 00:00:00.000001000   64.0
>>> stats = F.statistics(window="100ns")(series)  # use time-based window
>>> stats.to_pandas()
   count  earliest_point.timestamp  earliest_point.value  end_timestamp  largest_point.timestamp  largest_point.value  latest_point.timestamp  latest_point.value  mean  smallest_point.timestamp  smallest_point.value  start_timestamp
0  1  1970-01-01 00:00:00.000000001  8.0   1970-01-01 00:00:00.000000100  1970-01-01 00:00:00.000000001  8.0   1970-01-01 00:00:00.000000001  8.0   8.000000   1970-01-01 00:00:00.000000001  8.0   1970-01-01 00:00:00.000000000
1  1  1970-01-01 00:00:00.000000101  4.0   1970-01-01 00:00:00.000000200  1970-01-01 00:00:00.000000101  4.0   1970-01-01 00:00:00.000000101  4.0   4.000000   1970-01-01 00:00:00.000000101  4.0   1970-01-01 00:00:00.000000100
2  3  1970-01-01 00:00:00.000000200  2.0   1970-01-01 00:00:00.000000300  1970-01-01 00:00:00.000000299  35.0  1970-01-01 00:00:00.000000299  35.0  12.666667  1970-01-01 00:00:00.000000201  1.0   1970-01-01 00:00:00.000000200
3  2  1970-01-01 00:00:00.000000300  16.0  1970-01-01 00:00:00.000000400  1970-01-01 00:00:00.000000350  32.0  1970-01-01 00:00:00.000000350  32.0  24.000000  1970-01-01 00:00:00.000000300  16.0  1970-01-01 00:00:00.000000300
4  1  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001100  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001000  64.0  64.000000  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001000
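As a quick sanity check on the time-based windows: the window starting at 200 ns contains the points at 200 ns, 201 ns and 299 ns, which reproduces the count and mean shown in the third row above:

>>> window_values = [2.0, 1.0, 35.0]
>>> len(window_values)
3
>>> round(sum(window_values) / len(window_values), 6)
12.666667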
>>> stats_with_std_dev = F.statistics(window="100ns", include_std_dev=True)(series)
>>> stats_with_std_dev.to_pandas()
   count  earliest_point.timestamp  earliest_point.value  end_timestamp  largest_point.timestamp  largest_point.value  latest_point.timestamp  latest_point.value  mean  smallest_point.timestamp  smallest_point.value  standard_deviation  start_timestamp
0  1  1970-01-01 00:00:00.000000001  8.0   1970-01-01 00:00:00.000000100  1970-01-01 00:00:00.000000001  8.0   1970-01-01 00:00:00.000000001  8.0   8.000000   1970-01-01 00:00:00.000000001  8.0   0.000000   1970-01-01 00:00:00.000000000
1  1  1970-01-01 00:00:00.000000101  4.0   1970-01-01 00:00:00.000000200  1970-01-01 00:00:00.000000101  4.0   1970-01-01 00:00:00.000000101  4.0   4.000000   1970-01-01 00:00:00.000000101  4.0   0.000000   1970-01-01 00:00:00.000000100
2  3  1970-01-01 00:00:00.000000200  2.0   1970-01-01 00:00:00.000000300  1970-01-01 00:00:00.000000299  35.0  1970-01-01 00:00:00.000000299  35.0  12.666667  1970-01-01 00:00:00.000000201  1.0   15.797327  1970-01-01 00:00:00.000000200
3  2  1970-01-01 00:00:00.000000300  16.0  1970-01-01 00:00:00.000000400  1970-01-01 00:00:00.000000350  32.0  1970-01-01 00:00:00.000000350  32.0  24.000000  1970-01-01 00:00:00.000000300  16.0  8.000000   1970-01-01 00:00:00.000000300
4  1  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001100  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001000  64.0  64.000000  1970-01-01 00:00:00.000001000  64.0  0.000000   1970-01-01 00:00:00.000001000
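The reported standard_deviation appears to match the population standard deviation (ddof = 0): 15.797327 for the three-point window and 8.0 for the two-point window. A quick check with the standard library, independent of this API:

>>> import statistics
>>> round(statistics.pstdev([2.0, 1.0, 35.0]), 6)  # window starting at 200 ns
15.797327
>>> round(statistics.pstdev([16.0, 32.0]), 6)      # window starting at 300 ns
8.0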
>>> stats_fixed_window_count = F.statistics(include_std_dev=True, window_count=3)(series)
>>> stats_fixed_window_count.to_pandas()
   count  earliest_point.timestamp  earliest_point.value  end_timestamp  largest_point.timestamp  largest_point.value  latest_point.timestamp  latest_point.value  mean  smallest_point.timestamp  smallest_point.value  standard_deviation  start_timestamp
0  6  1970-01-01 00:00:00.000000001  8.0   1970-01-01 00:00:00.000000335  1970-01-01 00:00:00.000000299  35.0  1970-01-01 00:00:00.000000300  16.0  11.0  1970-01-01 00:00:00.000000201  1.0   11.83216  1970-01-01 00:00:00.000000001
1  1  1970-01-01 00:00:00.000000350  32.0  1970-01-01 00:00:00.000000669  1970-01-01 00:00:00.000000350  32.0  1970-01-01 00:00:00.000000350  32.0  32.0  1970-01-01 00:00:00.000000350  32.0  0.00000   1970-01-01 00:00:00.000000335
2  1  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001003  1970-01-01 00:00:00.000001000  64.0  1970-01-01 00:00:00.000001000  64.0  64.0  1970-01-01 00:00:00.000001000  64.0  0.00000   1970-01-01 00:00:00.000000669
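Similarly for the fixed window count: the first bucket contains the six points up to 300 ns, and its count, mean and standard deviation can be reproduced with the standard library (a sanity check, not part of the API):

>>> import statistics
>>> first_bucket = [8.0, 4.0, 2.0, 1.0, 35.0, 16.0]
>>> len(first_bucket), statistics.mean(first_bucket)
(6, 11.0)
>>> round(statistics.pstdev(first_bucket), 5)
11.83216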