Returns a function that joins one or more time series into a single time series with a column per input series.
Interpolation estimates the value of missing points in a series for timestamps where the point exists in other series. The function uses the configured interpolation strategies to resample and align the input series wherever points are missing.
Interpolation of time series can be divided into three distinct time-ranges: before, internal, and after.
Each time-range handles interpolating missing values in different parts of the time series:
time_extent() of both series.
To apply different strategies to each series, pass the strategy for each of the time-ranges above as a list. Each list element corresponds to the strategy used for the respective input series. If a single strategy is passed, it is applied to all input series.
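The single-strategy versus list-of-strategies rule and the three time-ranges can be sketched in plain Python. This is only an illustration, not the library's implementation: expand_strategy and time_range are hypothetical helper names, and the sketch assumes that "before" and "after" are measured against each series' own first and last points.

```python
from typing import List, Sequence, Union


def expand_strategy(strategy: Union[str, Sequence[str]], n_series: int) -> List[str]:
    """Return one strategy name per input series (hypothetical helper)."""
    if isinstance(strategy, str):
        # A single strategy is applied to every input series.
        return [strategy] * n_series
    strategies = list(strategy)
    if len(strategies) != n_series:
        # Assumption: a list must carry exactly one entry per series.
        raise ValueError(f"expected {n_series} strategies, got {len(strategies)}")
    return strategies


def time_range(timestamp: int, series_timestamps: Sequence[int]) -> str:
    """Classify a timestamp relative to one series' first and last points."""
    if timestamp < min(series_timestamps):
        return "before"
    if timestamp > max(series_timestamps):
        return "after"
    return "internal"


series_2_timestamps = [2, 102, 201, 202]
ranges = [time_range(t, series_2_timestamps) for t in (1, 101, 203)]

print(expand_strategy("LINEAR", 2))
print(expand_strategy(["LINEAR", "NONE"], 2))
print(ranges)
```

Under these assumptions, a timestamp of 1 falls in the "before" range of a series whose first point is at 2, so it would be handled by the before strategy rather than the internal one.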
Interpolation strategies supported for internal interpolation:
Strategy | Description |
---|---|
LINEAR | Linearly interpolates points using the best fit line for the 2 points immediately before and after the timestamp being interpolated. |
NEAREST | Use the value of the nearest point by timestamp in one of the input series. |
PREVIOUS | Use the value of the previous defined point in the input time series. |
NEXT | Use the value of the next occurring point in the input time series. |
NONE | Skip interpolation. For timestamps where points don’t exist for any of the input series, null values will be used in the output dataframe. |
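As a rough illustration of the internal strategies in the table above, the following standard-library sketch computes the value each strategy would produce for one series at a timestamp that has no point. The function name interpolate_internal is hypothetical, the points are assumed to be sorted by timestamp, and the tie-breaking rule for NEAREST (prefer the earlier point) is an assumption rather than documented behavior.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Point = Tuple[int, float]  # (timestamp in ns, value)


def interpolate_internal(points: List[Point], t: int, strategy: str) -> Optional[float]:
    """Estimate the value at t for a series sorted by timestamp."""
    timestamps = [ts for ts, _ in points]
    i = bisect_left(timestamps, t)
    if i < len(points) and timestamps[i] == t:
        return points[i][1]  # a real point exists, nothing to interpolate
    if i == 0 or i == len(points):
        return None  # outside the series: external strategies apply instead
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    if strategy == "LINEAR":
        # Line through the points immediately before and after t.
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    if strategy == "NEAREST":
        # Assumption: ties go to the earlier point.
        return v0 if t - t0 <= t1 - t else v1
    if strategy == "PREVIOUS":
        return v0
    if strategy == "NEXT":
        return v1
    if strategy == "NONE":
        return None
    raise ValueError(f"unknown strategy: {strategy}")


series_1 = [(1, 1.0), (101, 2.0), (200, 4.0), (201, 8.0)]
for strategy in ("LINEAR", "NEAREST", "PREVIOUS", "NEXT", "NONE"):
    print(strategy, interpolate_internal(series_1, 102, strategy))
```

For timestamp 102, which lies between the points at 101 and 200, LINEAR gives roughly 2.020202, NEAREST and PREVIOUS give 2.0, NEXT gives 4.0, and NONE leaves the value undefined.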
Interpolation strategies supported for external interpolation (before, after):
Strategy | Description |
---|---|
NEAREST | Take the value of the nearest defined point (this will be either the first or last point). |
NONE (default) | Never interpolate before the first point and beyond the last point. |
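The external strategies can be sketched the same way. This is a minimal illustration under stated assumptions, not the library's code: interpolate_external is a hypothetical name, and the points are assumed to be sorted by timestamp.

```python
from typing import List, Optional, Tuple

Point = Tuple[int, float]  # (timestamp in ns, value)


def interpolate_external(
    points: List[Point], t: int, before: str = "NONE", after: str = "NONE"
) -> Optional[float]:
    """Value for a timestamp before the first or after the last point."""
    first_t, first_v = points[0]
    last_t, last_v = points[-1]
    if t < first_t:
        # NEAREST before the series can only mean the first point.
        return first_v if before == "NEAREST" else None
    if t > last_t:
        # NEAREST after the series can only mean the last point.
        return last_v if after == "NEAREST" else None
    raise ValueError("timestamp is inside the series; use an internal strategy")


series_1 = [(1, 1.0), (101, 2.0), (200, 4.0), (201, 8.0)]
series_2 = [(2, 11.0), (102, 12.0), (201, 14.0), (202, 18.0)]

print(interpolate_external(series_2, 1, before="NEAREST"))
print(interpolate_external(series_1, 202, after="NEAREST"))
print(interpolate_external(series_1, 202))
```

With NEAREST, series-2 takes the value of its first point (11.0) at timestamp 1 and series-1 takes the value of its last point (8.0) at timestamp 202. With the default NONE, both stay undefined.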
An optional frequency can be configured to interpolate timestamps only at the specified frequency. Providing a frequency completely resamples the input series, only creating and interpolating points at the specified frequency. See the interpolated_every_10ns_series and multiple_interpolated_every_10ns_series examples below for an idea of the resampled output.
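To make the resampling idea concrete, here is a small standard-library sketch that builds a regular grid and fills it with the NEAREST strategy. It is an approximation under assumptions, not the library's algorithm: resample_nearest is a hypothetical name, the grid is assumed to sit on multiples of the frequency between the first and last points, and ties are assumed to go to the earlier point.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in ns, value)


def resample_nearest(points: List[Point], frequency_ns: int) -> List[Point]:
    """Resample a sorted series onto a regular grid using the nearest point."""
    first_t, last_t = points[0][0], points[-1][0]
    # First multiple of the frequency at or after the first point (ceiling division).
    start = -(-first_t // frequency_ns) * frequency_ns
    grid = range(start, last_t + 1, frequency_ns)
    # min() returns the earliest point when two points are equally close.
    return [(t, min(points, key=lambda p: abs(p[0] - t))[1]) for t in grid]


series_1 = [(1, 1.0), (101, 2.0), (200, 4.0), (201, 8.0)]
resampled = resample_nearest(series_1, 10)

for timestamp, value in resampled:
    print(timestamp, value)
```

Note that none of the original timestamps except 200 appear in the result: only grid timestamps are created, and each takes its value from the closest original point.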
The before, internal, and after strategies each default to NONE.
Column name | Type | Description |
---|---|---|
timestamp | pandas.Timestamp | Timestamp of the point |
value | Union[float, str] | Value of the point |
Do not use LINEAR interpolation for enum series, or this operation will fail.
The output of this function can be a single-series or a multi-series dataframe, and it only works with other functions that expect the respective kind of input.

    >>> series_1 = F.points((1, 1.0), (101, 2.0), (200, 4.0), (201, 8.0), name="series-1")
    >>> series_2 = F.points((2, 11.0), (102, 12.0), (201, 14.0), (202, 18.0), name="series-2")
    >>> series_1.to_pandas()
                          timestamp  value
    0 1970-01-01 00:00:00.000000001    1.0
    1 1970-01-01 00:00:00.000000101    2.0
    2 1970-01-01 00:00:00.000000200    4.0
    3 1970-01-01 00:00:00.000000201    8.0
    >>> series_2.to_pandas()
                          timestamp  value
    0 1970-01-01 00:00:00.000000002   11.0
    1 1970-01-01 00:00:00.000000102   12.0
    2 1970-01-01 00:00:00.000000201   14.0
    3 1970-01-01 00:00:00.000000202   18.0
    >>> nc = NodeCollection([series_1, series_2])

    >>> linear_interpolated_series = F.interpolate(internal="LINEAR")(nc)
    >>> linear_interpolated_series.to_pandas()
                          timestamp  series-1   series-2
    0 1970-01-01 00:00:00.000000001  1.000000        NaN
    1 1970-01-01 00:00:00.000000002  1.010000  11.000000
    2 1970-01-01 00:00:00.000000101  2.000000  11.990000
    3 1970-01-01 00:00:00.000000102  2.020202  12.000000
    4 1970-01-01 00:00:00.000000200  4.000000  13.979798
    5 1970-01-01 00:00:00.000000201  8.000000  14.000000
    6 1970-01-01 00:00:00.000000202       NaN  18.000000

    >>> nearest_interpolated_series = F.interpolate(internal="NEAREST")(nc)
    >>> nearest_interpolated_series.to_pandas()
                          timestamp  series-1  series-2
    0 1970-01-01 00:00:00.000000001       1.0       NaN
    1 1970-01-01 00:00:00.000000002       1.0      11.0
    2 1970-01-01 00:00:00.000000101       2.0      12.0
    3 1970-01-01 00:00:00.000000102       2.0      12.0
    4 1970-01-01 00:00:00.000000200       4.0      14.0
    5 1970-01-01 00:00:00.000000201       8.0      14.0
    6 1970-01-01 00:00:00.000000202       NaN      18.0

    >>> previous_interpolated_series = F.interpolate(internal="PREVIOUS")(nc)
    >>> previous_interpolated_series.to_pandas()
                          timestamp  series-1  series-2
    0 1970-01-01 00:00:00.000000001       1.0       NaN
    1 1970-01-01 00:00:00.000000002       1.0      11.0
    2 1970-01-01 00:00:00.000000101       2.0      11.0
    3 1970-01-01 00:00:00.000000102       2.0      12.0
    4 1970-01-01 00:00:00.000000200       4.0      12.0
    5 1970-01-01 00:00:00.000000201       8.0      14.0
    6 1970-01-01 00:00:00.000000202       NaN      18.0

    >>> next_interpolated_series = F.interpolate(internal="NEXT")(nc)
    >>> next_interpolated_series.to_pandas()
                          timestamp  series-1  series-2
    0 1970-01-01 00:00:00.000000001       1.0       NaN
    1 1970-01-01 00:00:00.000000002       2.0      11.0
    2 1970-01-01 00:00:00.000000101       2.0      12.0
    3 1970-01-01 00:00:00.000000102       4.0      12.0
    4 1970-01-01 00:00:00.000000200       4.0      14.0
    5 1970-01-01 00:00:00.000000201       8.0      14.0
    6 1970-01-01 00:00:00.000000202       NaN      18.0

    >>> none_interpolated_series = F.interpolate(internal="NONE")(nc)  # skip any missing points
    >>> none_interpolated_series.to_pandas()
                          timestamp  series-1  series-2
    0 1970-01-01 00:00:00.000000001       1.0       NaN
    1 1970-01-01 00:00:00.000000002       NaN      11.0
    2 1970-01-01 00:00:00.000000101       2.0       NaN
    3 1970-01-01 00:00:00.000000102       NaN      12.0
    4 1970-01-01 00:00:00.000000200       4.0       NaN
    5 1970-01-01 00:00:00.000000201       8.0      14.0
    6 1970-01-01 00:00:00.000000202       NaN      18.0

    >>> external_interpolated_series = F.interpolate(before="NEAREST", after="NEAREST")(nc)
    >>> external_interpolated_series.to_dataframe()
                          timestamp  series-1  series-2
    0 1970-01-01 00:00:00.000000001       1.0      11.0
    1 1970-01-01 00:00:00.000000002       NaN      11.0
    2 1970-01-01 00:00:00.000000101       2.0       NaN
    3 1970-01-01 00:00:00.000000102       NaN      12.0
    4 1970-01-01 00:00:00.000000200       4.0       NaN
    5 1970-01-01 00:00:00.000000201       8.0      14.0
    6 1970-01-01 00:00:00.000000202       8.0      18.0

    >>> interpolated_series = F.interpolate(internal=["LINEAR", "NONE"])(nc)  # different strategies for each series
    >>> interpolated_series.to_pandas()
                          timestamp  series-1  series-2
    0 1970-01-01 00:00:00.000000001  1.000000       NaN
    1 1970-01-01 00:00:00.000000002  1.010000      11.0
    2 1970-01-01 00:00:00.000000101  2.000000       NaN
    3 1970-01-01 00:00:00.000000102  2.020202      12.0
    4 1970-01-01 00:00:00.000000200  4.000000       NaN
    5 1970-01-01 00:00:00.000000201  8.000000      14.0
    6 1970-01-01 00:00:00.000000202       NaN      18.0

    >>> interpolated_every_10ns_series = F.interpolate(internal="NEAREST", frequency="10ns")(series_1)
    >>> interpolated_every_10ns_series.to_pandas()
                           timestamp  series-1
    0  1970-01-01 00:00:00.000000010       1.0
    1  1970-01-01 00:00:00.000000020       1.0
    2  1970-01-01 00:00:00.000000030       1.0
    3  1970-01-01 00:00:00.000000040       1.0
    4  1970-01-01 00:00:00.000000050       1.0
    5  1970-01-01 00:00:00.000000060       2.0
    6  1970-01-01 00:00:00.000000070       2.0
    7  1970-01-01 00:00:00.000000080       2.0
    8  1970-01-01 00:00:00.000000090       2.0
    9  1970-01-01 00:00:00.000000100       2.0
    10 1970-01-01 00:00:00.000000110       2.0
    11 1970-01-01 00:00:00.000000120       2.0
    12 1970-01-01 00:00:00.000000130       2.0
    13 1970-01-01 00:00:00.000000140       2.0
    14 1970-01-01 00:00:00.000000150       2.0
    15 1970-01-01 00:00:00.000000160       4.0
    16 1970-01-01 00:00:00.000000170       4.0
    17 1970-01-01 00:00:00.000000180       4.0
    18 1970-01-01 00:00:00.000000190       4.0
    19 1970-01-01 00:00:00.000000200       4.0

    >>> multiple_interpolated_every_10ns_series = F.interpolate(
    ...     internal="NEAREST", frequency="10ns"
    ... )(nc)
    >>> multiple_interpolated_every_10ns_series.to_pandas()
                           timestamp  series-1  series-2
    0  1970-01-01 00:00:00.000000010       1.0      11.0
    1  1970-01-01 00:00:00.000000020       1.0      11.0
    2  1970-01-01 00:00:00.000000030       1.0      11.0
    3  1970-01-01 00:00:00.000000040       1.0      11.0
    4  1970-01-01 00:00:00.000000050       1.0      11.0
    5  1970-01-01 00:00:00.000000060       2.0      12.0
    6  1970-01-01 00:00:00.000000070       2.0      12.0
    7  1970-01-01 00:00:00.000000080       2.0      12.0
    8  1970-01-01 00:00:00.000000090       2.0      12.0
    9  1970-01-01 00:00:00.000000100       2.0      12.0
    10 1970-01-01 00:00:00.000000110       2.0      12.0
    11 1970-01-01 00:00:00.000000120       2.0      12.0
    12 1970-01-01 00:00:00.000000130       2.0      12.0
    13 1970-01-01 00:00:00.000000140       2.0      12.0
    14 1970-01-01 00:00:00.000000150       2.0      12.0
    15 1970-01-01 00:00:00.000000160       4.0      14.0
    16 1970-01-01 00:00:00.000000170       4.0      14.0
    17 1970-01-01 00:00:00.000000180       4.0      14.0
    18 1970-01-01 00:00:00.000000190       4.0      14.0
    19 1970-01-01 00:00:00.000000200       4.0      14.0