Returns a function that will generate the list of aligned (x,y) points for exactly two time series.
A scatter plot consists of (x,y) coordinates. For two given time series, an (x,y) coordinate will consist of a point from each series where the timestamps match. For points where the underlying series timestamps do not match, the configured interpolation strategy will be used for the series missing a point at that timestamp.
For the interpolation strategies supported by before, internal and after, see interpolate().
Additionally, you can pass a regression function to find the best-fit line across the points in the graph.
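before (optional) – Interpolation strategy applied before the first point of a series; see interpolate() (default is NONE).
internal (optional) – Interpolation strategy applied between existing points of a series; see interpolate() (default is LINEAR).
after (optional) – Interpolation strategy applied after the last point of a series; see interpolate() (default is NONE).
regression (linear_regression() | polynomial_regression() | exponential_regression(), optional) – Output of one of the regression functions. It provides the points for the best-fit line between the two input series, along with other related metrics (defaults to no regression).

Column name | Type | Description |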
---|---|---|
is_truncated | bool | Whether the output was truncated for a large series. This field is deprecated and should be ignored. |
points.first_value | float | Value of the point in the first series. |
points.second_value | float | Value of the point in the second series. |
points.timestamp | datetime | Timestamp of the points. |
regression.* | float | Columns from the regression function (if regression is used). |
This function is only applicable to numeric series.
>>> series_1 = F.points((11, 21.0), (13, 23.0), (15, 25.0), (17, 27.0), name="series-1")
>>> series_2 = F.points((11, 21.0), (13, 23.0), (17, 37.0), (37, 47.0), name="series-2")
>>> series_1.to_pandas()
                      timestamp  value
0 1970-01-01 00:00:00.000000011   21.0
1 1970-01-01 00:00:00.000000013   23.0
2 1970-01-01 00:00:00.000000015   25.0
3 1970-01-01 00:00:00.000000017   27.0
>>> series_2.to_pandas()
                      timestamp  value
0 1970-01-01 00:00:00.000000011   21.0
1 1970-01-01 00:00:00.000000013   23.0
2 1970-01-01 00:00:00.000000017   37.0
3 1970-01-01 00:00:00.000000037   47.0
>>> nc = NodeCollection([series_1, series_2])
>>> scatter_plot = F.scatter(  # scatter plot with interpolation
...     before="NEAREST",
...     internal="LINEAR",
...     after="NEAREST",
... )(nc)
>>> scatter_plot.to_pandas()
   is_truncated  points.first_value  points.second_value              points.timestamp
0         False                21.0                 21.0 1970-01-01 00:00:00.000000011
1         False                23.0                 23.0 1970-01-01 00:00:00.000000013
2         False                25.0                 30.0 1970-01-01 00:00:00.000000015
3         False                27.0                 37.0 1970-01-01 00:00:00.000000017
4         False                27.0                 47.0 1970-01-01 00:00:00.000000037
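To see where the interpolated values in the output above come from (30.0 for the second series at timestamp 15, and 27.0 for the first series at timestamp 37), the alignment can be reproduced by hand. The following is a minimal pure-Python sketch, not the library's implementation. The helpers `value_at` and `align` are hypothetical names, and the sketch assumes that LINEAR interpolates between the two neighbouring points and that NEAREST before or after a series reuses its first or last value.

```python
from bisect import bisect_left


def value_at(points, t):
    """Value of a sorted (timestamp, value) series at time t.

    Assumed semantics, for illustration only: LINEAR between points,
    NEAREST before the first point and after the last point.
    """
    times = [ts for ts, _ in points]
    i = bisect_left(times, t)
    if i < len(times) and times[i] == t:
        return points[i][1]       # timestamps match: use the real point
    if i == 0:
        return points[0][1]       # before the series: nearest is the first point
    if i == len(points):
        return points[-1][1]      # after the series: nearest is the last point
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)  # linear interpolation


def align(first, second):
    """Build (first_value, second_value, timestamp) rows over all timestamps."""
    timestamps = sorted({ts for ts, _ in first} | {ts for ts, _ in second})
    return [(value_at(first, t), value_at(second, t), t) for t in timestamps]


series_1 = [(11, 21.0), (13, 23.0), (15, 25.0), (17, 27.0)]
series_2 = [(11, 21.0), (13, 23.0), (17, 37.0), (37, 47.0)]

aligned = align(series_1, series_2)
for row in aligned:
    print(row)
```

Under these assumptions the five rows match the `points.first_value`, `points.second_value` and `points.timestamp` columns shown above: the second series is halfway between 23.0 and 37.0 at timestamp 15, and the first series holds its last value, 27.0, at timestamp 37.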
>>> lin_regression_scatter_plot = F.scatter(
...     before="NEAREST",
...     internal="LINEAR",
...     after="NEAREST",
...     regression=F.linear_regression(),
... )(nc)
>>> lin_regression_scatter_plot.to_pandas()
   is_truncated  points.first_value  points.second_value  points.timestamp  regression.max_bounds.first_value  regression.max_bounds.second_value  regression.min_bounds.first_value  regression.min_bounds.second_value  regression.regression_fit_function.linear_regression_fit.intercept  regression.regression_fit_function.linear_regression_fit.slope  regression.regression_fit_function.linear_regression_fit.statistics.rsquared
0  False  21.0  21.0  1970-01-01 00:00:00.000000011  27.0  47.0  21.0  21.0  -59.926471  3.720588  0.827161
1  False  23.0  23.0  1970-01-01 00:00:00.000000013  27.0  47.0  21.0  21.0  -59.926471  3.720588  0.827161
2  False  25.0  30.0  1970-01-01 00:00:00.000000015  27.0  47.0  21.0  21.0  -59.926471  3.720588  0.827161
3  False  27.0  37.0  1970-01-01 00:00:00.000000017  27.0  47.0  21.0  21.0  -59.926471  3.720588  0.827161
4  False  27.0  47.0  1970-01-01 00:00:00.000000037  27.0  47.0  21.0  21.0  -59.926471  3.720588  0.827161
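The regression columns above can be checked against the five aligned points. The sketch below is a plain ordinary-least-squares fit of `points.second_value` on `points.first_value`, written in pure Python. It assumes that this is what `linear_regression()` computes here; `least_squares` is a hypothetical helper name and is not part of the library.

```python
def least_squares(points):
    """Ordinary least-squares fit y = slope * x + intercept, plus R squared."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rsquared = sxy ** 2 / (sxx * syy)
    return slope, intercept, rsquared


# (first_value, second_value) pairs taken from the scatter output above
points = [(21.0, 21.0), (23.0, 23.0), (25.0, 30.0), (27.0, 37.0), (27.0, 47.0)]

slope, intercept, rsquared = least_squares(points)
print(round(slope, 6), round(intercept, 6), round(rsquared, 6))

# Assumption: the bounds columns are the per-axis minimum and maximum.
min_bounds = (min(x for x, _ in points), min(y for _, y in points))
max_bounds = (max(x for x, _ in points), max(y for _, y in points))
print(min_bounds, max_bounds)
```

Rounded to six decimals, this gives a slope of 3.720588, an intercept of -59.926471 and an R squared of 0.827161, which agrees with the values in the table. The per-axis minimum and maximum, (21.0, 21.0) and (27.0, 47.0), likewise agree with the `regression.min_bounds.*` and `regression.max_bounds.*` columns.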