foundryts.functions.linear_regression

foundryts.functions.linear_regression(include_intercept=True, time_unit='ns', start=None, end=None)

Returns a function that performs linear regression on a single time series.

Linear regression finds the parameters of the best-fit line through the points of the input time series. The line is expressed as y = Ax + B, where A is the slope and B is the y-intercept. The returned function provides the parameters A and B.

Linear regression is useful when you need to identify and quantify a linear trend in your time series data.
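
To make the best-fit line concrete, here is a minimal pure-Python sketch of an ordinary least-squares fit of y = Ax + B. It is an illustration only: `fit_line` is a hypothetical helper, not part of foundryts, and ordinary least squares is assumed as the fitting method since the reference text does not name one.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = A*x + B to (x, y) pairs.

    Hypothetical helper for illustration; not a foundryts function.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx                      # A
    intercept = mean_y - slope * mean_x    # B
    return slope, intercept


# Points that lie exactly on y = 2x + 1.
points = [(0, 1.0), (1, 3.0), (2, 5.0)]
slope, intercept = fit_line(points)
print(slope, intercept)  # 2.0 1.0
```

For a time series, x would be the timestamp of each point and y its value.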

  • Parameters:
    • time_unit (str, optional) – The time unit of the coefficients; must be one of “s”, “ms”, “us”, “ns” (default is “ns”).
    • start (str | int | datetime.datetime, optional) – Starting point (inclusive) of the time series for computing the linear regression.
    • end (str | int | datetime.datetime, optional) – End point (exclusive) of the time series for computing the linear regression.
  • Returns: A function that accepts a single time series and returns parameters for the best-fit line for the points in the time series using linear regression.
  • Return type: (FunctionNode) -> SummarizerNode
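
The following sketch illustrates the start (inclusive) and end (exclusive) semantics and one possible reading of time_unit. Both helpers are hypothetical and not part of foundryts. The assumption that time_unit only rescales the slope from "value per nanosecond" to "value per time_unit" is a guess at the behavior, not something the reference confirms.

```python
# Hypothetical helpers for illustration; not part of foundryts.
NS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def select_window(points, start=None, end=None):
    """Keep points with start <= timestamp < end (start inclusive, end exclusive)."""
    return [
        (t, v)
        for t, v in points
        if (start is None or t >= start) and (end is None or t < end)
    ]


def slope_in_unit(slope_per_ns, time_unit="ns"):
    """Rescale a slope from value/ns to value/time_unit (assumed behavior)."""
    if time_unit not in NS_PER_UNIT:
        raise ValueError("time_unit must be one of 's', 'ms', 'us', 'ns'")
    return slope_per_ns * NS_PER_UNIT[time_unit]


# Timestamps are integer nanoseconds since the epoch.
points = [(10, 6.0), (20, 12.0), (30, 24.0), (40, 48.0), (50, 96.0)]
print(select_window(points, start=20, end=50))
# [(20, 12.0), (30, 24.0), (40, 48.0)]
print(slope_in_unit(0.5, "us"))  # 500.0
```

Note that the point at timestamp 50 is dropped because end is exclusive, while the point at timestamp 20 is kept because start is inclusive.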

Dataframe schema

| Column name | Type | Description |
| --- | --- | --- |
| max_bounds.first_value | float | Maximum value of the slope (A) in y=Ax+B. |
| max_bounds.second_value | float | Maximum value of the intercept (B) in y=Ax+B. |
| min_bounds.first_value | float | Minimum value of the slope (A) in y=Ax+B. |
| min_bounds.second_value | float | Minimum value of the intercept (B) in y=Ax+B. |
| regression_fit_function.linear_regression_fit.slope | float | Parameter ‘A’ (slope) of the linear regression fit in y=Ax+B. |
| regression_fit_function.linear_regression_fit.intercept | float | Parameter ‘B’ (intercept) of the linear regression fit in y=Ax+B. |
| regression_fit_function.linear_regression_fit.statistics.rsquared | float | R-squared value indicating the goodness of fit of the linear regression. |

Note

This function is only applicable to numeric series.

Examples

```python
>>> import foundryts.functions as F  # assumed import; F refers to foundryts.functions
>>> series = F.points(
...     (10, 6.0), (20, 12.0), (30, 24.0), (40, 48.0), (50, 96.0), name="series"
... )
>>> series.to_pandas()
                      timestamp  value
0 1970-01-01 00:00:00.000000010    6.0
1 1970-01-01 00:00:00.000000020   12.0
2 1970-01-01 00:00:00.000000030   24.0
3 1970-01-01 00:00:00.000000040   48.0
4 1970-01-01 00:00:00.000000050   96.0
```
```python
>>> lin_regr = F.linear_regression()(series)
>>> lin_regr.to_pandas()
   max_bounds.first_value  max_bounds.second_value  min_bounds.first_value  min_bounds.second_value  regression_fit_function.linear_regression_fit.intercept  regression_fit_function.linear_regression_fit.slope  regression_fit_function.linear_regression_fit.statistics.rsquared
0                    50.0                     96.0                    10.0                      6.0                                                    -27.6                                                 2.16                                                           0.870968
```
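
The slope, intercept, and R-squared values in this output can be cross-checked by hand. The sketch below recomputes them in plain Python from the five example points, assuming an ordinary least-squares fit with timestamps taken as integer nanoseconds. It does not call foundryts, and all variable names are illustrative.

```python
# The five (timestamp_ns, value) points from the example above.
points = [(10, 6.0), (20, 12.0), (30, 24.0), (40, 48.0), (50, 96.0)]

n = len(points)
sum_x = sum(x for x, _ in points)
sum_y = sum(y for _, y in points)
sum_xy = sum(x * y for x, y in points)
sum_xx = sum(x * x for x, _ in points)
sum_yy = sum(y * y for _, y in points)

# Scaled covariance and variances (each multiplied by n squared).
cov_xy = n * sum_xy - sum_x * sum_y
var_x = n * sum_xx - sum_x ** 2
var_y = n * sum_yy - sum_y ** 2

slope = cov_xy / var_x                       # A in y = Ax + B
intercept = (sum_y - slope * sum_x) / n      # B in y = Ax + B
rsquared = cov_xy ** 2 / (var_x * var_y)     # squared correlation of x and y

print(round(slope, 6), round(intercept, 6), round(rsquared, 6))
# 2.16 -27.6 0.870968

# Smallest and largest timestamp and value in the input.
xs = [x for x, _ in points]
ys = [y for _, y in points]
print(min(xs), max(xs), min(ys), max(ys))  # 10 50 6.0 96.0
```

The recomputed values agree with the example output. In this example, the min_bounds and max_bounds columns (10.0, 6.0, 50.0, 96.0) also coincide with the smallest and largest timestamp and value of the input series.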