The regression trainer trains a number of models, in parallel or sequentially, to determine the best-performing model. It may also take advantage of ensembling if it determines that a weighted ensemble of models performs better than any single model. This trainer is built on AutoGluon's TabularPredictor class.
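To illustrate why a weighted ensemble can outperform every individual model, the stdlib-only sketch below (illustrative, not AutoGluon's actual implementation) searches candidate weight vectors over the validation predictions of two hypothetical, already-trained models and keeps the blend with the lowest validation error:

```python
# Hypothetical validation predictions from two already-trained models,
# alongside the true targets. These numbers are invented for illustration.
preds_a = [1.0, 2.0, 3.0, 4.0]   # model A undershoots at the extremes
preds_b = [3.0, 2.0, 3.0, 6.0]   # model B overshoots at the extremes
truth   = [2.0, 2.0, 3.0, 5.0]

def mse(preds):
    """Mean squared error against the validation targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth)

best_w, best_err = None, float("inf")
for w in [i / 10 for i in range(11)]:  # candidate weights for model A
    blend = [w * a + (1 - w) * b for a, b in zip(preds_a, preds_b)]
    err = mse(blend)
    if err < best_err:
        best_w, best_err = w, err

print(best_w, round(best_err, 3))  # → 0.5 0.0
```

Here each model alone has an MSE of 0.5, but the equal-weight blend is exact, so the ensemble is selected over either single model.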
The regression trainer may also perform stacking or bagging to further improve performance. Stacking is an ensembling process in which the predictions produced by the set of models trained on the input data are used to train a further "layer" of models; this process may be repeated. Bagging is an internal optimization method in which each model architecture is trained on multiple random samples of the data and the outputs are combined in an ensemble. Stacking and bagging are controlled by the Training preset parameter.
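The two ideas can be sketched in a few lines of stdlib-only Python. This is a toy illustration of the concepts, not the trainer's actual code: bagging fits the same model type on bootstrap samples and averages the outputs, and stacking turns base-model predictions into features for a next layer (here a simple average stands in for that second-layer model):

```python
import random
import statistics

def fit_constant(sample):
    """Toy 'model': predicts the mean target of its training sample."""
    mean = statistics.mean(y for _, y in sample)
    return lambda x: mean

def fit_slope(sample):
    """Toy 'model': predicts y = w * x with w fit by least squares."""
    num = sum(x * y for x, y in sample)
    den = sum(x * x for x, _ in sample) or 1.0
    return lambda x, w=num / den: w * x

def bag(fit, data, n_bags=5, rng=random.Random(0)):
    """Bagging: fit one model per bootstrap sample, average their outputs."""
    models = [fit([rng.choice(data) for _ in data]) for _ in range(n_bags)]
    return lambda x: statistics.mean(m(x) for m in models)

data = [(float(x), 2.0 * x) for x in range(1, 11)]  # toy data, y = 2x

# Stacking: base-model predictions become the features for a next "layer".
base_models = [bag(fit_constant, data), bag(fit_slope, data)]
layer2_data = [(tuple(m(x) for m in base_models), y) for x, y in data]
# A second-layer model would now be trained on layer2_data; here we simply
# average the base predictions as a stand-in for that trained layer.
stacked = lambda x: statistics.mean(m(x) for m in base_models)
print(stacked(5.0))
```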
The structure of the model ensemble can be viewed on the output model's experiment page under Plots.
Internally, multiple models will be trained unless this is disabled by excluding specific models or through the hyperparameter field.
The following types of models are available for training:
While some of these models are strictly classification models, they can still be applied in regression when stacking is used.
best_quality
: Enables stacking and bagging. Use this preset when the best possible model is required, even at the cost of training and inference speed.

high_quality
: Enables stacking and bagging. Training and inference should be faster than with the best_quality preset, but the model may be slightly less accurate.

good_quality
: Enables stacking and bagging. This preset trains and predicts faster than the previous presets, with decent predictive accuracy.

medium_quality
: Disables stacking and bagging. This preset has the fastest training time but only moderate predictive accuracy; use it for prototyping.

The optional stacking configuration fields allow for deeper control over how the model ensemble is constructed:
- If enabled, a WeightedEnsemble model will be produced at each stacking layer. We recommend enabling this for improved model performance.
- If enabled, the WeightedEnsemble of the last stacking layer will be fit with models from all previous layers as base models. This option has no effect when using a preset that disables stacking. We recommend enabling this for improved model performance.
- If enabled, a WeightedEnsemble model will be produced on top of the weighted ensembles produced when fit last ensemble with all models is enabled. This option has no effect if fit last ensemble with all models is disabled, or when using a preset that disables stacking.

The optional hyperparameter field allows for deeper customization and control over each trained model, with one caveat: when this field is defined, any model not supplied in it will be ignored. This field overrides the default hyperparameters chosen by AutoGluon and can produce poor results. For this reason, we recommend avoiding this field unless you have consulted the AutoGluon documentation. In the majority of cases, the default hyperparameters provide strong enough results.
In general, the arguments passed here will be sent directly to the underlying model implementation.
Take the example below:
```json
{
  "GBM": [
    {"extra_trees": true},
    {}
  ],
  "NN_TORCH": {}
}
```
These hyperparameters enforce that the following models are trained:

- Two GBM models: one with extra_trees set to true, and one with default hyperparameters.
- One NN_TORCH model with default hyperparameters.

With the provided hyperparameters, these are the only models that will be trained before any ensembling or stacking operations are applied.
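The expansion rule can be sketched as follows. This is a stdlib-only illustration of how such a mapping unfolds into individual model configurations, not AutoGluon code: a list value yields one model per entry, while a single dict yields exactly one model.

```python
def expand_hyperparameters(hyperparameters):
    """Expand a hyperparameters mapping into (model_type, config) pairs.

    Illustrative only: a list value yields one model per entry; a single
    dict yields exactly one model with that configuration.
    """
    configs = []
    for model_type, value in hyperparameters.items():
        entries = value if isinstance(value, list) else [value]
        for config in entries:
            configs.append((model_type, config))
    return configs

hp = {"GBM": [{"extra_trees": True}, {}], "NN_TORCH": {}}
print(expand_hyperparameters(hp))
# → [('GBM', {'extra_trees': True}), ('GBM', {}), ('NN_TORCH', {})]
```

Three model configurations result before any ensembling: two GBM variants and one default NN_TORCH model.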
The regression trainer will output a Foundry model that contains the best model as determined by the validation steps. Details about the model can be accessed by navigating to the experiment, which will contain parameters, metrics, and plots that provide insight into the model's performance.