ModelInput in transforms

The ModelInput class allows you to load and use models within Python transforms, making it easy to incorporate model inference logic into your data pipelines. To learn more about using models in code workspaces, you can review details on the ModelInput class in Jupyter® Code Workspaces.

Class definition

```python
from palantir_models.transforms import ModelInput

ModelInput(
    alias,                    # (string) Path or RID of model to load
    model_version=None,       # (Optional) RID of specific model version
    use_sidecar=False,        # (Optional) Run model in separate container
    sidecar_resources=None    # (Optional) Resource configuration for sidecar
)
```

Parameters

| Parameter | Type | Description | Version / Notes |
| --- | --- | --- | --- |
| alias | str | Path or resource ID (RID) of the model resource to load. | |
| model_version | Optional[str] | RID or semantic version of the specific model version to use. If not specified, the latest version will be used. | |
| use_sidecar | Optional[bool] | When True, runs the model in a separate container to prevent dependency conflicts between the model adapter and transform environment. Note that Lightweight transforms do not support using sidecars. | Introduced in palantir_models version 0.1673.0 |
| sidecar_resources | Optional[Dict[str, Union[float, int]]] | Resource configuration for the sidecar container. This parameter can only be used when use_sidecar is set to True. | Introduced in palantir_models version 0.1673.0 |

The sidecar_resources parameter supports the following options:

| Option | Type | Description |
| --- | --- | --- |
| "cpus" | float | Number of CPUs for the sidecar container |
| "memory_gb" | float | Memory in GB for the sidecar container |
| "gpus" | int | Number of GPUs for the sidecar container |

Usage notes

The code snippet below demonstrates the usage of a model in a transform. The platform creates an instance of the model adapter class that was defined for the model version, giving you access to the methods defined in the adapter. The following example assumes the adapter's API declares a single Pandas DataFrame input and a single Pandas DataFrame output called output_df. The transform method on the model adapter, which leverages your provided predict method, automatically converts data_in, a TransformInput instance, into the tabular input (either a Spark or Pandas DataFrame) expected by your model adapter as defined in its API.

```python
from transforms.api import Input, lightweight, Output, transform, TransformInput, TransformOutput
from palantir_models import ModelAdapter
from palantir_models.transforms import ModelInput

# Use Lightweight if the model does not require Spark
@lightweight
@transform(
    out=Output('path/to/output'),
    model_input=ModelInput(
        "path/to/my/model",
        # Use specific model version.
        # The model version can be copied from the left sidebar on the model page.
        model_version="ri.models.main.model-version.74b03bd6-5715-4904-85f8-4a29499e05a3"
    ),
    data_in=Input("path/to/input")
)
def my_transform(out: TransformOutput, model_input: ModelAdapter, data_in: TransformInput) -> None:
    inference_results = model_input.transform(data_in)
    predictions = inference_results.output_df

    # Alternatively, you can use the predict method on
    # a Pandas DataFrame instance directly:
    # predictions = model_input.predict(data_in.pandas())

    out.write_pandas(predictions)
```

By default, transforms run on Spark; in that case, the model adapter instance is loaded on the driver, and distributing work over the executors requires additional logic. The only exception among the serializers defined in palantir_models is the SparkMLAutoSerializer class, designed for Spark ML models. The SparkMLAutoSerializer handles distributing the model to each executor, resulting in a model instance that natively runs on executors over Spark DataFrame inputs.
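
For illustration, the sketch below shows such a model used in a standard (non-lightweight) Spark transform. It assumes the adapter's API declares a single Spark DataFrame input and a single Spark DataFrame output named output_df; the paths are placeholders.

```python
from transforms.api import Input, Output, transform, TransformInput, TransformOutput
from palantir_models import ModelAdapter
from palantir_models.transforms import ModelInput

@transform(
    out=Output('path/to/output'),
    model_input=ModelInput("path/to/my/sparkml/model"),  # placeholder path
    data_in=Input("path/to/input")
)
def my_spark_transform(out: TransformOutput, model_input: ModelAdapter, data_in: TransformInput) -> None:
    # Assuming the adapter API expects a Spark DataFrame, the Spark ML model runs
    # natively on the executors; no driver-side .pandas() conversion or
    # distribution wrapper is needed.
    results = model_input.transform(data_in)
    out.write_dataframe(results.output_df)  # output_df is an assumed output name
```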

For this reason, we recommend simply using the lightweight decorator for most use cases. If required, and if your model expects a single pandas DataFrame input and output, you can use the DistributedInferenceWrapper to handle the distribution of the model over the executors, as described below.

Importing the adapter code

To instantiate the model adapter class, the environment must have access to the model adapter code. In particular, if the model was created in a different repository, the adapter code, which is packaged alongside the model as a Python library, needs to be imported as a dependency in your repository. The application will prompt you to do this, as shown in the screenshot below.

Import dependencies if the model is from another repository.

Specifying a version

You can specify a particular model version using the model_version parameter. This is especially recommended if the model is not being retrained on a regular schedule as it helps prevent an unintended or problematic model from reaching production. If you do not specify a model version, the system will use the latest model available on the build’s branch by default.
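
For example, both forms below pin the model version; either is passed as the model_input argument of the transform, as in the usage example above (the semantic version value is a placeholder):

```python
from palantir_models.transforms import ModelInput

# Pin by model version RID:
pinned_by_rid = ModelInput(
    "path/to/my/model",
    model_version="ri.models.main.model-version.74b03bd6-5715-4904-85f8-4a29499e05a3"
)

# Or pin by semantic version ("1.2.0" is a placeholder value):
pinned_by_semver = ModelInput(
    "path/to/my/model",
    model_version="1.2.0"
)
```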

Note that if no version is specified, each transform run will automatically fetch the latest model files for the model input. However, it will not automatically update the adapter library version in the repository (the library containing the adapter logic you authored for that version and its Python dependencies) if the model was generated outside the repository where it is being used. To update the library version, select the appropriate adapter version in the repository's Libraries sidebar and verify that all checks pass. The adapter version corresponding to each model version can be found on the model's page under Inference configuration.

If this workflow does not suit your needs, consider either using the model within the same repository where it is created or setting use_sidecar to True, as explained below.

Running models as sidecar containers

Running a model in a sidecar container (use_sidecar=True) is recommended for most use cases where the model was built in a code workspace or in a repository different from the one being used for generating predictions.

The main benefit of running the model as a sidecar is that the exact same library versions used to produce the model will also be used to run inference with it. In contrast, importing the adapter code as prompted by the repository user interface will create a new environment solve that merges the constraints from the adapter code and the repository. This may result in different library versions being used.

Additionally, when using a sidecar container to run the model, the adapter code corresponding to the model version being used will automatically be loaded in the sidecar without the user having to manually update the dependency and run checks in the repository.

When using a sidecar, predict() requests are automatically routed to the sidecar container without any additional code changes required:

```python
from transforms.api import Input, Output, transform, TransformInput, TransformOutput
from palantir_models import ModelAdapter
from palantir_models.transforms import ModelInput

# Lightweight is not supported with `use_sidecar`
@transform(
    out=Output('path/to/output'),
    model_input=ModelInput(
        "path/to/my/model",
        use_sidecar=True,
    ),
    data_in=Input("path/to/input")
)
def my_transform(out: TransformOutput, model_input: ModelAdapter, data_in: TransformInput) -> None:
    predictions = model_input.transform(data_in)
    out.write_pandas(predictions)
```

Specifying resources with the sidecar

The example below provisions a sidecar container alongside the driver and each executor, each with 2 CPUs, 4 GB of memory, and 1 GPU.

```python
from transforms.api import Input, Output, transform, TransformInput, TransformOutput
from palantir_models import ModelAdapter
from palantir_models.transforms import ModelInput

@transform(
    out=Output('path/to/output'),
    model_input=ModelInput(
        "path/to/my/model",
        use_sidecar=True,
        sidecar_resources={
            "cpus": 2.0,
            "memory_gb": 4.0,
            "gpus": 1
        }
    ),
    data_in=Input("path/to/input")
)
def my_transform(out: TransformOutput, model_input: ModelAdapter, data_in: TransformInput) -> None:
    ...
```

Distributed inference using Spark executors

You can run distributed model inference using Spark executors. This approach can be beneficial for batch inference involving computationally heavy models or large datasets, with near-linear scalability.

Consider the following code snippet demonstrating how you can wrap an existing model for distributed inference:

```python
from transforms.api import transform, Input, Output, configure
from palantir_models.transforms import ModelInput, DistributedInferenceWrapper

@transform(
    input_df=Input("ri.foundry.main.dataset.3cd098b3-aae0-455a-9383-4eec810e0ac0"),
    model_input=ModelInput("ri.models.main.model.5b758039-370c-4cfc-835e-5bd3f213454c"),
    output=Output("ri.foundry.main.dataset.c0a3edbc-c917-4f20-88f1-d797ebf27cb2"),
)
def compute(ctx, input_df, model_input, output):
    model_input = DistributedInferenceWrapper(model_input, ctx, 'auto')
    inference_outputs = model_input.predict(input_df.dataframe())
    output.write_dataframe(inference_outputs)
```

The DistributedInferenceWrapper class is initialized with the following parameters:

| Parameter | Type | Description | Notes |
| --- | --- | --- | --- |
| model | ModelAdapter | The model adapter instance to be wrapped. This is typically the model_input provided by ModelInput. | |
| ctx | TransformContext | The transform context, used to access Spark session information. This is typically the ctx argument of your transform function. | |
| num_partitions | Union[Literal["auto"], int] | Number of partitions to use for the Spark DataFrame. If 'auto', it will be set to match the number of Spark executors. If you experience Out Of Memory (OOM) errors, try increasing this value. | Default: 'auto' |
| max_rows_per_chunk | int | Spark splits each partition into chunks before sending them to the model. This parameter sets the maximum number of rows allowed per chunk. More rows per chunk means less overhead but more memory usage. | Default: 1,000,000 |
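
If the defaults do not suit your workload, the partitioning behavior can be tuned explicitly. The sketch below assumes the parameters above can be passed as keyword arguments when constructing the wrapper; the values shown are illustrative only:

```python
# Inside the transform function body, replacing the 'auto' construction above.
# num_partitions and max_rows_per_chunk follow the parameter table; the values
# are placeholders to tune against memory pressure and overhead.
model_input = DistributedInferenceWrapper(
    model_input,
    ctx,
    num_partitions=16,           # e.g. more partitions to reduce per-partition memory
    max_rows_per_chunk=100_000   # smaller chunks lower memory usage at the cost of overhead
)
```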

Usage notes:

  • You can configure the number of executors through Spark profiles (see the sketch after this list).
  • The distributed wrapper uses Spark’s user-defined functions (UDFs).
  • The model should have one input and one output dataset, with any number of input parameters.
  • The input and output must be in the Pandas DataFrame format. Spark DataFrame is not supported.
  • Usage of the use_sidecar parameter (described above) is optional.
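
As a sketch of the first note above, the executor count can be controlled with the configure decorator and a Spark profile; the profile name below is a placeholder, as the available profiles vary by environment:

```python
from transforms.api import configure, transform, Input, Output
from palantir_models.transforms import ModelInput, DistributedInferenceWrapper

# "NUM_EXECUTORS_8" is a placeholder; use a Spark profile that is actually
# available and imported in your project to set the executor count.
@configure(profile=["NUM_EXECUTORS_8"])
@transform(
    input_df=Input("path/to/input"),
    model_input=ModelInput("path/to/my/model"),
    output=Output("path/to/output"),
)
def compute(ctx, input_df, model_input, output):
    # With 'auto', the wrapper creates one partition per available executor.
    model_input = DistributedInferenceWrapper(model_input, ctx, 'auto')
    output.write_dataframe(model_input.predict(input_df.dataframe()))
```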