Migrate from foundry_ml to palantir_models

The foundry_ml Python library has entered its sunset period and will be deprecated on October 31, 2025, alongside the planned deprecation of Python 3.9. In its place, we recommend using the palantir_models framework to develop, test, and serve models in the platform.

Models trained with foundry_ml need to be updated to use the palantir_models framework by October 31, 2025. After that date, models developed with foundry_ml will not be supported in modeling objectives, Python transforms, or modeling objective deployments. Models built with foundry_ml in Code Repositories need to be rebuilt in a new code repository initialized with the Model Training template. Models built with foundry_ml in Code Workbooks need to be rebuilt in Jupyter Code Workspaces. For guidance on building a new model with palantir_models, review how to train a model in Code Repositories or how to train a model in a Jupyter notebook.
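Before migrating, it can help to list where foundry_ml is still imported. The following is a minimal sketch, assuming you have a local clone of the repository to inspect; the helper name find_foundry_ml_imports is hypothetical and is not part of any Palantir library. It uses only the Python standard library and does not cover Code Workbooks.

```python
import ast
from pathlib import Path


def find_foundry_ml_imports(root):
    """Return sorted (file, line number) pairs for each foundry_ml import under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed as Python
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]
            else:
                continue
            # Match the package itself and any of its submodules.
            if any(n == "foundry_ml" or n.startswith("foundry_ml.") for n in names):
                hits.append((str(path), node.lineno))
    return sorted(hits)
```

Calling find_foundry_ml_imports(".") from the root of a cloned repository returns the files and line numbers that still need to be rewritten.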

In this tutorial, we will migrate a model built with foundry_ml in Code Repositories to palantir_models. We provide the code that builds the model using foundry_ml, then demonstrate how to rewrite that code using palantir_models.

Example model built with foundry_ml in Code Repositories

In the following snippet, we have authored a scikit-learn linear regression model using foundry_ml, which we will migrate to palantir_models:

    from transforms.api import transform, Input, Output
    from foundry_ml import Model, Stage

    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler


    @transform(
        training_data_input=Input("<YOUR_PROJECT_PATH>/data/housing_train_data"),
        model_output=Output("<YOUR_PROJECT_PATH>/models/linear_regression_foundry_ml"),
    )
    def create_model(training_data_input, model_output):
        training_df = training_data_input.pandas()

        numeric_features = ['median_income', 'housing_median_age', 'total_rooms']

        pipeline = Pipeline([
            ("scaler", StandardScaler()),
            ("regressor", LinearRegression())])

        X_train = training_df[numeric_features]
        y_train = training_df['median_house_value']
        pipeline.fit(X_train, y_train)

        model = Model(Stage(pipeline["scaler"], output_column_name="features"), Stage(pipeline["regressor"]))
        model.save(model_output)

Migrate from foundry_ml to palantir_models in Code Repositories

To migrate the above code from foundry_ml to palantir_models, follow the steps outlined below.

Step 1: Open a new code repository with the Model Training template

The Palantir platform provides a templated repository for machine learning called the Model Training template. To access it in Code Repositories:

  1. Select Models when asked "What are you building?".
  2. Choose a Model Type of Code Repository.
  3. Select Model Training as the repository type, then initialize the repository.

The Model Training template contains the example structure that we will adapt for this tutorial. Expand the files on the left side to see the example project.

Step 2: Author a model adapter

Model adapters provide a standard interface for all models in Foundry. This standard interface ensures that all models can be used immediately in production applications. The Palantir platform will handle the infrastructure to load the model and its Python dependencies, interface with it, and expose its API.

To enable this, you must define a class that extends ModelAdapter to act as this communication layer.

There are 3 functions to implement:

  1. Model save and load: In order to reuse your model, you need to define how your model should be saved and loaded. Palantir provides many default methods of serialization (saving), and in more complex cases you can implement custom serialization logic.
  2. API: Defines the model's API and tells the Palantir platform what type of input data your model requires.
  3. Predict: Called by the Palantir platform to provide data to your model. This is where you can pass input data to the model and generate inferences (predictions).
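The three responsibilities listed above can be illustrated without the platform. The following is a library-free sketch: ToyAdapter, MeanModel, and round_trip are hypothetical names, and the class does not implement the palantir_models ModelAdapter interface. It only mirrors the save and load, API, and predict roles, using JSON from the standard library as a stand-in for custom serialization logic.

```python
import json
import tempfile
from pathlib import Path


class MeanModel:
    """Toy model: always predicts the mean of the training targets."""

    def fit(self, targets):
        self.mean_ = sum(targets) / len(targets)
        return self

    def predict(self, rows):
        return [self.mean_ for _ in rows]


class ToyAdapter:
    """Hypothetical stand-in, NOT the palantir_models ModelAdapter interface."""

    def __init__(self, model):
        self.model = model

    # 1. Save and load: how the wrapped model is persisted and restored.
    def save(self, path):
        Path(path).write_text(json.dumps({"mean": self.model.mean_}))

    @classmethod
    def load(cls, path):
        state = json.loads(Path(path).read_text())
        model = MeanModel()
        model.mean_ = state["mean"]
        return cls(model)

    # 2. API: the inputs the model expects and the outputs it returns.
    @classmethod
    def api(cls):
        inputs = {"rows": [("median_income", float)]}
        outputs = {"rows": [("median_income", float), ("prediction", float)]}
        return inputs, outputs

    # 3. Predict: receives input data and returns inferences.
    def predict(self, rows):
        predictions = self.model.predict(rows)
        return [dict(row, prediction=p) for row, p in zip(rows, predictions)]


def round_trip(adapter, rows):
    """Save the adapter, load it back, and predict with the restored copy."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "model.json"
        adapter.save(path)
        return ToyAdapter.load(path).predict(rows)
```

In the real adapter, the save and load role is covered by a serializer rather than hand-written methods, while api and predict are written by the model author.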

Open the model_adapters/adapter.py file and author the model adapter:

    import palantir_models as pm
    from palantir_models_serializers import DillSerializer


    class SklearnRegressionAdapter(pm.ModelAdapter):

        @pm.auto_serialize(
            model=DillSerializer()
        )
        def __init__(self, model):
            self.model = model

        @classmethod
        def api(cls):
            columns = [
                ('median_income', float),
                ('housing_median_age', float),
                ('total_rooms', float)
            ]
            return {'df_in': pm.Pandas(columns=columns)}, \
                   {'df_out': pm.Pandas(columns=columns + [('prediction', float)])}

        def predict(self, df_in):
            df_in['prediction'] = self.model.predict(df_in[['median_income', 'housing_median_age', 'total_rooms']])
            return df_in

For more information about model adapter logic, refer to author a model adapter.
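The adapter's api method declares three required input columns. Before running inference, it can be useful to confirm that a dataset contains them. The following is a minimal sketch in plain Python; REQUIRED_COLUMNS copies the column names from the adapter above, and missing_columns is a hypothetical helper rather than a palantir_models function. Whether the platform performs its own validation is not covered here.

```python
# Column names copied from the adapter's api() declaration.
REQUIRED_COLUMNS = ["median_income", "housing_median_age", "total_rooms"]


def missing_columns(available, required=REQUIRED_COLUMNS):
    """Return the required column names absent from `available`, in declared order."""
    present = set(available)
    return [name for name in required if name not in present]
```

For a pandas DataFrame named df, missing_columns(df.columns) returns an empty list when all required features are present.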

Step 3: Write the model training logic

In the example below, the train_model function contains an unchanged example of the training logic from foundry_ml. The compute function wraps the model with the model adapter, and publishes the model.

    from transforms.api import transform, Input
    from palantir_models.transforms import ModelOutput
    from main.model_adapters.adapter import SklearnRegressionAdapter

    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler


    @transform(
        training_data_input=Input("<YOUR_PROJECT_PATH>/data/housing_train_data"),
        model_output=ModelOutput("<YOUR_PROJECT_PATH>/models/linear_regression_model"),
    )
    def compute(training_data_input, model_output):
        training_df = training_data_input.pandas()

        # training logic
        model = train_model(training_df)

        # wrap model with model adapter
        foundry_model = SklearnRegressionAdapter(model)

        # publish the model
        model_output.publish(model_adapter=foundry_model)


    def train_model(training_df):
        '''
        Training logic is unchanged from the original foundry_ml example.
        '''
        numeric_features = ['median_income', 'housing_median_age', 'total_rooms']

        pipeline = Pipeline([
            ("scaler", StandardScaler()),
            ("regressor", LinearRegression())])

        X_train = training_df[numeric_features]
        y_train = training_df['median_house_value']
        pipeline.fit(X_train, y_train)

        return pipeline

Open the model_training/model_training.py file in your repository and author the compute function for your model. Copy your model's machine learning logic into the train_model function. Update paths to correctly point to the training dataset and model. Select Build at the top right to run the code.
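One difference from the foundry_ml example is that the fitted pipeline is now wrapped as a single object instead of being split into a scaler stage and a regressor stage. The following sketch, assuming numpy and scikit-learn are installed locally, checks that calling predict on the whole pipeline gives the same result as applying the two steps by hand. The data is synthetic, and staged_and_pipeline_predictions is a hypothetical helper name.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def staged_and_pipeline_predictions(X, y):
    """Fit the pipeline, then predict via the pipeline and via its two steps applied by hand."""
    pipeline = Pipeline([
        ("scaler", StandardScaler()),
        ("regressor", LinearRegression())])
    pipeline.fit(X, y)

    # What the adapter's predict method calls: one predict on the whole pipeline.
    pipeline_pred = pipeline.predict(X)

    # The two steps applied separately: scale the features, then regress.
    scaled = pipeline["scaler"].transform(X)
    staged_pred = pipeline["regressor"].predict(scaled)
    return pipeline_pred, staged_pred


# Synthetic stand-in for the three housing features; the values are made up.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0

pipeline_pred, staged_pred = staged_and_pipeline_predictions(X, y)
print(np.allclose(pipeline_pred, staged_pred))
```

If the two prediction arrays match, wrapping the whole pipeline in the adapter preserves the behavior of the scaler and regressor applied in sequence.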

Step 4: Write the model inference logic

Open the model_training/run_inference.py file in your repository and author the model inference logic. Update paths to correctly point to the model and test dataset. Select Build at the top right to run the code.

    from transforms.api import transform, Input, Output
    from palantir_models.transforms import ModelInput


    @transform(
        testing_data_input=Input("<YOUR_PROJECT_PATH>/data/housing_test_data"),
        model_input=ModelInput("<YOUR_PROJECT_PATH>/models/linear_regression_model_asset"),
        predictions_output=Output("<YOUR_PROJECT_PATH>/data/housing_testing_data_inferences")
    )
    def compute(testing_data_input, model_input, predictions_output):
        inference_outputs = model_input.transform(testing_data_input)
        predictions_output.write_pandas(inference_outputs.df_out)

Following these steps should successfully migrate an existing foundry_ml model in Code Repositories to palantir_models.