The Compute Modules feature is in a beta state and may not be available on your enrollments.
You can also view this documentation in the platform within the Compute Modules application for an efficient developer experience.
To get started with compute modules, you can use your preferred developer environment. In a few minutes, you will be able to create and deploy a compute module and test it in Foundry.
In Foundry, choose a folder and select + New > Compute Module, then follow the steps in the dialog to start with an empty compute module-backed function or pipeline. Follow the documentation below for next steps depending on your execution mode. For a more seamless experience, select the Documentation tab within your compute module to follow along with in-platform guidance.
In the following sections, we will use the open-source Python library. If you prefer to create your own client or implement your compute module in a language not supported by the SDKs, review the documentation on how to implement a custom compute module client.
Prerequisites:
Create a file called Dockerfile in the directory. Copy and paste the following into the Dockerfile:

    # Change the platform based on your Foundry resource queue
    FROM --platform=linux/amd64 python:3.12
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY src .
    # USER is required to be non-root and numeric for running compute modules in Foundry
    USER 5000
    CMD ["python", "app.py"]
Create a file called requirements.txt. This file specifies dependencies for our Python application. Copy and paste the following into the file:

    foundry-compute-modules
Create a subdirectory called src. This is where we will store our Python application.
Inside the src directory, create a file called app.py.
Your directory should now look like the following:

    MyComputeModule
    ├── Dockerfile
    ├── requirements.txt
    └── src
        └── app.py
Inside app.py, copy and paste the following code:

    from compute_modules.annotations import function


    @function
    def add(context, event):
        return str(event['x'] + event['y'])


    @function
    def hello(context, event):
        return 'Hello' + event['name']
Learn how to add type inference and automatically register a compute module function with the function registry.
When working with compute module functions, your function will always receive two parameters: a context object and an event object.
Context object: A Python dict parameter containing metadata and credentials that your function may need, such as user tokens and source credentials. For example, if your function needs to call the OSDK to get an Ontology object, the context object includes the token that the user needs to access that Ontology object.
Event object: A Python dict parameter containing the data that your function will process. It includes all parameters passed to the function, such as x and y in the add function and name in the hello function.
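As a rough local illustration of the two parameters, the sketch below calls undecorated copies of add and hello with hand-built dictionaries. Leaving out the @function decorator is an assumption made so the snippet runs without the library installed, and sample_context is an empty placeholder because the real context keys are supplied by Foundry at runtime.

```python
# Undecorated copies of the functions above, so this runs without the library.
def add(context, event):
    return str(event['x'] + event['y'])


def hello(context, event):
    return 'Hello' + event['name']


# Placeholder context: the real keys are provided by Foundry at runtime.
sample_context = {}

# Each event dict carries the parameters passed to the function.
add_result = add(sample_context, {'x': 2, 'y': 3})
hello_result = hello(sample_context, {'name': 'World'})
print(add_result)
print(hello_result)
```

Note that hello concatenates without a separator, so add a space to the string literal if you want one in the greeting.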
If you use static typing for the event/return object, the library will convert the payload/result into that statically-typed object. Review documentation on automatic function schema inference for more information.
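To picture what converting a payload into a statically typed object means, here is a minimal sketch that uses a plain dataclass. AddInput and to_typed_event are hypothetical names introduced for illustration only; to_typed_event merely stands in for the conversion step, and the library's own conversion logic may differ.

```python
from dataclasses import dataclass


@dataclass
class AddInput:  # hypothetical typed event for the add function
    x: int
    y: int


def to_typed_event(payload: dict) -> AddInput:
    # Hypothetical stand-in for the payload-to-object conversion step.
    return AddInput(**payload)


def add(context, event: AddInput) -> int:
    # Fields are accessed as attributes instead of dict keys.
    return event.x + event.y


typed_result = add({}, to_typed_event({'x': 2, 'y': 3}))
print(typed_result)
```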
The function result is sent as a JSON blob, so make sure that the value your function returns can be serialized to JSON.
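One way to check serializability before deploying is to pass a sample result through json.dumps locally. The sketch below is an illustration under assumptions: build_result and its fields are hypothetical, and the fixed timestamp only demonstrates converting a non-JSON type to a string before returning it.

```python
import json
from datetime import datetime, timezone


def build_result(context, event):
    # datetime objects are not JSON serializable, so convert them to strings.
    processed_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
    return {
        'total': event['x'] + event['y'],
        'processed_at': processed_at.isoformat(),
    }


result = build_result({}, {'x': 2, 'y': 3})
# json.dumps raises TypeError if the result cannot be serialized.
encoded = json.dumps(result)
print(encoded)
```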
Now, you can publish your code to Foundry using an Artifact repository, which will be used to store your Docker images.
Compute modules can operate as a connector between inputs and outputs of a data pipeline in a containerized environment. In this example, you will build a simple use case with streaming datasets as inputs and outputs to the compute module, define a function that doubles the input data, and write it to the output dataset. You will use notional data to simulate a working data pipeline.
Create a file called Dockerfile in the directory. Copy and paste the following into the Dockerfile:

    # Change the platform based on your Foundry resource queue
    FROM --platform=linux/amd64 python:3.12
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY src .
    # USER is required to be non-root and numeric for running compute modules in Foundry
    USER 5000
    CMD ["python", "app.py"]
Create a file called requirements.txt. Store the dependencies for your Python application in this file. For example:

    requests == 2.31.0
Create a subdirectory called src. This is where you will put your Python application.
Inside the src directory, create a file called app.py.
Your directory should now look like the following:

    MyComputeModule
    ├── Dockerfile
    ├── requirements.txt
    └── src
        └── app.py
In app.py, add the following imports:

    import os
    import json
    import time
    import requests
In app.py, get the bearer token for input and output access:

    with open(os.environ['BUILD2_TOKEN']) as f:
        bearer_token = f.read()
In app.py, get input and output information:

    with open(os.environ['RESOURCE_ALIAS_MAP']) as f:
        resource_alias_map = json.load(f)

    input_info = resource_alias_map['identifier you put in the config']
    output_info = resource_alias_map['identifier you put in the config']

    input_rid = input_info['rid']
    input_branch = input_info['branch'] or "master"
    output_rid = output_info['rid']
    output_branch = output_info['branch'] or "master"
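The lookup above can be tried locally against a sample file. The following is a sketch under assumptions: the identifiers my-input and my-output, the RIDs, and the resolve helper are all hypothetical, and the file layout is inferred from the rid and branch lookups in the snippet above. It overwrites RESOURCE_ALIAS_MAP in the current process, so run it only outside of Foundry.

```python
import json
import os
import tempfile

# Hypothetical alias map shaped like the lookups above; the RIDs are made up.
sample_map = {
    'my-input': {'rid': 'ri.example.input', 'branch': None},
    'my-output': {'rid': 'ri.example.output', 'branch': 'develop'},
}

# Write the sample to a temporary file and point the environment variable at it.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as tmp:
    json.dump(sample_map, tmp)
os.environ['RESOURCE_ALIAS_MAP'] = tmp.name


def resolve(alias_map, identifier):
    """Return (rid, branch) for an identifier, defaulting the branch to master."""
    info = alias_map[identifier]
    return info['rid'], info.get('branch') or 'master'


with open(os.environ['RESOURCE_ALIAS_MAP']) as f:
    resource_alias_map = json.load(f)

input_rid, input_branch = resolve(resource_alias_map, 'my-input')
output_rid, output_branch = resolve(resource_alias_map, 'my-output')
os.remove(tmp.name)  # clean up the temporary file
print(input_rid, input_branch, output_rid, output_branch)
```

Using info.get('branch') instead of info['branch'] also covers the case where the branch key is missing entirely, not only the case where it is empty.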
In app.py, interact with inputs and outputs and perform computations. For example:

    FOUNDRY_URL = "yourenrollment.palantirfoundry.com"


    def get_stream_latest_records():
        url = f"https://{FOUNDRY_URL}/stream-proxy/api/streams/{input_rid}/branches/{input_branch}/records"
        response = requests.get(url, headers={"Authorization": f"Bearer {bearer_token}"})
        return response.json()


    def process_record(record):
        # Assume input stream has schema 'x': Integer
        x = record['value']['x']
        # Assume output stream has schema 'twice_x': Integer
        return {'twice_x': x * 2}


    def put_record_to_stream(record):
        url = f"https://{FOUNDRY_URL}/stream-proxy/api/streams/{output_rid}/branches/{output_branch}/jsonRecord"
        requests.post(url, json=record, headers={"Authorization": f"Bearer {bearer_token}"})
In app.py, run your code as an autonomous task. For example:

    while True:
        records = get_stream_latest_records()
        processed_records = [process_record(record) for record in records]
        # Write each processed record to the output stream
        for record in processed_records:
            put_record_to_stream(record)
        time.sleep(60)
You can now view the results streamed live in the output dataset.
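Before deploying, the transformation step can be checked locally without any network access. The sketch below repeats process_record from the example and runs it over a hypothetical batch; the shape of sample_records is an assumption based on the record['value']['x'] lookup.

```python
def process_record(record):
    # Same transformation as in the example: input 'x', output 'twice_x'.
    x = record['value']['x']
    return {'twice_x': x * 2}


# Hypothetical batch shaped like the records process_record expects.
sample_records = [{'value': {'x': 1}}, {'value': {'x': 21}}]
processed = [process_record(record) for record in sample_records]
print(processed)
```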
To interact with inputs and outputs, we provide a bearer token and input/output information.
You can then write code to interact with the inputs and outputs and perform computations. The code snippets provide a simple example of pipelining two stream datasets through the stream-proxy service.
Now, you can publish your code to Foundry using an Artifact repository, which will be used to store your Docker images.