Function mode | Pipeline mode |
---|---|
Use your compute module to host logic as functions. Use these functions across Foundry in applications like Workshop or the Developer Console with the Ontology SDK. | Read Foundry inputs and write to Foundry outputs for streaming and realtime media use cases. The module is passed a job token with the access you specify. |
Power your Foundry applications using compute module functions. | Use the Foundry resource permissions system. |
Execute compute module functions from another function. | Get data provenance across Foundry in the Data Lineage application. |
No platform permissions: You will not be provided with access to use Ontology SDK or platform APIs.
Application permissions: Your application will use a service user for permissions rather than depending on user permissions.
Foundry job tokens will be attached to the compute module. Job tokens will be scoped to input and output resources and can be used to obtain data.
Functions mode allows you to use your compute module to host logic for use across Foundry, such as in Workshop applications or through the Ontology SDK. You can write your logic in any language, register it as functions, and execute it with function calls in the platform.
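The register-then-call idea described above can be pictured with a minimal sketch. This is not the compute modules SDK or Foundry's registration mechanism: the `FUNCTIONS` registry, the `register` decorator, and `dispatch` are hypothetical names used only to illustrate how named logic can be looked up and executed with a JSON payload.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-process registry. In Foundry, registration is handled by
# the platform and the compute modules SDK, not by a dict like this one.
FUNCTIONS: Dict[str, Callable[[Dict[str, Any]], Any]] = {}


def register(name: str) -> Callable:
    """Decorator that records a handler under the given function name."""
    def wrap(handler: Callable[[Dict[str, Any]], Any]) -> Callable:
        FUNCTIONS[name] = handler
        return handler
    return wrap


@register("add")
def add(payload: Dict[str, Any]) -> int:
    # Example logic: sum two numeric fields from the call payload.
    return payload["x"] + payload["y"]


def dispatch(name: str, raw_payload: str) -> str:
    """Look up a registered function, run it on a JSON payload, return JSON."""
    if name not in FUNCTIONS:
        raise KeyError(f"No function registered under {name!r}")
    result = FUNCTIONS[name](json.loads(raw_payload))
    return json.dumps(result)
```

Calling `dispatch("add", '{"x": 2, "y": 3}')` runs the `add` handler on the decoded payload and returns the result serialized as JSON.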
Functions mode can operate through two permission modes:
Application permissions may not be available on all enrollments.
Only users in your organization with permissions to Manage OAuth 2.0 clients can perform this step. Review the third-party applications documentation for more information.
Add a source with a network policy that enables access to your Foundry environment's URL.
Exchange the client ID and secret for an access token with the desired permissions.
In app.py, with the compute modules SDK:
from compute_modules.auth import oauth
access_token = oauth("yourenrollment.palantirfoundry.com", ["api:datasets-read"])
Without the compute modules SDK:
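import os

import requests

# Exchange the client ID and secret for an access token (client credentials grant).
token_response = requests.post(
    "https://yourenrollment.palantirfoundry.com/multipass/api/oauth2/token",
    data={
        "grant_type": "client_credentials",
        "client_id": os.environ["CLIENT_ID"],
        "client_secret": os.environ["CLIENT_SECRET"],
        "scope": ["api:datasets-read"],
    },
    headers={
        "Content-Type": "application/x-www-form-urlencoded",
    },
    # Verify TLS against the CA bundle whose path is stored in DEFAULT_CA_PATH.
    verify=os.environ["DEFAULT_CA_PATH"],
)
# Fail early on an HTTP error instead of on a missing "access_token" key.
token_response.raise_for_status()
access_token = token_response.json()["access_token"]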
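When more than one scope is requested, it can help to build the token request in one place. The helper below is a sketch, not part of any SDK: `build_token_request` is a hypothetical name, and it assumes the token endpoint accepts scopes as a single space-delimited string, which is how OAuth 2.0 (RFC 6749, section 3.3) defines the `scope` parameter.

```python
from typing import Dict, List, Tuple


def build_token_request(
    hostname: str, client_id: str, client_secret: str, scopes: List[str]
) -> Tuple[str, Dict[str, str]]:
    """Return the token URL and form fields for a client credentials grant."""
    url = f"https://{hostname}/multipass/api/oauth2/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumption: scopes are sent as one space-delimited string.
        "scope": " ".join(scopes),
    }
    return url, form
```

The returned `url` and `form` can be passed to `requests.post(url, data=form, ...)` with the same `headers` and `verify` arguments used in the snippet above.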
In app.py:
import os

import requests

DATASET_ID = "ri.foundry.main.dataset.7bc5a955-5de4-4c5f-9370-248c5517187b"

# access_token is the token obtained in the previous step.
dataset_response = requests.get(
    f"https://yourenrollment.palantirfoundry.com/api/v1/datasets/{DATASET_ID}",
    headers={
        "Authorization": f"Bearer {access_token}",
    },
    verify=os.environ["DEFAULT_CA_PATH"],
)
dataset_response.raise_for_status()
dataset_name = dataset_response.json()["name"]
print(f"Dataset name is {dataset_name}")
A compute module executed in pipeline mode is designed to facilitate computations for data pipeline workflows that require high data security and provenance control. Pipeline mode works by taking in Foundry inputs, executing user-specified computations, and subsequently producing outputs. The entire process strictly adheres to the protocols and workflows established by the Foundry build system.
Unlike function mode, where users interact with a compute module directly by sending queries, pipeline mode manages inputs, outputs, and their permissions through the Foundry build system. This ensures that all data involved in the computation is systematically tracked. By mandating that all inputs and outputs pass through the build system, the module maintains a high level of data integrity and traceability, which is crucial for Foundry data provenance control and security.
Due to provenance control requirements, pipeline mode compute modules are non-interactive, meaning users cannot send queries directly to the compute module. Because of this, the compute module only performs computations on inputs automatically provided by the build system once the compute module is running. The build system also manages the flow of information from a compute module's output. Interfaces are provided for interacting with inputs and outputs inside the container of a compute module running in pipeline mode.
To summarize, pipeline mode enforces data security and provenance control. Users should choose pipeline mode if the following is true:
Pipeline mode compute modules strictly conform to the provenance control and security model established by the Foundry build system. By default, the compute module does not have permission to interact with any Foundry resources. Users must explicitly add Foundry resources as inputs and outputs. Permissions will then be granted on these added resources.
In app.py:
import os
# BUILD2_TOKEN holds the path to a file containing the job token.
with open(os.environ['BUILD2_TOKEN']) as f:
    bearer_token = f.read()
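The token read above is later placed in an `Authorization` header, so stray whitespace in the file would corrupt the header value. The sketch below wraps the read in a small helper; `read_token_file` is a hypothetical name, and stripping whitespace is a defensive assumption rather than documented behavior of the token file.

```python
import os


def read_token_file(env_var: str) -> str:
    """Read a token from the file whose path is stored in env_var."""
    path = os.environ[env_var]
    with open(path, encoding="utf-8") as f:
        # Strip surrounding whitespace so a trailing newline does not end up
        # in the Authorization header.
        return f.read().strip()
```

With this helper, the read becomes `bearer_token = read_token_file("BUILD2_TOKEN")`.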
In app.py:
import json
import os

# RESOURCE_ALIAS_MAP holds the path to a JSON file that maps the identifiers
# from your configuration to resource information.
with open(os.environ['RESOURCE_ALIAS_MAP']) as f:
    resource_alias_map = json.load(f)

input_info = resource_alias_map['identifier you put in the config']
output_info = resource_alias_map['identifier you put in the config']

# Structure of resource info:
# {
#     'rid': rid,
#     'branch': branch (can be None)
# }
input_rid = input_info['rid']
input_branch = input_info['branch'] or "master"
output_rid = output_info['rid']
output_branch = output_info['branch'] or "master"
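The lookup-and-default pattern above repeats for every input and output, so it can be factored into one function. The following is a sketch under stated assumptions: `resolve_resource` is a hypothetical helper, and the aliases and RIDs in `sample_alias_map` are placeholders, not real resources.

```python
import json
from typing import Any, Dict, Tuple


def resolve_resource(
    alias_map: Dict[str, Dict[str, Any]], alias: str, default_branch: str = "master"
) -> Tuple[str, str]:
    """Return (rid, branch) for an alias, defaulting a missing branch."""
    if alias not in alias_map:
        known = ", ".join(sorted(alias_map)) or "<none>"
        raise KeyError(f"Unknown alias {alias!r}; configured aliases: {known}")
    info = alias_map[alias]
    # 'branch' may be None (or absent), in which case the default is used.
    return info["rid"], info.get("branch") or default_branch


# Placeholder alias map with the same shape as the structure described above.
sample_alias_map = json.loads(
    '{"my-input": {"rid": "ri.example.input", "branch": null},'
    ' "my-output": {"rid": "ri.example.output", "branch": "develop"}}'
)
input_rid, input_branch = resolve_resource(sample_alias_map, "my-input")
output_rid, output_branch = resolve_resource(sample_alias_map, "my-output")
```

Raising a `KeyError` that lists the configured aliases makes a mistyped identifier easier to diagnose than a bare dictionary lookup failure.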
In app.py:
import requests

FOUNDRY_URL = "yourenrollment.palantirfoundry.com"
# input_rid, input_branch, and bearer_token come from the previous steps.
url = f"https://{FOUNDRY_URL}/stream-proxy/api/streams/{input_rid}/branches/{input_branch}/records"
response = requests.get(url, headers={"Authorization": f"Bearer {bearer_token}"})
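The request above can be assembled by a small function so that the URL and headers are built the same way for every input. This is a sketch: `stream_records_request` is a hypothetical helper, the path is copied from the snippet above, and percent-encoding the RID and branch is an assumption made so that a branch name containing `/` does not change the path structure.

```python
from typing import Dict, Tuple
from urllib.parse import quote


def stream_records_request(
    foundry_url: str, rid: str, branch: str, token: str
) -> Tuple[str, Dict[str, str]]:
    """Return the records URL and auth headers for a stream input."""
    url = (
        f"https://{foundry_url}/stream-proxy/api/streams/"
        # Assumption: path segments are percent-encoded.
        f"{quote(rid, safe='')}/branches/{quote(branch, safe='')}/records"
    )
    headers = {"Authorization": f"Bearer {token}"}
    return url, headers
```

The result can be passed straight to `requests.get(url, headers=headers)`.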