Code examples: Foundry APIs

Local environment

Python

Add resource to project

How can I add a resource to a project using an API call?

This code uses the requests library to send an HTTP POST request to the Compass API, importing a reference to an artifact into a specified project. It includes error handling, logging, and optional proxy configuration.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry
import logging
import json

'''
Import a reference to an artifact into the specified project.
'''

# Headers
headers = {
    'Authorization': 'Bearer xxx',  # Replace 'xxx' with your bearer token
    'Content-Type': 'application/json',
}

# Host
host = 'subdomain.domain.extension:port'

# Proxies
proxyDict = {
    'https': 'protocol://subdomain.domain.extension:port'
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount('https://', adapter)

# Set the level of logging to be shown
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler("debug.log"),
        logging.StreamHandler()
    ]
)

###############
## VARIABLES ##
###############
RESOURCE_TO_ADD = "ri......"
PROJECT_TO_ADD_TO = "ri.compass.main.folder.xxxx-xxx-xxx-xxx-xxxx"

# Throw an error with details if the request fails in some way
try:
    print('Beginning script for ...')

    # Data
    source_data = {
        "requests": [
            {"resourceRid": RESOURCE_TO_ADD}
        ]
    }

    # Serialize the JSON
    data = json.dumps(source_data)

    response = http.post(f'https://{host}/compass/api/projects/imports/{PROJECT_TO_ADD_TO}/import',
                         data=data,
                         headers=headers,
                         # Uncomment if a proxy is required
                         # proxies=proxyDict
                         )

    print('Completed request')
    print('The result of the script is ...')
    print(response.text)
    print(response.status_code)

except requests.exceptions.RequestException as e:
    raise Exception(f"An error occurred in the request.\nException: {e}")
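If you need to import more than one resource, the payload's requests list can in principle carry several entries. The following is a minimal sketch under that assumption; the RIDs shown are hypothetical placeholders.

# Import several resources in one call by adding more entries to the
# "requests" list (sketch; assumes the endpoint accepts multiple entries,
# and the RIDs below are hypothetical placeholders).
source_data = {
    "requests": [
        {"resourceRid": "ri.foundry.main.dataset.xxxx-xxx-xxx-xxx-xxx1"},
        {"resourceRid": "ri.foundry.main.dataset.xxxx-xxx-xxx-xxx-xxx2"},
    ]
}
response = http.post(f'https://{host}/compass/api/projects/imports/{PROJECT_TO_ADD_TO}/import',
                     data=json.dumps(source_data),
                     headers=headers)
print(response.status_code, response.text)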
  • Date submitted: 2024-03-26
  • Tags: API, python, compass

Dataset row count

How can I calculate the row counts of many datasets in bulk?

This code uses the Foundry API to trigger row count computation for a list of dataset RIDs. It sends a POST request to the Foundry Stats API with the dataset RID and branch as parameters.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry
import json
import pprint

'''
Trigger row count computation on the set of provided dataset RIDs.
'''

# Base variables
base_url = "https://STACK_NAME.palantircloud.com"
branch = "master"

DATASETS_RIDS = [
    "ri.foundry.main.dataset.6d2cd3de-0052-xxxxx-c7ae2c4ab1d8"
]

headers = {
    'Authorization': 'Bearer eyg_PUT_YOUR_TOKEN_HERE_xxxx',
    'Content-Type': 'application/json'
}

# Proxies
proxyDict = {
    # "https": "https://proxyIfNeeded:port"
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount("https://", adapter)


def trigger_row_count(dataset_rid, branch):
    response = http.post(f'{base_url}/foundry-stats/api/stats/datasets/{dataset_rid}/branches/{branch}',
                         headers=headers,
                         proxies=proxyDict)
    curr_response = json.loads(response.text)
    pprint.pprint(curr_response)
    return curr_response


for curr_dataset_rid in DATASETS_RIDS:
    trigger_row_count(curr_dataset_rid, branch)
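If you want to keep the responses rather than only print them, you can collect one result per dataset RID. A minimal usage sketch reusing the function above; the exact shape of the response body depends on the Foundry Stats API, so inspect it before relying on specific fields.

# Collect one response per dataset RID for later inspection (sketch).
results = {rid: trigger_row_count(rid, branch) for rid in DATASETS_RIDS}

for rid, result in results.items():
    print(rid)
    pprint.pprint(result)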
  • Date submitted: 2024-03-26
  • Tags: export, python, metrics, metadata, local

Get superset of columns across datasets

How can I get the set of all columns across multiple datasets?

This code uses the requests library to fetch the schema of each dataset in a list of target datasets, then iterates through each schema's fields to build a dictionary of how often each column (name and type) appears across the datasets.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry
import json
import pprint

'''
Generate the superset of columns, with their frequency, from a set of datasets.
'''

headers = {
    'Authorization': 'Bearer eyg_PUT_YOUR_TOKEN_HERE_xxxx',
    'Content-Type': 'application/json',
}

## STACK_NAME
base_url = "https://STACK_NAME.palantircloud.com"
branch = "master"

target_datasets = ["ri.foundry.main.dataset.4c2ac089-xxxx-4df863eaf823"]

# Proxies
proxyDict = {
    # "https": "https://proxyIfNeeded:port"
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount("https://", adapter)

global_list_fields = {}

for curr_dataset in target_datasets:
    # Get the schema of the dataset
    print("Step 1. Get schema of dataset")
    response = http.get(f'{base_url}/foundry-metadata/api/schemas/datasets/{curr_dataset}/branches/{branch}',
                        headers=headers,
                        proxies=proxyDict)
    print("Step 1. Response of getting schema of dataset")
    raw_response = response.text
    print(raw_response)

    curr_schema = json.loads(raw_response)
    list_fields = curr_schema["schema"]["fieldSchemaList"]

    for field in list_fields:
        curr_key = f"{field['name']} - {field['type']}"
        # Increment counter
        global_list_fields[curr_key] = global_list_fields.get(curr_key, 0) + 1

print("Unsorted dict")
pprint.pprint(global_list_fields)

# Sort by frequency
sorted_dict = {k: v for k, v in sorted(global_list_fields.items(), key=lambda item: item[1])}
print("Sorted dict")
pprint.pprint(sorted_dict)
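To share the frequency table outside the terminal, you can write it to a CSV file with Python's standard csv module. A minimal sketch, assuming sorted_dict has been built as above; the output filename is an arbitrary example.

import csv

# Write the column/frequency pairs to a CSV file (sketch).
with open('column_superset.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['column (name - type)', 'frequency'])
    for column, frequency in sorted_dict.items():
        writer.writerow([column, frequency])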
  • Date submitted: 2024-03-26
  • Tags: python, API, metadata, code repositories, code authoring, local

OSS direct call

How do I perform aggregations on object sets using the Object Set Service (OSS)?

This code demonstrates how to make direct calls to the Object Set Service (OSS) to perform aggregations on object sets. It is not intended for production use, but is useful for debugging or understanding the data returned by OSS, which is used under the hood by other services such as the OSDK and Foundry frontends.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry
import logging
import json
import pprint

'''
Direct calls to the Object Set Service (OSS) to perform aggregations, etc.
This is not intended for "actual use" in production, but can be useful for debugging
or going one layer deeper, for instance to understand what is actually returned by OSS.
OSS is used under the hood by other services, like the OSDK and frontends.
'''

# Headers
headers = {
    'Authorization': 'Bearer xxx',  # Replace 'xxx' with your bearer token
    'Content-Type': 'application/json',
}

# Host
host = 'subdomain.domain.extension:port'

# Proxies
proxyDict = {
    'https': 'protocol://subdomain.domain.extension:port'
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount('https://', adapter)

# Set the level of logging to be shown
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler("debug.log"),
        logging.StreamHandler()
    ]
)

###############
## VARIABLES ##
###############

# Throw an error with details if the request fails in some way
try:
    print('Beginning script for ...')

    # Data - Example payload to OSS
    source_data = {
        "executionMode": "PREFER_ACCURACY",  # OSS defaults to "PREFER_SPEED", which can return inaccurate results
        "objectSet": {
            "base": {
                "objectTypeId": "af-20m-instances-obv2"  # The object type id
            },
            "type": "base"
        },
        "aggregation": {
            "metrics": {},
            "subAggregations": {
                "test": {
                    "type": "metrics",
                    "metrics": {
                        "dimension": {
                            "type": "propertyValue",
                            "propertyValue": {
                                "propertyId": "example_bucket",  # The property of the object to aggregate by
                                "bucketing": {
                                    "type": "exactValue",
                                    "exactValue": {
                                        "maxBuckets": 10  # The number of buckets the response should contain
                                    }
                                }
                            }
                        },
                        "ordering": [
                            {
                                "type": "valueOrdering",
                                "valueOrdering": {
                                    "direction": "DESCENDING",
                                    "metricName": "countM"
                                }
                            }
                        ],
                        "metrics": {
                            "countM": {
                                "type": "count",
                                "count": {}
                            }
                        }
                    }
                }
            }
        }
    }

    # Serialize the JSON
    data = json.dumps(source_data)

    response = http.put(f'https://{host}/object-set-service/api/aggregate',
                        data=data,
                        headers=headers,
                        # Uncomment if a proxy is required
                        # proxies=proxyDict
                        )

    print('Completed request')
    print('The result of the script is ...')
    pprint.pprint(response.json())

except requests.exceptions.RequestException as e:
    raise Exception(f"An error occurred in the request.\nException: {e}")
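Because these payloads and responses are mainly useful for debugging, it can help to dump the raw aggregation response to disk for later inspection. A minimal sketch, assuming the request above succeeded; the output filename is an arbitrary example.

# Save the raw aggregation response to a local file for inspection (sketch).
with open('oss_aggregate_response.json', 'w') as f:
    json.dump(response.json(), f, indent=2)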
  • Date submitted: 2024-03-26
  • Tags: ontology, aggregation, objects, python, API, local

Ping Foundry: No token required

How can I ping my Foundry instance or account without an authentication token?

This code demonstrates how to use the Python requests library to send an unauthenticated ping request to a Foundry stack, with optional proxy and retry settings.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry

headers = {
    # No token required for this endpoint: the Authorization header can be omitted entirely
    # 'Authorization': 'Bearer eyg_PUT_YOUR_TOKEN_HERE_xxxx',
    'Content-Type': 'application/json',
}

## STACK
base_url = "https://STACK_NAME.palantircloud.com"

# Proxies
proxyDict = {
    "https": "http://proxy.host.com:3333"
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount("https://", adapter)

print("Pinging ...")
response = http.get(f"{base_url}/compass/api/ping",
                    headers=headers,
                    proxies=proxyDict)
print("Ping response:")
print(response.text)
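As a simple health check on top of the ping, you can fail loudly on a non-2xx status and report the round-trip time that requests measured. A minimal sketch reusing the response above:

# Raise on a non-2xx status and print the round-trip time (sketch).
response.raise_for_status()
print(f"Stack reachable, round trip took {response.elapsed.total_seconds():.3f}s")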
  • Date submitted: 2024-03-26
  • Tags: API, python, compass, local

Get the path for a given resource RID

How can I find the path of a resource from its RID?

This code uses the requests library to send an HTTP GET request to the specified host with the given RID, and retrieves the path of the resource. It also handles retries and proxies.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry

'''
Script to return the path of a given Resource IDentifier (RID).
'''

# Headers
headers = {
    'Authorization': 'Bearer xxx',  # Replace 'xxx' with your bearer token
    'Content-Type': 'application/json',
}

# Host
host = 'host.com:443'

# Proxies
proxyDict = {
    'https': 'http://proxy.domain.com:3333'
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount('https://', adapter)

# Enter the RID of the resource you want the path of
RESOURCE_RID = ''

# Throw an error with details if the request fails
try:
    print(f'Fetching path for rid {RESOURCE_RID} ...')
    response = http.get(f'https://{host}/compass/api/resources/{RESOURCE_RID}/path-json',
                        headers=headers,
                        proxies=proxyDict)
    print('Completed request')
    print(f'The path is: {response.text}')
except requests.exceptions.RequestException as e:
    raise Exception(f"An error occurred in the request.\nFetching the path for resource {RESOURCE_RID} failed.\nException: {e}")
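To resolve several RIDs in one run, you can reuse the same session and loop over a list. A minimal sketch; the RIDs below are hypothetical placeholders.

# Resolve several RIDs with the same session (sketch; placeholder RIDs).
RESOURCE_RIDS = [
    'ri.compass.main.folder.xxxx-xxx-xxx-xxx-xxxx',
    'ri.foundry.main.dataset.xxxx-xxx-xxx-xxx-xxxx',
]

for rid in RESOURCE_RIDS:
    response = http.get(f'https://{host}/compass/api/resources/{rid}/path-json',
                        headers=headers,
                        proxies=proxyDict)
    print(f'{rid} -> {response.text}')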
  • Date submitted: 2024-03-26
  • Tags: api, python, metadata, local

Trigger action using API

How can I manually trigger an action on an object?

This code uses the requests library to send HTTP requests to the actions API, iterating over a list of IDs and triggering an action for each ID with custom parameters.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry
import pprint
import uuid
import json
import time

'''
Script that will trigger an action.
'''

headers = {
    'Authorization': 'Bearer eyg_PUT_YOUR_TOKEN_HERE_xxxx',
    'Content-Type': 'application/json'
}

# Name of the stack
STACK = "STACK_NAME.palantircloud.com"

# List of ids, could be any list of parameters you want to iterate on
list_ids = ["123", "456"]

# Proxies
proxyDict = {
    # "https": "https://proxyIfNeeded:port"
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount("https://", adapter)

# Iterate over the list of ids and trigger one action per ID.
for curr_id in list_ids:
    curr_uuid = "GENERATED-OBJECT-" + str(uuid.uuid4())  # To generate a uuid, if a parameter needs one
    curr_title = "GENERATED-" + str(time.time())         # To generate a timestamp-based title
    user_rid = "xxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx"         # If a user rid is needed
    try:
        # Generate the payload of the action. Look at the network tab from Slate/Workshop to obtain it - or build it from scratch.
        payload = r'{"actionTypeRid":"ri.actions.main.action-type.xxxxx-xxxx-xxxx-xxxxxxxx",' \
                  r'"parameters":{"ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxxx":{"timestamp":"2021-09-30T23:59:59+02:00","type":"timestamp"},' \
                  r'"ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxxx":{"stringList":{"strings":["mystring1","mystring2"]},"type":"stringList"},' \
                  r'"ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxxx":{"string":"my_string","type":"string"},' \
                  r'"ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxxx":{"timestamp":"2022-01-01T23:26:00+00:00","type":"timestamp"}}}'

        response = http.post(f'https://{STACK}/actions/api/actions',
                             headers=headers,
                             proxies=proxyDict,
                             data=payload)

        print(f"Raw response of action call with : {curr_id}\r\n")
        raw_response = response.json()
        pprint.pprint(raw_response, indent=4)
    except Exception as e:
        print(e)
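Rather than hand-writing the JSON string, you can build the same payload as a Python dictionary and serialize it with json.dumps, which avoids quoting and brace-matching mistakes. A sketch mirroring the payload above; the action type and parameter RIDs remain placeholders, and each parameter has its own distinct RID in practice.

# Build the same payload as a dictionary, then serialize it (sketch; placeholder RIDs).
payload_dict = {
    "actionTypeRid": "ri.actions.main.action-type.xxxxx-xxxx-xxxx-xxxxxxxx",
    "parameters": {
        "ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxx1": {
            "timestamp": "2021-09-30T23:59:59+02:00",
            "type": "timestamp"
        },
        "ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxx2": {
            "stringList": {"strings": ["mystring1", "mystring2"]},
            "type": "stringList"
        },
        "ri.actions.main.parameter.xxxxx-xxxx-xxxx-xxxxxxx3": {
            "string": "my_string",
            "type": "string"
        }
    }
}
payload = json.dumps(payload_dict)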
  • Date submitted: 2024-03-26
  • Tags: action, objects, ontology, actions on objects, python, api, local

Upload local file to dataset

What API can I use to upload a local file to a dataset in Foundry?

This code uses the Foundry API to upload a file to a specified dataset. It sets up headers, host, proxies, and retries for the request, and then reads the file and sends it as a POST request to the dataset's files endpoint.

from requests.adapters import HTTPAdapter
import requests
from urllib3 import Retry
import logging

# Headers
headers = {
    'Authorization': 'Bearer xxx',  # Replace 'xxx' with your bearer token
    'Content-Type': 'application/octet-stream',  ### IMPORTANT !
}

# Host
host = 'subdomain.domain.extension:port'

# Proxies
proxyDict = {
    'https': 'protocol://subdomain.domain.extension:port'
}

# Retries
retry = Retry(connect=1, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
http = requests.Session()
http.mount('https://', adapter)

# Set the level of logging to be shown
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.FileHandler("debug.log"),
        logging.StreamHandler()
    ]
)

###############
## VARIABLES ##
###############

# Dataset RID to upload to
datasetRid = "rid.123..."

params = {
    'filePath': 'folder_name/my-file.csv',
}

# Throw an error with details if the request fails in some way
try:
    print('Beginning script for one-call upload')

    # Data - read the file as raw bytes and send it unmodified
    with open('./data_example_file.csv', 'rb') as f:
        data = f.read()

    response = http.post(f'https://{host}/api/v1/datasets/{datasetRid}/files:upload',
                         params=params,
                         data=data,
                         headers=headers,
                         # Uncomment if a proxy is required
                         # proxies=proxyDict
                         )

    print('Completed request')
    print(f'The result of the script is {response.status_code} - {response.text}')

except requests.exceptions.RequestException as e:
    raise Exception(f"An error occurred in the request.\nException: {e}")
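To upload several local files into the same dataset, you can loop over file paths and set the filePath parameter per file. A minimal sketch; the local filenames below are hypothetical examples.

# Upload several local files to the same dataset (sketch; hypothetical filenames).
local_files = ['./data_example_file.csv', './another_file.csv']

for local_path in local_files:
    file_name = local_path.rsplit('/', 1)[-1]
    with open(local_path, 'rb') as f:
        file_bytes = f.read()
    upload_response = http.post(f'https://{host}/api/v1/datasets/{datasetRid}/files:upload',
                                params={'filePath': f'folder_name/{file_name}'},
                                data=file_bytes,
                                headers=headers)
    print(f'{local_path}: {upload_response.status_code}')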
  • Date submitted: 2024-04-04
  • Tags: API, file upload, dataset, python, csv, local