Filter files
Supported in: Batch
Filters a dataset of files.
Transform categories: File
Declared arguments
Dataset - Dataset of files to process. Type: Files
File filter to apply - The filter expression to apply on the files. Type: Expression<Boolean>
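
The following is a conceptual sketch in Python of what a file filter with a Boolean expression does; it is not Pipeline Builder's API. The `FileEntry` class, the `filter_files` function, and the sample paths are hypothetical names introduced for illustration only, and the sketch assumes that files are kept when the filter evaluates to true.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical stand-in for one file in a dataset of files."""
    path: str
    size_bytes: int


def filter_files(
    files: Iterable[FileEntry],
    file_filter: Callable[[FileEntry], bool],
) -> List[FileEntry]:
    """Return the files for which the Boolean filter is True (assumed semantics)."""
    return [f for f in files if file_filter(f)]


# Hypothetical input: a small dataset of files.
dataset = [
    FileEntry("raw/orders.csv", 2048),
    FileEntry("raw/readme.txt", 120),
    FileEntry("raw/customers.csv", 4096),
]

# Boolean filter expression: keep only CSV files.
csv_only = filter_files(dataset, lambda f: f.path.endswith(".csv"))
```

In this sketch the `dataset` list plays the role of the Dataset argument and the lambda plays the role of the File filter to apply argument; the output is again a collection of files rather than rows.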