Ontology SQL

Beta

Ontology SQL is in the beta phase of development and some features may not be available on your environment. Functionality may change during active development. Contact Palantir support to request access to Beta features.

Ontology SQL is Foundry's SQL query engine for ontology objects, purpose-built to bring familiar SQL semantics to your ontology data. Ontology SQL lets you run SQL queries directly against object types, many-to-many links, and interfaces. With Ontology SQL, you can query ontology objects using standard Spark SQL syntax, executing directly against object storage. Ontology SQL is built on Foundry's SQL infrastructure, providing a single SELECT statement interface that supports variables, standard SQL functions, and seamless integration with your existing ontology modeling.

Ontology SQL architecture

Ontology SQL provides a unified SQL experience for querying ontology objects, sharing the same SQL dialect and front-end experience as Foundry's other SQL offerings while operating with its own specialized backend.

  • Unified SQL dialect: Ontology SQL shares the same SQL parsing layer and ANSI-compliant Spark SQL dialect as Furnace, ensuring consistent syntax and behavior across Foundry's SQL ecosystem. This provides a familiar SQL experience, while the underlying query planning and optimization are tailored specifically for ontology data.
  • Ontology data providers: Ontology SQL introduces specialized providers for object types and many-to-many link tables. These providers translate ontology concepts such as object properties, primary keys, and link relationships into SQL-compatible table schemas, enabling seamless querying of ontology data using standard SQL syntax.
  • Direct object storage access: Queries execute directly against Foundry's object storage layer, the same underlying storage used by object sets. This direct access ensures that performance and cost characteristics remain consistent with other ontology operations, with no intermediate data movement required.
  • Compute engine abstraction: After planning, the object storage service (OSS) determines which compute engine should execute the query. The compute engine is responsible for running the query, reading data, and returning results, with automatic resource management and built-in scale constraints to ensure reliable performance.
  • Consistent API experience: Ontology SQL provides a similar API surface to Furnace, delivering a consistent developer experience and enabling future extensibility as the platform evolves.
  • Supported compute engines: Ontology SQL is built on Foundry's compute engine abstraction, allowing for future optimization and updates as the platform evolves. Additionally, to leverage larger and more complex queries, some queries in Ontology SQL can be executed in Spark ↗.

Getting started with Ontology SQL

Access Ontology SQL

You can interact with Ontology SQL in the object mode SQL console. Once in the console, you will be able to query your object types using standard SQL syntax. For a complete reference of available operations, refer to the SQL dialect documentation.

Write your first query

Ontology SQL queries reference object types using their resource identifiers (RIDs). The basic syntax follows this pattern:

Copied!
1 SELECT * FROM `ri.ontology.main.object-type.<object-type-rid>`;

For example, a query for all properties from an Employee object type would be written as follows:

Copied!
1 SELECT * FROM `ri.ontology.main.object-type.00000000-0000-0000-0000-000000000000`;

A query for many-to-many links can be written as shown below:

Copied!
1 SELECT * FROM `ri.ontology.main.relation.<relation-rid>`;

Understanding column names

All object properties are referenced in queries using their API names, not their display names. The columns available in your query correspond directly to the property API names defined in your ontology. You can find property API names in Ontology Manager.

For example, if an Employee object has properties with API names employeeId, firstName, and, department, you would write a query as follows:

Copied!
1 2 SELECT employeeId, firstName, department FROM ri.ontology.main.object-type.<employee-rid>;

One-to-one and one-to-many link types do not need to be defined in Ontology Manager to be used in Ontology SQL. You can use SQL to join objects however you wish. Performance will be identical whether or not the join uses a link defined in Ontology Manager.

Many-to-many links require join tables and must be defined in Ontology Manager. To use them in your SQL joins, you must select the link type in the SQL query.

When querying many-to-many link tables, column names follow a specific convention to avoid ambiguity.

Each foreign key column is named using the pattern below:

<objectTypeApiName>_<relationApiName>

Consider a many-to-many relationship between Person and Car objects.

The Ontology setup is as follows:

  • Object type: Person (API name: person)
    • Primary key: personId
    • Link API name on Person side: vehicles
  • Object type: Car (API name: car)
    • Primary key: carId
    • Link API name on Car side: drivers
  • Link type rid: ri.ontology.main.relation.0

Below is an example query to find all cars driven by a specific person:

Copied!
1 2 3 4 5 SELECT c.* FROM `ri.ontology.main.relation.0` AS linkTable INNER JOIN `car` AS c ON c.`carId` = linkTable.`person_vehicles` WHERE linkTable.`car_drivers` = 'person-123';

And an example query to find all people who drive a specific car:

Copied!
1 2 3 4 5 SELECT p.* FROM `ri.ontology.main.relation.0` AS linkTable INNER JOIN `person` AS p ON p.`personId` = linkTable.`car_drivers` WHERE linkTable.`person_vehicles` = 'car-456';
Best practice

Always use table aliases when working with many-to-many links. Without aliases, column names can become difficult to read and maintain. Compare the following examples:

With aliases (recommended):

Copied!
1 2 3 4 SELECT c.* FROM `ri.ontology.main.relation.0` AS linkTable INNER JOIN car AS c ON c.carId = linkTable.person_vehicles WHERE linkTable.car_drivers = 'person-123';

Without aliases (harder to read):
Copied!
1 2 3 4 SELECT car.* FROM `ri.ontology.main.relation.0` INNER JOIN car ON car.carId = `ri.ontology.main.relation.0`.person_vehicles WHERE `ri.ontology.main.relation.0`.car_drivers = 'person-123';

Using variables

Ontology SQL supports variables through the DECLARE statement:

Copied!
1 2 3 4 5 DECLARE @minSalary DOUBLE = 75000.0, @department STRING = 'Engineering'; SELECT employeeId, salary FROM ri.ontology.main.object-type.<employee-rid> WHERE salary > @minSalary AND department = @department;

Variable requirements

  • The DECLARE statement must be the first statement in your query, and can only be used once. Reference the example above for how to declare multiple variables.
  • For supported variable types, refer to the SQL dialect documentation.
  • Variable names can contain letters, digits, and underscores.
  • SET statements are not allowed.

Reserved keywords

Some keywords, such as user and result, are reserved. It is best practice to wrap column names in backticks to avoid conflicts as shown below:

Copied!
1 SELECT `user`, `result` FROM ri.ontology.main.object-type.<object-rid>;

Best practices for performance

Avoid functions in filter conditions

When applying functions or transformations to columns in WHERE clauses, or when using computed columns for filtering, the query engine cannot leverage indices. This severely limits query performance and the scale of computation.

Avoid the following patterns that prevent index usage:

Copied!
1 2 3 4 5 6 7 8 9 -- Filtering on a computed column SELECT employeeId, salary, bonus FROM employee WHERE salary + bonus > 100000; -- Filtering on a function result SELECT employeeId, hireDate FROM employee WHERE YEAR(hireDate) = 2023;

Prefer the following patterns to allow index usage:

Copied!
1 2 3 4 5 6 7 8 9 -- Filter on the original column directly SELECT employeeId, salary, bonus FROM employee WHERE salary > 100000 - bonus; -- Use date range comparison instead of extracting year SELECT employeeId, hireDate FROM employee WHERE hireDate >= '2023-01-01' AND hireDate < '2024-01-01';

The same principle applies to aggregations. Filtering over columns that were transformed by any part of the query (for example, colA + colB AS newColumn) means the engine cannot use indices to efficiently locate matching rows, resulting in slower performance.

Avoid selecting unnecessary large columns

Do not select large columns like vectors (embeddings) from queries when not needed. These columns can significantly increase query execution time and memory usage without providing value if they are not required for your analysis.

Limitations and constraints

Query structure limitations

  • SELECT queries only: Ontology SQL exclusively supports read operations. No data modification (INSERT, UPDATE, DELETE) or schema definition (CREATE, ALTER, DROP) operations are available.
  • Single SELECT statement: Only one SELECT statement is allowed per query, with the exception of the DECLARE statement for variables.
  • No mixing of objects and datasets: Queries must operate entirely on ontology objects or entirely on datasets. Mixing the two in a single query is not supported.

Data type constraints

  • Struct columns must be excluded from queries. Including a struct-type column will cause the query to fail.
  • Vector columns (embeddings) are technically supported, but are often excluded by front-end applications due to their size and limited utility in SQL contexts.
  • All other ontology column types are supported and automatically converted to SQL-compatible types.

Scale and performance constraints

  • Output limit: The Ontology SQL service truncates results at 10,000 rows maximum. Even if no limit is provided in the query, the result will be capped to 10,000 rows. OFFSET is supported as long as OFFSET + LIMIT is less than or equal to 10,000. For example, OFFSET 1000 LIMIT 1000 works, but OFFSET 9000 LIMIT 2000 does not. If interacting with Ontology SQL using applications, those respective applications may enforce stricter limits.

Resource constraints

  • Jobs may be queued for up to 6 seconds during OSS Spark resource saturation.
  • Query execution timeout is 20 seconds.
  • Exceeding these durations will result in a NotEnoughSparkResources error.

Platform constraints

  • Object storage v2 only: Ontology SQL is currently available exclusively on object storage v2.
  • No branch or scenario support: Queries execute against the main branch only. Support for branches and scenarios is planned for future releases.
  • Object set limitations:
    • Object sets can be used as query inputs, depending on the front-end application.
    • Object sets cannot be directly used as query outputs.

Supported ontology features

FeatureSupport status
Object typesSupported; object type must be defined in Ontology Manager
Many-to-many linksSupported; link type must be defined in Ontology Manager
InterfacesNot supported
One-to-one and one-to-many linksSupported; can be used directly in Ontology SQL as defining the link type in Ontology Manager is not required
BranchingNot supported
ScenariosNot supported
Write operationsNot supported