Foundry's Iceberg catalog extends standard Iceberg so that each branch can independently track its own current schema, default partition spec, and default sort order. These operations are isolated to the branch they run against and do not affect the table's main (master) branch or any other branch. This means Iceberg tables match the behavior Foundry users already expect from catalog datasets, where work on a branch is fully isolated, while still remaining compliant with the Iceberg REST catalog specification ↗.
This page explains how branch isolation works, how it extends default Iceberg behavior, and how it is automatically applied when using Iceberg inside Foundry. It also covers how to opt into this behavior when using external Iceberg clients.
The branch isolation described on this page applies when running jobs through Foundry's build system. When writing to Foundry's Iceberg catalog from an external client, standard Iceberg branching behavior applies by default. Alternatively, you can opt into Foundry's branch isolation. See Using branches with external clients for details.
When running a build in Foundry, Foundry provides branch isolation for Iceberg tables. This means Foundry automatically isolates schema, partition spec, and sort order changes to the associated branch, ensuring these branch-scoped changes do not unintentionally modify other branches. Users do not need to take any action to configure this; it is the default behavior for writing to Iceberg tables using Foundry's build system.
Note that the branch-scoped behavior described here for schemas also applies to partition spec and sort order, which (like schemas) can be evolved independently on a branch in Foundry. The branch-scoped metadata is stored as additional properties in your table's Iceberg metadata file. For example: "foundry.branch.feature-branch.schema-id" : "3".
In standard Iceberg, the schema is a table-level concept shared across all branches, where there is a single current schema tracked for the table across all branches. See Apache Iceberg documentation ↗ for details. This means that a modification to the table's schema on any branch will change the schema for main and every other branch.
Having a single schema across all branches makes it difficult to develop or validate schema changes in isolation on a branch before merging into production. To test schema changes on a branch, you would need to modify your table's global schema. Any such change, regardless of which branch it originates from, would immediately affect all consumers of the table on main.
Foundry's Iceberg catalog solves this by automatically tracking a branch-scoped schema, partition spec, and sort order. A job running on a branch (for example, my-branch) sees the schema that that branch is tracking, not the schema on main. Schema changes made within that job on my-branch are stored for the associated branch (in this case, my-branch) and leave the current schema on main entirely unchanged.
The following table summarizes what is branch-scoped versus table-scoped and shared across all branches:
| Metadata | Foundry-extended branching | Default Iceberg branching |
|---|---|---|
| Schema history | Table-scoped | Table-scoped |
| Table properties | Table-scoped | Table-scoped |
| Current snapshot | Branch-scoped | Branch-scoped |
| Current schema | Branch-scoped | Table-scoped |
| Default partition spec | Branch-scoped | Table-scoped |
| Default sort order | Branch-scoped | Table-scoped |
Note that in Foundry-extended branching while the current schema is branch-scoped, the history of schemas is table-scoped. This means that when a branch adds a new column, that schema is added to the table's shared schema list, but is not set as the current schema on main.
When a job runs on a branch in Foundry's build system, Foundry automatically injects the branch context into all catalog operations for that job. You do not need to configure anything as a user. Your pipelines read inputs and write outputs against the branch without any additional code.
The examples below illustrate the two most common scenarios: creating a new table on a branch, and evolving the schema of an existing table on a branch.
Suppose you are building a new table on a development branch feature/customer-scores. Your transform creates an output Iceberg table and writes initial data to it.
Copied!1 2 3 4 5 6 7 8 9 10 11 12 13 14import pyarrow as pa from transforms.api import transform from transforms.tables import TableOutput, TableLightweightOutput @transform.using(customer_scores=TableOutput("/<path>/customer_scores")) def compute(customer_scores: TableLightweightOutput): data = pa.table( { "customer_id": pa.array([1, 2, 3], type=pa.int64()), "score": pa.array([0.91, 0.74, 0.88], type=pa.float64()), } ) customer_scores.iceberg().write(data)
Here is the state of the customer_scores table after writing to it on the feature branch:
| Foundry-extended branching | Default Iceberg branching | |
|---|---|---|
Schema on feature/customer-scores | New schema | New schema |
Schema on main | Empty | New schema |
Data on feature/customer-scores | Exists | Exists |
Data on main | None | None |
With Foundry's branch isolation: The branch owns the initial schema and snapshot. The table exists on main, but has an empty schema with no data.
With default Iceberg branches: The branch's schema appears on main.
Suppose you now have the table customer_scores already in production. You are testing some potential changes to the pipeline and want to add a new column and modify an existing column type, as part of a job running on branch feature/scoring-v2, without affecting what downstream production consumers see on main.
Copied!1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17import pyarrow as pa from transforms.api import transform from transforms.tables import TableOutput, TableLightweightOutput @transform.using( customer_scores=TableOutput("/<path>/customer_scores") ) def compute(customer_scores: TableLightweightOutput): data = pa.table( { "customer_id": pa.array(["c1", "c2", "c3"], type=pa.string()), "score": pa.array([0.91, 0.74, 0.88], type=pa.float64()), "score_v2": pa.array([0.95, 0.80, 0.92], type=pa.float64()), } ) customer_scores.iceberg().write(data)
Here is the state of the table after the job runs on feature/scoring-v2:
| Foundry-extended branching | Default Iceberg branching | |
|---|---|---|
Schema on feature/scoring-v2 | New schema | New schema |
Schema on main | Old schema | New schema |
Data on feature/scoring-v2 | Exists | Exists |
Data on main | Unchanged | Unchanged |
With Foundry's branch isolation: The new column and type change are isolated to the branch. main retains its original schema, so downstream consumers are unaffected.
With default Iceberg branches: The branch and main share a single current schema, so the write either fails schema validation or forces you to update main's schema, exposing the change to downstream consumers.
Foundry's Iceberg catalog is fully compliant with the Iceberg REST catalog specification ↗.
When accessing Foundry's Iceberg catalog from an external tool, the catalog's behavior depends on whether the X-PLTR-Branch header is set. Set the X-PLTR-Branch header when working with branches from external clients to ensure consistent Foundry branching behavior.
When the X-PLTR-Branch header is set:
X-PLTR-Branch header is used by default when no branch qualifier is specified (or if main is referenced). If a specific (non-main) branch is provided in the request, then that branch will be used rather than the header branch. For example, if the header is set to featurebranch1, then SELECT * FROM table and SELECT * FROM table.branch_main will reference featurebranch1, but SELECT * FROM table.branch_featurebranch2 will reference featurebranch2.When the X-PLTR-Branch header is not set, default Iceberg behavior applies:
SELECT * FROM table VERSION AS OF <snapshot_id>.main's schema (table-scoped). For example, SELECT * FROM table.branch_my_branch.main's schema (table-scoped); writes that do not conform fail rather than adapting to the branch.main, and are not isolated to the branch.X-PLTR-Branch header externallyTo set the branch for a PyIceberg client, add the header to your catalog properties:
Copied!1 2 3 4 5 6 7 8 9 10from pyiceberg.catalog import load_rest catalog = load_rest( "foundry", { "uri": "https://your.foundry/iceberg", "token": "eyJwb...", "header.X-PLTR-Branch": "my-branch" }, )
For Spark, pass the header as a catalog configuration property:
Copied!1 2 3 4 5 6 7spark = ( SparkSession.builder ... .config("spark.sql.catalog.foundry.header.X-PLTR-Branch", "my-branch") ... .getOrCreate() )
For more information on connecting external clients to Foundry's catalog, see Authenticating Iceberg clients and Example: Local Jupyter.
Note the following considerations when working with branch isolation.
When a job runs on a branch, that branch context applies uniformly to all tables the job reads from and writes to. It is not currently possible to configure different branches for different input or output tables within the same job. Foundry validates that all resolved input branches match the job's branch, and raises an error if they do not.
If a branch is tracking a particular schema or partition spec, that schema or spec cannot be dropped from the table even by operations running on main. This prevents branches from being silently broken by cleanup or maintenance operations running concurrently. The protection is lifted automatically when the branch is deleted.
Iceberg's native branch merge procedures, such as cherry-pick and fast-forward, operate on snapshot references only and are not aware of Foundry's branch-scoped schema properties. If you use these procedures to merge a branch into main, the branch's schema will not be carried over, and main will retain its existing schema. To ensure schema changes on a branch are reflected on main, merge your code and rebuild on main through Foundry's build system.
Some privileged REST endpoints cannot be called from Foundry branches, notably:
dropNamespaceupdateNamespacedropTablerenameTablecommitTransactioncreateViewdropViewrenameViewBranch isolation is available from transforms-tables library version 0.1211.0 and greater.