Connect Foundry to Snowflake to read and sync data between Snowflake and Foundry.
| Capability | Status |
| --- | --- |
| Exploration | 🟢 Generally available |
| Bulk import | 🟢 Generally available |
| Incremental | 🟢 Generally available |
| Virtual tables | 🟢 Generally available |
| Compute pushdown | 🟢 Generally available |
| Export tasks | 🟡 Sunset |
| Table exports | 🟡 Beta |
Learn more about setting up a connector in Foundry.
| Option | Required? | Description |
| --- | --- | --- |
| Account identifier | Yes | The identifier that precedes `.snowflakecomputing.com`. See Snowflake's official documentation ↗ for more details. |
| Roles | No | The default role used by the connection if the provided credentials have access to multiple roles. |
| Database | Yes | The default database to use once connected. |
| Schema | No | The default schema to use once connected. If not specified, all schemas in scope of the credentials will be available. |
| Warehouse | No* | The virtual warehouse to use once connected. For registered virtual tables, this warehouse is used for any source-side compute. |
| Credentials | Yes | Option 1: Username and password. Provide a username and password. We recommend using service credentials rather than individual user credentials. Option 2: Key pair authentication. Provide a username and private key. See Snowflake's official documentation ↗ for details on configuring key pair authentication. Note that only unencrypted private keys are supported; Foundry will encrypt and store the private key securely. Option 3: External OAuth (OIDC). Follow the displayed source system configuration instructions to set up External OAuth. See Snowflake's official documentation ↗ for details on External OAuth and our documentation for details on how OIDC works with Foundry. For all credential options, ensure that the provided user and role have usage privileges on the target database(s) and schema(s), as well as select privileges on the target table(s). When registering virtual tables, the user and role should also have usage privileges on the warehouse (see the example grants below). |
| Network Connectivity | Yes** | Run `SELECT SYSTEM$ALLOWLIST()` in Snowflake and ensure that at least the entries for `SNOWFLAKE_DEPLOYMENT` and `STAGE` are added as egress policies in Foundry. Refer to the networking section below for more details. |
* Warehouse details are optional for syncing Foundry datasets, but required for registering virtual tables.
** Network egress policies are required for direct connections, but not for agent-based connections.
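As a sketch of the privileges described above, the following Snowflake SQL grants the minimum access to a role used by the connection; the database, schema, warehouse, and role names are hypothetical placeholders.

```sql
-- Role, database, schema, and warehouse names are hypothetical placeholders.
GRANT USAGE ON DATABASE MY_DB TO ROLE FOUNDRY_ROLE;
GRANT USAGE ON SCHEMA MY_DB.MY_SCHEMA TO ROLE FOUNDRY_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA MY_DB.MY_SCHEMA TO ROLE FOUNDRY_ROLE;

-- Required when registering virtual tables, so source-side compute can run.
GRANT USAGE ON WAREHOUSE MY_WAREHOUSE TO ROLE FOUNDRY_ROLE;
```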
To enable direct connections between Snowflake and Foundry, the appropriate egress policies must be added when setting up the source in the Data Connection application.
To identify the hostnames and port numbers of your Snowflake account to be allowlisted, you can run the following command in your Snowflake console. Ensure that at least the entries for `SNOWFLAKE_DEPLOYMENT` and `STAGE` are added as egress policies in Foundry.
```sql
SELECT
    t.VALUE:type::VARCHAR AS type,
    t.VALUE:host::VARCHAR AS host,
    t.VALUE:port AS port
FROM TABLE(FLATTEN(input => PARSE_JSON(SYSTEM$ALLOWLIST()))) AS t;
```
See Snowflake's official documentation ↗ for additional information on identifying hostnames and port numbers to allowlist.
Connections from Foundry to Snowflake normally come from the default public gateway IPs for your environment. However, traffic within the same cloud provider (for example, AWS-AWS or Azure-Azure) may use different routing and require establishing a connection via PrivateLink. See below for the additional setup required per cloud provider, or contact your Palantir representative for additional guidance.
If your Snowflake instance is configured to route internal S3 stage traffic through a VPCE ↗, the Snowflake JDBC driver must be manually configured not to use the custom VPCE domain. Otherwise, the driver will be routed to the custom VPCE domain (which is inaccessible from Foundry's VPC) and connections to URLs of the format `<bucketname>.bucket.vpce-<vpceid>.s3.<region>.vpce.amazonaws.com` will fail.
You can configure this manually by adding a JDBC connection property in the Connection details of your instance, with a key of `S3_STAGE_VPCE_DNS_NAME` and an empty value field (the equivalent of setting it to `null`).
The S3 stage traffic will then be routed through the AWS S3 Gateway Endpoint (`<bucketname>.bucket.s3.<region>.vpce.amazonaws.com`), which maintains private connectivity so traffic is not routed over the public internet.
Review our PrivateLink egress documentation ↗ for more information.
For egress policies that depend on an S3 bucket in the same region as your Foundry instance, ensure you have completed the additional configuration steps detailed in our Amazon S3 bucket policy documentation for the affected bucket(s).
The Snowflake JDBC driver used for the Foundry Snowflake connector may attempt to connect directly to an underlying "internal stage" storage bucket when fetching data. For Snowflake hosted on Azure, because Azure-hosted Foundry enrollments route traffic over Azure service endpoints, network connectivity from Foundry to the underlying stage buckets must be explicitly allowlisted by following the instructions below.
You will need the following information about your Azure-hosted Snowflake warehouse to establish network connectivity to Foundry:

- Use the `SYSTEM$ALLOWLIST` command to get the full list of domains that may be required to successfully connect.
- For the Azure storage bucket returned from the `SYSTEM$ALLOWLIST` command, you will also need to retrieve the storage account identifier. This requires setting the `ENABLE_INTERNAL_STAGES_PRIVATELINK` ↗ parameter to `TRUE` for the account and then calling the `SYSTEM$GET_PRIVATELINK_CONFIG()` ↗ function, which returns a field called `privatelink-internal-stage` containing the Azure storage account resource identifier (see the sketch after this list).
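The following Snowflake SQL is a minimal sketch of how these values can be retrieved; it assumes a role with the privileges required to change account parameters (for example, ACCOUNTADMIN).

```sql
-- List the hostnames and ports that must be allowlisted for this account.
SELECT SYSTEM$ALLOWLIST();

-- Allow the internal stage to be reached over PrivateLink (account-level parameter).
USE ROLE ACCOUNTADMIN;
ALTER ACCOUNT SET ENABLE_INTERNAL_STAGES_PRIVATELINK = TRUE;

-- Returns a JSON object; the privatelink-internal-stage field contains the
-- Azure storage account resource identifier referenced below.
SELECT SYSTEM$GET_PRIVATELINK_CONFIG();
```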
A full Azure Storage account resource identifier will be in the following format:

```
/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Storage/storageAccounts/{storageAccountName}
```
More information on how to find an Azure Storage account resource ID directly in the Azure console can be found in the Azure documentation ↗. Restricting cross-account network traffic using VNET rules and the storage account identifier is in line with Microsoft's published best practices ↗, and should be used for all connections to Azure-hosted Snowflake warehouses from within Azure compute instances.
Now that you have gathered the required information about your Snowflake warehouse, you can create the policies needed to enable Foundry access to your Snowflake data.
1. Create a standard egress policy for the Azure storage internal stage, and attach it to your Snowflake source. Ensure you create egress policies for all domains returned by the `SYSTEM$ALLOWLIST` command, and not just the storage bucket domain.
2. Create an Azure storage policy, pasting in the storage account resource identifier.
3. Navigate back to your Snowflake source in Data Connection and confirm you can explore the source and run syncs.
This section provides additional details around using virtual tables with a Snowflake source. This section is not applicable when syncing to Foundry datasets.
The table below highlights the virtual table capabilities that are supported for Snowflake.
| Capability | Status |
| --- | --- |
| Bulk registration | 🟡 Beta |
| Automatic registration | 🟢 Generally available |
| Table inputs | 🟢 Generally available: tables, views, and materialized views in Code Repositories and Pipeline Builder |
| Table outputs | 🟢 Generally available: tables in Code Repositories and Pipeline Builder |
| Incremental pipelines | 🟢 Generally available: APPEND only [1] |
| Compute pushdown | 🟢 Generally available |
Consult the virtual tables documentation for details on the supported Foundry workflows where Snowflake tables can be used as inputs or outputs.
[1] To enable incremental support for pipelines backed by Snowflake virtual tables, ensure that Change Tracking ↗ and Time Travel ↗ are enabled for the appropriate retention period (an example is sketched after this note). This functionality relies on the CHANGES ↗ clause. The `current` and `added` read modes in Python Transforms are supported; these expose the relevant rows of the change feed based on the `METADATA$ACTION` column. The `METADATA$ACTION`, `METADATA$ISUPDATE`, and `METADATA$ROW_ID` columns will be made available in Python Transforms.
When using virtual tables, remember the following source configuration requirements:

- A warehouse must be specified on the connection; it is used for source-side compute on registered virtual tables.
- The connection's user and role must have usage privileges on that warehouse.

See the Connection Details section above for more details.
Foundry offers the ability to push down compute to Snowflake when using virtual tables. Virtual table inputs leverage the Snowflake Spark connector ↗, which has built-in support for predicate pushdown.
When using Snowflake virtual tables registered to the same source as inputs and outputs to a pipeline, it is possible to fully federate compute to Snowflake. This feature is currently available in Python transforms. See the Python documentation for details on how to push down compute to Snowflake.
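As a minimal sketch (not the definitive pushdown setup, which is covered in the linked Python documentation), the following transform reads a virtual-table-backed input and applies a filter that the Snowflake Spark connector can push down as a predicate. The dataset paths and column name are hypothetical.

```python
from transforms.api import transform_df, Input, Output


@transform_df(
    # Paths are hypothetical placeholders for a Snowflake virtual table input
    # and an output registered on the same source.
    Output("/Project/pipelines/recent_orders"),
    orders=Input("/Project/virtual_tables/orders"),
)
def recent_orders(orders):
    # Simple filters like this can be pushed down to Snowflake as predicates
    # by the Snowflake Spark connector.
    return orders.filter(orders.ORDER_STATUS == "OPEN")
```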
Note that columns of type `array` ↗, `object` ↗, and `variant` ↗ will be parsed by Foundry as type `string`. This is due to the source's variable typing. For example, the Snowflake array `[ 1, 2, 3 ]` would be interpreted by Foundry as the string `"[1,2,3]"`.
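If you need the original structure back in a pipeline, one option is to parse the string with PySpark's `from_json`; this is a sketch assuming a hypothetical column `tags` that holds a Snowflake array of integers.

```python
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType, IntegerType


def parse_tags(df):
    # The virtual table exposes the Snowflake array as a JSON string,
    # e.g. "[1,2,3]", so parse it back into an array<int> column.
    return df.withColumn("tags", F.from_json(F.col("tags"), ArrayType(IntegerType())))
```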
See Snowflake's official documentation ↗ for more details.