Virtual tables overview

Virtual tables allow you to query and write to tables in supported data platforms without storing the data in Foundry.

You can interact with virtual tables in Python transforms by using the transforms-tables library.

Prerequisites

To interact with virtual tables from a Python transform, you must:

  1. Upgrade your Python repository to the latest version.
  2. Install transforms-tables from the Libraries tab.

API overview

The Pythonic virtual tables API provides TableInput and TableOutput types to interact with virtual tables.

    from transforms.api import transform
    from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput

    @transform(
        source_table=TableInput("ri.tables.main.table.1234"),
        output_table=TableOutput("ri.tables.main.table.5678"),
    )
    def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
        ...  # normal transforms API

The tables referred to in a Python transform need not come from the same source, or even the same platform.

The above example assumes that the tables specified in the transform already exist within your Foundry environment. If they do not, you can configure the output virtual table to be created during checks, as with dataset outputs. This requires extra configuration on the TableOutput: the Data Connection source in which to create the table, and the table's identifier or location within that source.

    from transforms.api import transform
    from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput, SnowflakeTable

    @transform(
        source_table=TableInput("ri.tables.main.table.1234"),
        output_table=TableOutput(
            "/path/to/new/table",
            # Must specify the Data Connection source you want to create the table in and the table identifier/location
            "ri.magritte..source.1234",
            SnowflakeTable("database", "schema", "table"),
        ),
    )
    def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
        ...  # normal transforms API

Once the virtual table has been created, you can remove the extra source and table metadata configuration from the TableOutput to keep the declaration concise. The source and location of an existing virtual table cannot be changed; modifying either one will cause checks to fail.
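
The rule above can be pictured with a small, purely illustrative check. This is a sketch of the rule itself, not of how Foundry checks are implemented: `OutputDeclaration` and `validate_redeclaration` are hypothetical names invented for this example and are not part of transforms-tables.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OutputDeclaration:
    """Hypothetical model of what a TableOutput declares (not a library class)."""

    path: str
    source_rid: Optional[str] = None            # may be omitted once the table exists
    location: Optional[Tuple[str, ...]] = None  # e.g. (database, schema, table)


def validate_redeclaration(created: OutputDeclaration, current: OutputDeclaration) -> None:
    """Raise ValueError if `current` tries to change the source or location."""
    # Omitting the extra configuration after creation is allowed.
    if current.source_rid is not None and current.source_rid != created.source_rid:
        raise ValueError("source cannot be changed after the table is created")
    if current.location is not None and current.location != created.location:
        raise ValueError("location cannot be changed after the table is created")


# Declaration used when the table was first created during checks.
created = OutputDeclaration(
    "/path/to/new/table",
    "ri.magritte..source.1234",
    ("database", "schema", "table"),
)

# Later, the extra configuration is dropped: accepted.
concise = OutputDeclaration("/path/to/new/table")
validate_redeclaration(created, concise)

# Pointing the same output at a different table: rejected.
moved = OutputDeclaration(
    "/path/to/new/table",
    "ri.magritte..source.1234",
    ("database", "schema", "other_table"),
)
try:
    validate_redeclaration(created, moved)
except ValueError as error:
    print(f"check failed: {error}")
```

In this model, the concise declaration passes because it leaves the source and location unspecified, while the third declaration fails because it names a different table in the same source.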

The available Table subclasses are:

  • BigQueryTable(project: str, dataset: str, table: str)
  • DeltaTable(path: str)
  • FilesTable(path: str, format: FileFormat)
  • IcebergTable(table: str, warehouse_path: str)
  • SnowflakeTable(database: str, schema: str, table: str)

You must use the appropriate class based on the type of source you are connecting to.
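
As a rough illustration of matching the class to the source type, the sketch below uses local stand-in dataclasses that mirror the constructor signatures listed above; in a Foundry repository you would import the real classes from transforms.tables instead. The `FileFormat.PARQUET` member, the source-type strings, the example file path, and the `table_for_source` helper are assumptions made for this example and are not part of the library.

```python
from dataclasses import dataclass
from enum import Enum


class FileFormat(Enum):
    # Hypothetical member; the real FileFormat values are not listed here.
    PARQUET = "parquet"


# Stand-ins mirroring the signatures listed above (not the real classes).
@dataclass(frozen=True)
class BigQueryTable:
    project: str
    dataset: str
    table: str


@dataclass(frozen=True)
class DeltaTable:
    path: str


@dataclass(frozen=True)
class FilesTable:
    path: str
    format: FileFormat


@dataclass(frozen=True)
class IcebergTable:
    table: str
    warehouse_path: str


@dataclass(frozen=True)
class SnowflakeTable:
    database: str
    schema: str
    table: str


# Hypothetical source-type labels mapped to the matching table class.
TABLE_CLASS_BY_SOURCE = {
    "bigquery": BigQueryTable,
    "delta": DeltaTable,
    "files": FilesTable,
    "iceberg": IcebergTable,
    "snowflake": SnowflakeTable,
}


def table_for_source(source_type: str, **location):
    """Build the table identifier whose class matches the source type."""
    try:
        table_class = TABLE_CLASS_BY_SOURCE[source_type]
    except KeyError:
        raise ValueError(f"Unsupported source type: {source_type!r}") from None
    # A TypeError here means the location fields do not fit this table class.
    return table_class(**location)


snowflake = table_for_source(
    "snowflake", database="database", schema="schema", table="table"
)
files = table_for_source("files", path="/data/events", format=FileFormat.PARQUET)
```

Passing location fields that belong to a different platform (for example, `database` to a Delta source) fails at construction time in this sketch, which is the kind of mismatch the rule above is meant to prevent.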