Virtual tables allow you to query and write to tables in supported data platforms without storing the data in Foundry.
You can interact with these tables in Python transforms using the `transforms-tables` library.
To interact with virtual tables from a Python transform, you must add the `transforms-tables` library to your repository from the Libraries tab.

The Pythonic virtual tables API provides `TableInput` and `TableOutput` types to interact with virtual tables.
```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput("ri.tables.main.table.5678"),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```
The tables referred to in a Python transform need not come from the same source, or even the same platform.
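For example, a single transform can combine inputs backed by different platforms. The following is a minimal sketch; the table RIDs and input names are placeholders, and the two inputs are assumed to be existing virtual tables from different sources:

```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


@transform(
    # Placeholder RIDs: these could point at virtual tables backed by
    # different sources or platforms (for example, Snowflake and BigQuery).
    orders=TableInput("ri.tables.main.table.1111"),
    customers=TableInput("ri.tables.main.table.2222"),
    output_table=TableOutput("ri.tables.main.table.3333"),
)
def compute(orders: TableTransformInput, customers: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```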
The above example relies on the tables specified in the transform to already exist within your Foundry environment. If this is not the case, you can configure the output virtual table to be created during checks, as with dataset outputs. This requires extra configuration to specify the source and location where the table should be stored.
```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput, SnowflakeTable


@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput(
        "/path/to/new/table",
        # Must specify the Data Connection source you want to create the table in
        # and the table identifier/location
        "ri.magritte..source.1234",
        SnowflakeTable("database", "schema", "table"),
    ),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```
Once the table has been created, the extra source and table metadata configuration can be removed from the `TableOutput` to be more concise. Once a virtual table has been created, it is not possible to change its source or location; modifying either will cause checks to fail.
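For example, after the table in the example above has been created, the output can be referenced without the source and table metadata (a minimal sketch based on that example):

```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput


@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    # The source RID and SnowflakeTable configuration are no longer needed
    # once the virtual table exists.
    output_table=TableOutput("/path/to/new/table"),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```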
The available `Table` subclasses are:
- `BigQueryTable(project: str, dataset: str, table: str)`
- `DeltaTable(path: str)`
- `FilesTable(path: str, format: FileFormat)`
- `IcebergTable(table: str, warehouse_path: str)`
- `SnowflakeTable(database: str, schema: str, table: str)`
You must use the appropriate class based on the type of source you are connecting to.
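For instance, creating a new virtual table backed by a BigQuery source follows the same pattern as the Snowflake example above. In this sketch the source RID, project, dataset, and table names are illustrative:

```python
from transforms.api import transform
from transforms.tables import TableInput, TableOutput, TableTransformInput, TableTransformOutput, BigQueryTable


@transform(
    source_table=TableInput("ri.tables.main.table.1234"),
    output_table=TableOutput(
        "/path/to/new/bigquery/table",
        "ri.magritte..source.abcd",  # illustrative Data Connection source RID
        BigQueryTable("my-project", "my_dataset", "my_table"),  # illustrative identifiers
    ),
)
def compute(source_table: TableTransformInput, output_table: TableTransformOutput):
    ...  # normal transforms API
```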