A class representing the files backing a Foundry dataset view.
Prefer using the static Dataset.get() factory method instead of calling the constructor directly.
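A minimal usage sketch of the factory method, based only on the methods documented below. The alias "my_dataset" is a placeholder, and this code runs only inside a Foundry environment where foundry.transforms is available:

```python
from foundry.transforms import Dataset

# Obtain a dataset handle via the factory method rather than the constructor.
ds = Dataset.get("my_dataset")

# Read the dataset contents as a pandas DataFrame (the default format).
df = ds.read_table(format="pandas")
```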
Create a new Dataset instance for the given alias.
Returns: a Dataset instance.
Return type: Dataset.
The alias of the dataset.
The Foundry field schema of the dataset.
Return type: FoundryFieldSchema.
The path on disk for the dataset files, to be used with write_table.
An object store path to a bucket that will be mapped into the output transaction.
Read a tabular Foundry dataset as a pandas DataFrame, Polars DataFrame, Arrow Table, or raw file path.
format: "arrow", "pandas", "dataframe" (alias for pandas; the default), "polars", "lazy-polars", or "path". When set to "path", a path pointing to the raw dataset files is returned.
mode: "current", "previous", or "added". Defaults to "current".
Defaults to False.
Return type: pyarrow.Table | pandas.DataFrame | polars.DataFrame | polars.LazyFrame | str.
When columns, row_limit, or filters applied via the where() method are set, the output format must be one of "arrow", "dataframe", "pandas", or "polars", and mode must be "current".
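A sketch of the documented format values for read_table; "my_dataset" is a placeholder alias and the code requires a Foundry environment:

```python
from foundry.transforms import Dataset

ds = Dataset.get("my_dataset")  # placeholder alias

table = ds.read_table(format="arrow")       # pyarrow.Table
df = ds.read_table(format="pandas")         # pandas.DataFrame (default)
pl_df = ds.read_table(format="polars")      # polars.DataFrame
lazy = ds.read_table(format="lazy-polars")  # polars.LazyFrame
raw = ds.read_table(format="path")          # str path to the raw dataset files
```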
Upload tabular data to a Foundry dataset. This uploads the data, infers a schema, and updates column description metadata.
Accepts a pandas DataFrame, Arrow Table, Polars DataFrame, DuckDB PyRelation, or a path (string or pathlib.Path) pointing to a raw dataset.
pandas.DataFrame, pyarrow.Table, polars.DataFrame, DuckDB PyRelation, or a path matching write_table_path.
Finalize a dataset after uploading raw Parquet files. This infers a Foundry schema from the uploaded Parquet files and updates column description metadata on the dataset.
You must call this method after one or more Parquet files have been uploaded to the output dataset so that a schema can be inferred. The method throws if it is called before a successful file upload.
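A sketch of uploading tabular data with write_table. The alias and the DataFrame contents are illustrative, and the code requires a Foundry environment:

```python
import pandas as pd
from foundry.transforms import Dataset

ds = Dataset.get("my_output_dataset")  # placeholder alias

# Any of the accepted input types works; here, a pandas DataFrame.
df = pd.DataFrame({"name": ["a", "b"], "age": [21, 34]})
ds.write_table(df)
```

Alternatively, raw files can be written under the path exposed by the dataset (see write_table_path above) and that path passed to write_table.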
Set the write mode of the dataset.
"replace", "modify", or "append". In modify mode, anything written is appended to the dataset and may also override existing files. In append mode, anything written is appended to the dataset and will not override existing files. In replace mode, anything written replaces the dataset.
The write mode cannot be changed after data has been written.
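A sketch of setting the write mode before writing. The method name set_mode is hypothetical (the exact name is not shown above), and the alias is a placeholder:

```python
import pandas as pd
from foundry.transforms import Dataset

ds = Dataset.get("my_output_dataset")  # placeholder alias

# set_mode is a hypothetical name for the "set the write mode" operation.
# It must be called before any data is written.
ds.set_mode("modify")
ds.write_table(pd.DataFrame({"age": [21]}))
```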
List files in a Foundry dataset.
Return type: FileCollection.
Upload a local file to a Foundry dataset.
Upload a local directory to a Foundry dataset. All files found recursively inside the directory will be uploaded.
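A sketch of the file operations described above. The method names files, upload_file, and upload_directory are hypothetical (the exact names are not shown above), and the alias and paths are placeholders:

```python
from foundry.transforms import Dataset

ds = Dataset.get("my_output_dataset")  # placeholder alias

# Hypothetical method names for the operations described above:
ds.files()                              # list files -> FileCollection
ds.upload_file("local/report.csv")      # upload a single local file
ds.upload_directory("local/exports/")   # upload a directory recursively
```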
Apply a row filter to the dataset. Returns the dataset so that calls can be chained. Filters are applied when read_table is called.
Filters are constructed from columns obtained via Column.get().
Return type: Dataset.
Supported operators on Column: ==, !=, >, >=, <, <=, .isnull(), .isin(values), .between(lower, upper). Combine filters with & (and), | (or), and ~ (not).
```python
from foundry.transforms import Dataset
from foundry.transforms import Column

ds = Dataset.get("my_dataset")
filtered = ds.where(Column.get("age") > 18)
result = filtered.read_table(format="pandas")
```
Select a subset of columns from the dataset. Returns the dataset so that calls can be chained.
Return type: Dataset.
Set the maximum number of rows to read. Returns the dataset so that calls can be chained.
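A sketch of chaining the operations above into a single read. The method names select and limit are hypothetical stand-ins for the column-subset and row-limit operations described above; where and Column.get are documented:

```python
from foundry.transforms import Dataset, Column

# select() and limit() are hypothetical method names; where() is documented.
df = (
    Dataset.get("my_dataset")          # placeholder alias
    .select("age", "name")             # column subset
    .where(Column.get("age") > 18)     # row filter
    .limit(100)                        # maximum number of rows
    .read_table(format="pandas")
)
```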
Return type: Dataset.
Abort all work on this dataset. Any data written before or after calling this method will be ignored.
Return type: Dataset.