As discussed in the dataset overview, unstructured data in Foundry is stored as a collection of files in a dataset, just like tabular data.
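To make the "collection of files" model concrete, the following is a minimal sketch of the general pattern a pipeline on unstructured data follows: list the files, read each one, and emit one structured row per file. It deliberately does not use Foundry's transforms API. A local directory stands in for a dataset, and the names `summarize_files` and `demo` are hypothetical, chosen only for illustration.

```python
import tempfile
from pathlib import Path


def summarize_files(dataset_dir):
    """Return one row (dict) per file under dataset_dir, sorted by path.

    Assumption: dataset_dir is a plain local directory standing in for a
    dataset of raw files. This is not Foundry's file access API.
    """
    root = Path(dataset_dir)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Decode leniently so one malformed file does not stop the run.
        text = path.read_text(encoding="utf-8", errors="replace")
        rows.append({
            "path": path.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            "line_count": len(text.splitlines()),
        })
    return rows


def demo():
    """Build a tiny throwaway 'dataset' of two files and summarize it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "notes.txt").write_bytes(b"first line\nsecond line\n")
        (root / "logs").mkdir()
        (root / "logs" / "app.log").write_bytes(b"started\n")
        return summarize_files(root)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

The resulting list of rows is tabular, so it can be handled downstream like any other structured data. In an actual Foundry pipeline, the directory walk would be replaced by the file access mechanism described in the Python or Java transforms documentation.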
These are some features that work identically between pipelines on structured and unstructured data:
Some differences from pipelines on tabular data include:
To get started with pipelines on unstructured data, refer to the relevant parts of documentation for Python and Java transforms:
Once unstructured data has been cleaned and normalized, you can use Code Workbook to analyze unstructured datasets and train machine learning models in Python and R. Learn more about unstructured data access in Code Workbook.