Before getting started with your data transformation, it’s important to consider the benefits as well as the limitations of each language. This table includes a summary of the key differences between the supported languages:
Description | SQL | Python | Java |
---|---|---|---|
Non-proprietary language: documentation available online | ✓ | ✓ | ✓ |
Support for file access: read and write files in Foundry datasets—this means your data transformation can operate on unstructured data | | ✓ | ✓ |
Transform Level Logic Versioning (TLLV): more info in the TLLV section | | ✓ | ✓ |
Incremental computation: more info in the incremental computation section | | ✓ | ✓ |
Support for removing inherited markings | ✓ | ✓ | ✓ |
Multiple output datasets allowed per file | | ✓ | ✓ |
Support for dataset previews | ✓ | ✓ | ✓ |
Custom Transforms profiles | ✓ | ✓ | ✓ |
SQL is a language with plenty of external documentation available online. Learn more about SQL Transforms.
Python is a language with plenty of external documentation available online. You may want to write data transformations in Python to take advantage of its language-specific capabilities and libraries. The Python Transforms API is lower-level than SQL. The `transforms` Python library is an API that exposes functionality such as file reads and writes. File-based data transformations can be useful early in a pipeline, when you want to parse and clean raw data. Learn more about Python Transforms.
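As a rough illustration of the file-based pattern, the sketch below parses raw CSV lines into cleaned, structured rows. In a real Python transform the file handle would be opened from an input dataset via the `transforms` library; here an in-memory buffer stands in for that file so the sketch is self-contained, and the function name `parse_raw_csv` is purely illustrative:

```python
import csv
import io

def parse_raw_csv(raw_file) -> list:
    """Parse an open file handle of raw CSV into cleaned dict rows.

    In a Python transform, `raw_file` would come from an input dataset's
    files; here any file-like object works.
    """
    reader = csv.DictReader(raw_file)
    rows = []
    for row in reader:
        # Basic cleaning: strip whitespace from headers and values,
        # and drop records whose fields are all empty.
        cleaned = {k.strip(): v.strip() for k, v in row.items() if v is not None}
        if any(cleaned.values()):
            rows.append(cleaned)
    return rows

# Stand-in for a raw file stored in a dataset.
raw = io.StringIO("name, city\nAda, London\n ,  \nGrace, Washington\n")
print(parse_raw_csv(raw))
```

This kind of parse-and-clean step typically sits at the start of a pipeline, producing a structured dataset that downstream tabular transformations can consume.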
Java is a language with plenty of external documentation available online. You may want to write data transformations in Java to take advantage of its language-specific capabilities. The Java Transforms API is lower-level than SQL. The `transforms` Java library is an API that exposes functionality such as file reads and writes. File-based data transformations can be useful early in a pipeline, when you want to parse and clean raw data.