You can uncomment the lint-related lines in transforms-python/build.gradle in your code repository. This enables a linting task that flags violations of PEP 8 or Pylint rules.
Timestamp: June 12, 2024
Auto-scaling of executors can be achieved by enabling dynamic allocation. Note that dynamic allocation scales the number of executors, not executor or driver memory. Specific profiles such as DYNAMIC_ALLOCATION_ENABLED and DYNAMIC_ALLOCATION_MAX_64 support this functionality. More information, including a list of profiles with built-in configurations for dynamic allocation, can be found in the Spark profiles reference documentation.
Timestamp: April 5, 2024
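As a minimal sketch of how such profiles are typically applied to a Python transform, assuming the `transforms.api` decorators and hypothetical dataset paths (the profile names come from the answer above; whether both are importable in your enrollment depends on which profiles your administrators have enabled):

```python
from transforms.api import configure, transform_df, Input, Output

# Apply the dynamic-allocation Spark profiles to this transform.
# DYNAMIC_ALLOCATION_ENABLED turns on executor auto-scaling;
# DYNAMIC_ALLOCATION_MAX_64 caps the executor count at 64.
@configure(profile=["DYNAMIC_ALLOCATION_ENABLED", "DYNAMIC_ALLOCATION_MAX_64"])
@transform_df(
    Output("/path/to/output"),      # hypothetical output dataset path
    source=Input("/path/to/input"), # hypothetical input dataset path
)
def compute(source):
    return source
```

This only runs inside a Foundry code repository; it is shown here as a configuration sketch, not a standalone script.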
Selecting the Format before committing option when committing code runs the formatCode task. This task can use either ruff or black as the formatter; which one is used is controlled by uncommenting the corresponding formatter lines in the transforms-python/build.gradle file.
Timestamp: June 12, 2024
No module named &lt;module-name&gt;; &lt;package-name&gt; is not a package in a transform?
To troubleshoot and resolve the import error, follow these steps:
Timestamp: April 25, 2024
How do I write a pandas dataframe?
To write a pandas dataframe, use the .write_pandas() method. If you encounter AttributeError: 'DataFrame' object has no attribute '_jdf', you are calling a method designed for PySpark dataframes on a pandas dataframe.
Timestamp: May 30, 2024
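A short sketch of why this particular AttributeError appears: PySpark DataFrame operations go through an internal `_jdf` handle to the underlying Java DataFrame, which a pandas DataFrame simply does not have. The Foundry-specific lines are shown as comments with hypothetical dataset paths:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})

# PySpark DataFrame methods delegate to an internal `_jdf` (Java DataFrame)
# attribute. A pandas DataFrame has no such attribute, so passing it where
# a PySpark DataFrame is expected raises the AttributeError from the answer.
print(hasattr(df, "_jdf"))  # → False

# In a Foundry transform (sketch only; the dataset path is hypothetical):
#   @transform(out=Output("/path/to/output"))
#   def compute(out):
#       out.write_pandas(df)           # correct for a pandas dataframe
#       out.write_dataframe(spark_df)  # expects a PySpark dataframe instead
```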
No, it is not possible to set a schedule on a transform without an output dataset. The recommended approach is to record the response from the external API in an output dataset for logging. Alternatively, you could run arbitrary code in a function triggered via Automate, but an output dataset is still valuable for logging.
Timestamp: February 6, 2025
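The recommended pattern above can be sketched as a transform that calls the API and logs the response to its output. This is a sketch only: the endpoint URL and dataset path are hypothetical, it assumes the `transforms.api` decorators plus `requests`, and real outbound calls from a transform require your enrollment's external-connectivity/egress setup:

```python
from datetime import datetime, timezone

import pandas as pd
import requests
from transforms.api import transform, Output

@transform(log=Output("/path/to/api_call_log"))  # hypothetical dataset path
def compute(log):
    # Hypothetical external endpoint; the transform exists to trigger it,
    # and the output dataset records what happened on each scheduled run.
    resp = requests.get("https://api.example.com/trigger")
    log.write_pandas(pd.DataFrame([{
        "called_at": datetime.now(timezone.utc).isoformat(),
        "status_code": resp.status_code,
        "body": resp.text,
    }]))
```

Because the output dataset accumulates one row per run, it doubles as an audit log of the API calls, which is the logging value the answer refers to.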