Yes, you can debug and preview code in an SDDI repository. In the SDDI repository, navigate to the file /transforms-bellhop/src/software_defined_data_integrations/transforms/pipeline_builder.py and use the Preview button to select the transform you want to preview.
An SDDI repository produces a dataset called BUILD that is connected to all final datasets produced by the repository. To guarantee that all newly ingested tables get built, create a new Full Build schedule (including upstream datasets) with this BUILD dataset as a target. The smart scheduler will only initiate builds for the parts of the pipeline where the raw data has been refreshed.
MODULE_UNREACHABLE, what should I do?
MODULE_UNREACHABLE is often a sign that DRIVER_MEMORY in your Spark environment is insufficient. You can apply Spark profiles to selected tables in your SourceConfig.yaml file; see the configuration reference for details. Do not forget to import the assigned profile into your repository config first.
I added <TABLE_NAME> to my pipeline, but when I try to build my pipeline it fails with AssertionError: 0 instances of <TABLE_NAME> found in 'objects' metadata table. What should I do?
Make sure you have rerun the metadata datasets objects, links, fields, and diffs after new tables are ingested and added to your SDDI pipeline.
No, you do not need to increase the semantic version after adding new tables to Bellhop config files. However, you will need to rebuild the metadata datasets objects, links, fields, and diffs.
Yes. The foreign key generation, enrichment stage, and renaming stage can be disabled using parameters in the PipelineConfig file. Incrementing the deploymentSemanticVersion is required for the changes to take effect.
Note that disabling any or all of those steps will change the resulting data schema and may break downstream usage of the data.
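As a sketch only (deploymentSemanticVersion is the one key confirmed by the answer above; the stage-toggle parameter names are hypothetical placeholders, so check the configuration reference for the real ones), a PipelineConfig that disables all three stages might look like:

```yaml
# Hypothetical PipelineConfig fragment; toggle names are illustrative.
deploymentSemanticVersion: 2   # increment so the changes take effect
foreignKeyGeneration: false
enrichmentStage: false
renamingStage: false
```

Keep in mind that turning these stages off changes the output schema, so downstream consumers of the data may need to be updated accordingly.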