This content is also available at learn.palantir.com and is presented here for accessibility purposes.
Why didn't we just use the clean versions of the untouched datasets as inputs to the Ontology? Clean datasets tend to be starting points for many activities in Foundry, including analysis, modeling, and other data pipelines. They typically resemble the raw data closely, and as such may contain many more columns than we need for our Ontology object and link types but that are nonetheless valuable for these other workflows. We may also eventually decide to add new derived columns to our Ontology-backing dataset, and we may want to make those changes without affecting the clean version. This intermediate transform step (clean → ontology) is always recommended, even in cases where it initially feels like a formality.
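The clean → ontology step described above can be sketched in plain Python (this is an illustration of the projection, not the Foundry transforms API; all column names other than alert_display_name and passenger_id are made up for the example):

```python
# Hypothetical sketch: the clean -> ontology transform narrows a wide
# "clean" dataset down to only the columns the Ontology object type needs,
# leaving the clean dataset untouched for analysis, modeling, and other
# pipelines.
def to_ontology_view(clean_rows, keep_columns):
    """Keep only the Ontology-backing columns of each row."""
    return [{col: row[col] for col in keep_columns} for row in clean_rows]

clean_rows = [
    {
        "alert_display_name": "Delayed: FL-001",   # needed by the object type
        "passenger_id": 7,                          # needed by the object type
        "raw_payload": "<xml...>",                  # valuable elsewhere, not here
        "ingest_timestamp": "2021-06-01",           # valuable elsewhere, not here
    },
]

ontology_rows = to_ontology_view(
    clean_rows, ["alert_display_name", "passenger_id"]
)
```

Because the ontology dataset is a separate transform output, derived columns can later be added here without touching the clean version.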
You've now added transformation steps in your pipeline that should be documented, scheduled, and monitored per the practices you've learned in this training track. Test your knowledge by following these summary recommendations.
1. Document your new transform steps in /Ontology Project: Flight Alerts/documentation/.
2. Apply the following health checks to each of your three new ontology datasets and add them to the associated check groups: a schema check (COLUMN_ADDITIONS_ALLOWED_STRICT) and a primary key check. For flight_alerts_passenger, configure the check to verify the combination of alert_display_name and passenger_id.
3. We'll return to add a final check after configuring the object and link types in the Ontology. Note that all of these new datasets are automatically added to the existing Schedule Status and Schedule Duration checks.
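The conditions these two checks enforce can be sketched as plain Python predicates (Foundry's built-in health checks do this for you; the schema predicate below reflects one plausible reading of COLUMN_ADDITIONS_ALLOWED_STRICT, namely that existing columns must remain present while additions are tolerated):

```python
# Hypothetical sketches of the two health-check conditions.
def schema_ok(expected_columns, actual_columns):
    """Schema check sketch: every expected column must still be present
    (one reading of COLUMN_ADDITIONS_ALLOWED_STRICT)."""
    return set(expected_columns).issubset(set(actual_columns))

def primary_key_unique(rows, key_columns):
    """Primary key check sketch: the combination of key columns must be
    unique across all rows."""
    keys = [tuple(row[col] for col in key_columns) for row in rows]
    return len(keys) == len(set(keys))

rows = [
    {"alert_display_name": "Delayed: FL-001", "passenger_id": 7},
    {"alert_display_name": "Delayed: FL-001", "passenger_id": 8},
]

# Neither alert_display_name nor passenger_id is unique on its own,
# but their combination is -- which is exactly what the check verifies.
ok_schema = schema_ok(["alert_display_name", "passenger_id"], rows[0].keys())
ok_pk = primary_key_unique(rows, ["alert_display_name", "passenger_id"])
```

A duplicated (alert_display_name, passenger_id) pair would make primary_key_unique return False, which is the condition that would trip the configured health check.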