This content is also available at learn.palantir.com and is presented here for accessibility purposes.
In your new repository, you'll be transforming three input datasets to prepare them to back Ontology object and link types:

flight_alerts_clean: This will back our flight alert object type, but first we want to remove the rule_id column, since it's not needed in any anticipated workflows (and reducing the amount of data synchronized to the Ontology storage service also reduces computation load).

passengers_clean: We determined this dataset requires no updates at this point, so we'll pass it through as an identity transform.

passenger_flight_alerts_clean: There is a many-to-many relationship between passengers and flight alerts. Just as with many-to-many joins in a relational database, a join table is needed to back many-to-many link types in the Ontology. We'll therefore also need to prepare this dataset, which is already part of our pipeline (and which we'll assume needs no further preparation).
1. Delete /datasets/examples.py by clicking the ... next to the file name and choosing Delete from the menu of options.
2. Create a new file in /data called flight_alerts.py.
3. Set your transform's input to flight_alerts_clean (your Foundry environment will likely contain many flight_alerts_clean datasets, so double check that you've got the one you've created in your pipeline). Ensure that your Output location is .../Ontology Project: Flight Alerts/data/ontology/..., creating these subfolders if necessary (see image below).
4. Remove the rule_id column by, for example, calling .drop('rule_id') on the returned dataframe (a sketch of this transform appears after these instructions).
5. Create two more files in /datasets for passengers.py and passenger_flight_alerts.py.
6. Ensure that the input for passengers.py is set to your passengers_clean and that passenger_flight_alerts.py uses your passenger_flight_alerts_clean dataset as input (see the identity-transform sketch after these instructions).
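For orientation, a flight_alerts.py transform covering steps 3 and 4 might look roughly like the sketch below. It uses the standard transforms.api decorator pattern; the dataset paths and the parameter name source_df are illustrative placeholders, not the exact values from your environment.

```python
from transforms.api import transform_df, Input, Output


@transform_df(
    # Placeholder paths: point the Output at your folder under
    # .../Ontology Project: Flight Alerts/data/ontology/... and the Input
    # at the flight_alerts_clean dataset you built in your pipeline.
    Output("<your ontology output folder>/flight_alerts"),
    source_df=Input("<your pipeline folder>/flight_alerts_clean"),
)
def compute(source_df):
    # Drop rule_id: no anticipated workflow needs it, and syncing less data
    # to the Ontology storage service reduces computation load.
    return source_df.drop("rule_id")
```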
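passengers.py and passenger_flight_alerts.py are identity transforms, so each simply returns its input unchanged. A minimal sketch for passengers.py, again with placeholder paths, is shown below; passenger_flight_alerts.py follows the same shape with passenger_flight_alerts_clean as its input.

```python
from transforms.api import transform_df, Input, Output


@transform_df(
    # Placeholder paths: use your ontology output folder and the
    # passengers_clean dataset from your pipeline.
    Output("<your ontology output folder>/passengers"),
    source_df=Input("<your pipeline folder>/passengers_clean"),
)
def compute(source_df):
    # Identity transform: pass the cleaned dataset through unchanged.
    return source_df
```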