This content is also available at learn.palantir.com and is presented here for accessibility purposes.
Your flight alert data has a priority column that users rely on downstream for analysis and Ontology object generation. Let's assume it feeds an alert triage inbox that filters on priority column values. You'll want to be intentional about preventing values other than "High," "Medium," and "Low" from entering your pipeline. Now that you've had a bit of practice with the primary key expectation check, you'll set an is_in column expectation check on your flight_alerts_clean dataset, because it's the output of your flight alerts schedule.
Open your flight_alerts_logic repository and create a new branch from Master (e.g., yourName/feature/column_comparison).
Use the is_in syntax in this data expectations reference to fail the flight_alerts_clean job if the priority column contains any value other than "High," "Medium," or "Low".
Consult the code example in the complex checks documentation for guidance on how to structure multiple checks.
Check requires at least three arguments: (1) the expectation itself (which is what you'll pull from the documentation); (2) an arbitrary name in single quotes (e.g., 'My Primary Key Uniqueness Check'); and (3) an on_error behavior of 'FAIL' or 'WARN'.
Preview your code and note the expectations indicator and Details link on the left side of the Preview helper once your preview has materialized, as shown in the image below.
Commit and build your code on your branch and view the check on the Health tab of the dataset (also on your branch).
Merge your code into master and build.
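To make the behavior of the check concrete before you write it, the sketch below models it in plain Python. It does not use Foundry's actual API: LocalCheck, is_in, run_checks, and CheckFailed are hypothetical names invented for illustration, and the sample rows are made up. It only mirrors the three-argument shape described above (expectation, name, on_error) and the difference between 'FAIL' and 'WARN'; pull the real syntax from the data expectations reference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]


def is_in(column: str, *allowed: object) -> Callable[[Row], bool]:
    """Hypothetical stand-in for an is_in column expectation.

    Returns a predicate that is True when the row's value for
    `column` is one of the allowed values.
    """
    allowed_set = set(allowed)
    return lambda row: row.get(column) in allowed_set


class CheckFailed(Exception):
    """Raised when a check with on_error='FAIL' is violated."""


@dataclass
class LocalCheck:
    # Mirrors the three arguments described in the steps above.
    expectation: Callable[[Row], bool]
    name: str
    on_error: str = "FAIL"

    def __post_init__(self) -> None:
        if self.on_error not in ("FAIL", "WARN"):
            raise ValueError("on_error must be 'FAIL' or 'WARN'")


def run_checks(rows: Sequence[Row], checks: Sequence[LocalCheck]) -> List[str]:
    """Evaluate every check; raise on FAIL, collect messages on WARN."""
    warnings: List[str] = []
    for check in checks:
        violations = [row for row in rows if not check.expectation(row)]
        if not violations:
            continue
        message = f"{check.name}: {len(violations)} row(s) violate the expectation"
        if check.on_error == "FAIL":
            raise CheckFailed(message)
        warnings.append(message)
    return warnings


# Made-up sample data: the third row carries an unexpected priority.
sample_rows = [
    {"alert_id": 1, "priority": "High"},
    {"alert_id": 2, "priority": "Low"},
    {"alert_id": 3, "priority": "Urgent"},
]

priority_check = LocalCheck(
    is_in("priority", "High", "Medium", "Low"),
    "Valid priority values",
    on_error="WARN",
)
print(run_checks(sample_rows, [priority_check]))
```

With on_error set to 'WARN' the violation is only reported; switching it to 'FAIL' makes run_checks raise, which is the analogue of the flight_alerts_clean job failing when an unexpected priority value appears.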