5 - Column Expectations: “Is In”

This content is also available at learn.palantir.com and is presented here for accessibility purposes.

📖 Task Introduction

Your flight alert data has a priority column that downstream users rely on for analysis and Ontology object generation. Assume it feeds an alert triage inbox that filters on priority column values. You'll want to be intentional about keeping values other than "High," "Medium," and "Low" out of your pipeline. Now that you've had some practice with the primary key expectation check, you'll set an is_in column expectation check on your flight_alerts_clean dataset, because it's the output of your flight alerts schedule.
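To make the rule concrete, here is a minimal plain-Python sketch of what an "is in" check asserts. It does not use the Foundry expectations library. The function name find_unexpected and the sample values "Urgent" and "high" are hypothetical, invented only for illustration.

```python
# Allowed values for the priority column, taken from the task description.
ALLOWED_PRIORITIES = {"High", "Medium", "Low"}


def find_unexpected(values, allowed=ALLOWED_PRIORITIES):
    """Return the distinct values that fall outside the allowed set."""
    # Sort by string form so the result is stable across runs.
    return sorted({v for v in values if v not in allowed}, key=str)


# Hypothetical sample of priority values (not real flight alert data).
sample = ["High", "Low", "Urgent", "Medium", "high"]
unexpected = find_unexpected(sample)

# An "is in" check passes only when no unexpected values are found.
print("check passes:", not unexpected)
print("unexpected values:", unexpected)
```

Note that the comparison is case-sensitive in this sketch, so "high" is treated as a different value from "High".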

🔨 Task Instructions

  1. Open your flight_alerts_logic repository and create a new branch from master (e.g., yourName/feature/column_comparison).

  2. Use the is_in syntax in the data expectations reference to fail the flight_alerts_clean job if the priority column contains any value other than "High," "Medium," or "Low".

  3. Consult the code example in the complex checks documentation for guidance on how to structure multiple checks.

    • Tips
      • Place quotation marks around your string column values (e.g., "High").
      • Remember that a Check requires at least three arguments: (1) the expectation itself (which is what you'll pull from the documentation); (2) an arbitrary name in single quotes (e.g., 'My Primary Key Uniqueness Check'); and (3) an on_error behavior of 'FAIL' or 'WARN'.

  4. Preview your code and note the expectations indicator and Details link on the left side of the Preview helper once your preview has materialized, as shown in the image below.

  5. Commit and build your code on your branch and view the check on the Health tab of the dataset (also on your branch).

  6. Merge your code into master and build.
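
For reference, the steps above might produce a transform along the following lines. Treat this as a sketch under stated assumptions rather than the official solution: the dataset paths, the input name source_df, the function name compute, and the check name are all placeholders, and the code runs only inside a Foundry code repository where the transforms library is available. Confirm the exact is_in syntax against the data expectations reference.

```python
from transforms.api import transform_df, Input, Output, Check
from transforms import expectations as E


@transform_df(
    Output(
        "/path/to/flight_alerts_clean",  # placeholder path (assumption)
        checks=[
            # Additional checks, such as the primary key check from the
            # previous task, would sit alongside this one in the same list.
            Check(
                E.col("priority").is_in("High", "Medium", "Low"),  # the expectation
                "Priority Is In Check",  # arbitrary check name (assumption)
                on_error="FAIL",  # fail the job rather than only warn
            ),
        ],
    ),
    source_df=Input("/path/to/your_input_dataset"),  # placeholder path (assumption)
)
def compute(source_df):
    # Your existing cleaning logic stays here; a pass-through is shown
    # only to keep the sketch short.
    return source_df
```

Using on_error='FAIL' matches the task's goal of failing the flight_alerts_clean job when an unexpected priority value appears; 'WARN' would record the failure without stopping the build.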