7 - Schema Expectations

This content is also available at learn.palantir.com and is presented here for accessibility purposes.

📖 Task Introduction

For your final exercise, you’ll place a limited schema expectation on the flight_alerts_by_country.py transform. This task requires careful attention to syntax: you will need to remember the arguments that a check takes and the special considerations that apply to schema expectations.

🔨 Task Instructions

  1. In your flight_alert_metrics_logic repository, open the flight_alerts_by_country.py transform file and create a new branch from Master.

  2. Add the appropriate import statements, including importing types as t from pyspark.sql.

  3. Use the syntax guidance in the documentation to build a schema check that verifies the schema contains flight_date and alert_priority and that their types are DateType() and StringType() respectively.

  4. Set this check to WARN instead of FAIL on error.

  5. Preview, commit, and build your code on your branch, merging into Master when appropriate. Be sure to confirm that the expectation is set on each of the generated datasets.
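
To make the intent of steps 3 and 4 concrete, here is a minimal plain-Python sketch of what a "contains"-style schema check verifies and how WARN differs from FAIL. It is an illustration of the semantics only, not the Foundry expectations syntax you will write in the exercise: the function names, the use of type names as strings, and the dictionary representation of a schema are all assumptions made for this example.

```python
# Plain-Python illustration only. All names here are hypothetical and are
# NOT part of the Foundry expectations API.
import warnings

# Columns the check requires, mapped to the expected type names from the task.
REQUIRED_COLUMNS = {
    "flight_date": "DateType",
    "alert_priority": "StringType",
}


def find_schema_problems(actual_schema, required=REQUIRED_COLUMNS):
    """Return a list of problems found in actual_schema.

    Only the required columns are inspected, so any extra columns in
    actual_schema are ignored (a limited, "contains"-style check).
    """
    problems = []
    for column, expected_type in required.items():
        if column not in actual_schema:
            problems.append(f"missing column: {column}")
        elif actual_schema[column] != expected_type:
            problems.append(
                f"{column}: expected {expected_type}, found {actual_schema[column]}"
            )
    return problems


def run_check(actual_schema, on_error="FAIL"):
    """Apply the check and return True when the schema passes.

    With on_error="WARN" problems are reported as a warning and execution
    continues; with on_error="FAIL" they raise an error instead.
    """
    if on_error not in ("WARN", "FAIL"):
        raise ValueError(f"unknown on_error value: {on_error}")
    problems = find_schema_problems(actual_schema)
    if problems and on_error == "FAIL":
        raise ValueError("; ".join(problems))
    if problems:
        warnings.warn("; ".join(problems))
    return not problems
```

In this sketch, a schema with an additional column such as country still passes, because only flight_date and alert_priority are inspected. A schema where flight_date is a string would produce a warning under WARN and stop with an error under FAIL.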