6. Monitoring Data Pipeline Health

1 - About this Course

This content is also available at learn.palantir.com and is presented here for accessibility purposes.

Context

Production pipelines require stability, and Foundry offers configurable checks and notification tools to keep you and your team apprised of deviations from expected behavior. One such tool is the Data Health service, which provides a suite of prebuilt checks on various aspects of your datasets. If those datasets are, for example, targets of your scheduled build, then the checks also signal the overall health of the pipeline.

⚠️ Course prerequisites

  • DATAENG 05: If you have not completed the previous course in this track, do so now.

Outcomes

This tutorial gives you hands-on experience implementing best practices for monitoring production pipelines using Foundry’s Data Health service. By the end of this training, you will have what you need to apply the right checks at the right parts of your pipeline for optimal health and performance.

🥅 Learning Objectives

  1. Know where and how to apply data health checks.
  2. Learn and apply recommended data health checks to key parts of your pipeline.
  3. Know where to find metrics that might help you tune your checks.
  4. Understand the notification and alerting framework.

💪 Foundry Skills

  • Configure dataset health checks in the Data Health and Data Lineage applications.
  • Configure schedule health checks in the Scheduler application.
  • Use schedule metrics to update your checks as needed.
  • Configure group checks for batched alerting.