6. Monitoring Data Pipeline Health

14 - Key Takeaways

This content is also available at learn.palantir.com ↗ and is presented here for accessibility purposes.

Data pipelines feed many different products and processes inside or outside of Foundry. They may prepare data for in-depth analysis and presentation, export tables to a system outside of Foundry, or back operational, Ontology-aware applications. Building your pipelines deliberately, with the end in mind, helps you be intentional about the kind of monitoring you want in place. Through the Foundry Scheduler application, you define the inputs, outputs, and intermediate datasets of a pipeline, and the Data Health application lets you set checks on those inputs and outputs to monitor pipeline health.

In this tutorial you:

  1. Created check groups to batch health notification alerts.
  2. Applied recommended health checks to the inputs and outputs of your connected pipeline segments.
  3. Applied health checks to your three build schedules.


Data engineers who want to monitor pipelines and datasets with finer-grained precision can define code-based checks in a transforms repository using a framework called Data Expectations. The next tutorial will introduce the expectations library and guide you through applying a select few to your pipeline.
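
As a preview, below is a minimal sketch of what a code-based check can look like in a Python transform. The dataset paths, column name, and check names are placeholders, and the exact set of available expectations should be confirmed against the Data Expectations documentation covered in the next tutorial.

```python
from transforms.api import transform_df, Input, Output, Check
from transforms import expectations as E


@transform_df(
    Output(
        "/Course/pipelines/flight_alerts_cleaned",  # placeholder output path
        checks=[
            # Fail the build if the key column is not unique and non-null.
            Check(E.primary_key("alert_id"), "Primary key: alert_id", on_error="FAIL"),
            # Warn, but still build, if the output dataset is empty.
            Check(E.count().gt(0), "Non-empty output", on_error="WARN"),
        ],
    ),
    source_df=Input("/Course/pipelines/flight_alerts_raw"),  # placeholder input path
)
def compute(source_df):
    # Checks declared on the Output run automatically each time this transform builds.
    return source_df
```

In general, a check set to FAIL aborts the build when it does not pass, while WARN records the result without stopping the build; either way, the outcomes surface alongside the Data Health checks you configured in this tutorial.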