Monitoring at scale introduces new capabilities that make monitoring Foundry resources less time-intensive.
If you are already using check groups, think of this as an additional option for monitoring your resources. It will not replace any workflows or check groups you have already set up.
You can monitor resources in two ways:
To upgrade an existing check group, open your check group in the Data Health application. In the top banner, select Upgrade to Monitoring View.
You can create a new monitoring view or move all the checks to an existing monitoring view.
To create a new monitoring view, go to the Monitoring View tab in the top right corner of the Data Health app and create a new monitoring view.
To create a monitoring rule, navigate to the Manage monitors tab. First, select the resource type you want to monitor. Depending on the resource type, you can monitor a single resource, or all resources of that type across one or more Projects.
You must have Viewer permission on the resources to monitor them. To receive alerts triggered by monitoring rules, you must have Viewer permission on the resources and the monitoring view.
Monitors are set on the metrics a resource emits. As you set up your monitors, we suggest certain configurations based on Foundry’s standards for health. However, you can change the values or choose to only monitor certain metrics. You can also set the severity of the alert that is raised when a monitor fails. There are currently three severity levels: low, medium, and high.
You can edit your monitors by selecting from the list of monitors and choosing Edit on the side panel that appears.
To subscribe to alerts, navigate to the Manage subscriptions tab, where all subscribed users are listed. You can add users and user groups, and configure their alerts based on severity. When a monitoring rule triggers an alert, users subscribed to the monitoring view containing that alert are notified via email and Foundry notifications. Note that you must have Viewer permission on the resources and the monitoring view to receive alerts.
You can send alerts to external systems such as PagerDuty or Slack with built-in integrations or by using a webhook to hit arbitrary REST endpoints. Learn more about sending alerts to external systems.
You can monitor the following:
Resource type | Supported scope |
---|---|
Agent | Single, Project |
Object type | Single |
Link type | Single |
Schedule | Single, Project |
Streaming datasets | Single, Project |
Live deployments | Project |
Time series syncs | Single |
Geotemporal observations | Single |
Automations | Single, Project |
Dataset (coming soon) | Project |
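The table above can be captured as a small lookup for checking, before planning a rule, whether a resource type supports a given scope. This is a minimal sketch that only mirrors the documented table. `SUPPORTED_SCOPES` and `is_supported` are hypothetical names for illustration, not part of any Foundry API.

```python
# Illustrative lookup that mirrors the table above; not a Foundry API.
SUPPORTED_SCOPES = {
    "Agent": {"Single", "Project"},
    "Object type": {"Single"},
    "Link type": {"Single"},
    "Schedule": {"Single", "Project"},
    "Streaming datasets": {"Single", "Project"},
    "Live deployments": {"Project"},
    "Time series syncs": {"Single"},
    "Geotemporal observations": {"Single"},
    "Automations": {"Single", "Project"},
    # "Dataset" is listed as coming soon, so it is deliberately left out.
}


def is_supported(resource_type: str, scope: str) -> bool:
    """Return True if the table lists `scope` for `resource_type`."""
    return scope in SUPPORTED_SCOPES.get(resource_type, set())


if __name__ == "__main__":
    for resource_type, scope in [("Agent", "Project"), ("Object type", "Project")]:
        print(resource_type, scope, is_supported(resource_type, scope))
```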
A reference can be found here
Not all health checks exist as monitoring rules, but the most important health checks have analogous monitoring rules. We recommend using a combination of monitoring rules and health checks in a linked check group. To summarize coverage from monitoring views and health checks:
For the most comprehensive coverage, we suggest linking your monitoring view to a check group that consists of health checks not currently available in monitoring views.
Monitors cover an entire scope rather than a single resource. This means that when an additional resource is added to that scope, it is automatically covered by the rule. For example, a monitoring rule that is set up to monitor all agents in a Project will also monitor any further agents added into that Project at a later time.
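The following conceptual sketch shows why a scope-based rule picks up new resources without being edited: coverage is derived from the scope each time it is evaluated, rather than from a fixed list of resources. The `Project` and `ProjectScopedRule` classes are hypothetical and only model the behavior described above; they do not reflect how Foundry implements monitoring rules.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical stand-in for a Project that contains agents."""
    name: str
    agents: list[str] = field(default_factory=list)


@dataclass
class ProjectScopedRule:
    """Hypothetical rule that targets every agent in the given Projects."""
    resource_type: str
    projects: list[Project]

    def covered_resources(self) -> list[str]:
        # Coverage is resolved from the scope on each call, so agents added
        # to a Project after the rule was created are included automatically.
        return [agent for project in self.projects for agent in project.agents]


if __name__ == "__main__":
    project = Project("x", ["agent-1"])
    rule = ProjectScopedRule("Agent", [project])
    print(rule.covered_resources())

    # A new agent is added to the Project later; the rule itself is unchanged.
    project.agents.append("agent-2")
    print(rule.covered_resources())
```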
A good practice is to think of a single monitoring view the same way you would think of a check group. One monitoring view should relate to a set of users who care about the monitors that are in that view. If a specific set of users [a, b, c] cares about specific Projects [x, y, z], create a single monitoring view with all the resources in those Projects. If a specific set of users cares only about monitoring agents, create a single monitoring view to monitor all agents in all Projects.
Since a monitoring view is a filesystem resource, a user needs access to the Project or folder in which the view is saved. To receive alerts or set up monitoring rules on a resource, the user needs access to the Project resources they wish to monitor. Even if a user with all necessary permissions subscribes another user or group to a monitoring view, those new subscribers will NOT receive alerts on any resources if they do not have explicit access permissions to that monitoring view.
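The permission rules above can be summarized as a single condition: a subscriber is alerted only if they can view both the monitoring view and the resource that raised the alert. The sketch below models that condition under stated assumptions. `Subscriber` and `will_receive_alert` are hypothetical names, and treating the severity configuration as a set of selected levels is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    """Hypothetical model of a user subscribed to a monitoring view."""
    name: str
    # Identifiers of resources and monitoring views the user has Viewer on.
    viewable: set[str]
    # Severities the user is subscribed to (assumed to be a set of levels).
    severities: set[str]


def will_receive_alert(
    subscriber: Subscriber, view_id: str, resource_id: str, severity: str
) -> bool:
    """Apply the documented rule: Viewer on the view AND on the resource."""
    can_see_view = view_id in subscriber.viewable
    can_see_resource = resource_id in subscriber.viewable
    wants_severity = severity in subscriber.severities
    return can_see_view and can_see_resource and wants_severity


if __name__ == "__main__":
    # Subscribed to everything, but lacks access to the monitoring view itself.
    bob = Subscriber("bob", {"agent-1"}, {"low", "medium", "high"})
    print(will_receive_alert(bob, "view-1", "agent-1", "high"))
```

In this model, being added as a subscriber is not sufficient on its own, which matches the note that subscribers without access to the monitoring view receive no alerts.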