Plans and Constraints

Plans

Plans are how Apollo delivers instructions from the Hub to agents in managed Environments. Each Plan is a unit of work, such as a configuration change or a Release upgrade, and will only be sent to an agent for execution when all relevant constraints have been satisfied.

Apollo supports the following Plan types:

Plans and constraints are surfaced in the Apollo Control Center, to ensure that Apollo’s workflows are human-readable. Users can see the history of Plans that an agent performed for a given Environment, each of which has a start and an end time.

Apollo's Plan-based paradigm differs from other "control loop" systems. Rather than carrying out actions in the background where a human may not notice them, Apollo generates a Plan and sends it to an agent for execution once all constraints are satisfied. This paradigm gives users more transparency and visibility into ongoing, upcoming, and previous changes made in the Environment. You can view the history of Plans for an Apollo-managed Entity under the Plans tab of the Entity's home page.

[Screenshot: Entity plans]

Plan Lifecycle

Orchestration Engine

The Orchestration Engine in every Hub Environment:

  1. Continuously evaluates all the possible Plans for each Spoke.
  2. Evaluates all the constraints associated with each of the possible Plans.
  3. Issues Plans whose constraints are satisfied to the Spoke's agents for execution.

General guidelines for how the Orchestration Engine decides which Plan to issue

  • Plans to roll off a recalled release or to execute break-glass commands are prioritized over other Plan types.
  • Apollo will aim to bring the state of Entities to the latest possible state. For example, Apollo will issue a Plan to upgrade an Entity to the latest Release that passes all relevant constraints and change the Entity's configuration to the latest configuration override that matches the version. This allows performing upgrades in lockstep with their respective configuration. For more about configuration overrides for specific versions, see the documentation on Managing Entities.
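
The selection behavior described above can be sketched as follows. This is a minimal illustration, not Apollo's implementation: the `CandidatePlan` class, the plan-type labels in `PRIORITY_TYPES`, and the `choose_plan` function are all hypothetical names introduced here, and constraints are modeled as simple callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical labels for the two prioritized Plan types named in the text.
PRIORITY_TYPES = {"recall-roll-off", "break-glass"}


@dataclass
class CandidatePlan:
    entity: str
    plan_type: str
    # Each constraint is a zero-argument callable that returns True when satisfied.
    constraints: List[Callable[[], bool]] = field(default_factory=list)


def is_issuable(plan: CandidatePlan) -> bool:
    """A Plan can be issued only when every one of its constraints is satisfied."""
    return all(check() for check in plan.constraints)


def choose_plan(candidates: List[CandidatePlan]) -> Optional[CandidatePlan]:
    """Pick one issuable Plan, preferring recall roll-offs and break-glass Plans."""
    issuable = [p for p in candidates if is_issuable(p)]
    if not issuable:
        return None
    # min() returns the first minimal element, so ties keep the input order.
    return min(issuable, key=lambda p: 0 if p.plan_type in PRIORITY_TYPES else 1)
```

In this sketch, a blocked break-glass Plan does not prevent an ordinary upgrade Plan from being chosen; the document does not say whether Apollo behaves this way, so treat that as an assumption.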

Agents

Agents in every Spoke Environment:

  1. Continuously poll the Orchestration Engine for new Plans and report back the state of the Environment and its Entities.
  2. Execute the change(s) required in the Spoke for every issued Plan.
  3. Report back to the Orchestration Engine whether each Plan succeeded or failed.

Viewing a Plan's status

You can track the progress of a Plan in the Plans tab of the Entity or Environment home page. Select a Plan's status from the Status column to view the task(s) that form the Plan and their progress.

For Plans that are in progress, you can view:

  • Tasks that are in progress.
  • Tasks that have completed, and whether they succeeded or failed.

[Screenshot: a Plan that is in progress is selected, with one task in progress.]

For Plans that failed, you can view an error message for each task that failed. This enables you to identify the reason that the Plan failed and resolve any issues.

[Screenshot: a Plan that has failed is selected, with its failed tasks displayed.]

Plan failures, suppressions, and rollbacks

When a Plan fails, Apollo automatically creates an Entity-level suppression window, which prevents further work on that Entity. Apollo continues to issue new Plans for other Entities in the Environment. If the number of Entities in an Environment that are under automated suppression windows exceeds the configured threshold, Apollo automatically creates an Environment-wide suppression window, because this many failures may indicate a system-wide problem. An Environment-wide suppression prevents further work on all Entities in the Environment. Once the number of Entities under suppression windows returns to below the threshold, Apollo automatically removes the Environment-wide suppression.
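
The threshold behavior in the paragraph above can be sketched as a small state tracker. This is an illustrative model only: the `EnvironmentSuppressionTracker` class and its method names are hypothetical, and it reads the text literally, so the Environment-wide suppression is created when the count exceeds the threshold and removed when the count drops below it. What happens when the count equals the threshold is not stated in the document; here the current state is simply kept.

```python
class EnvironmentSuppressionTracker:
    """Hypothetical model of automated Entity and Environment suppressions."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.suppressed_entities = set()
        self.environment_suppressed = False

    def _refresh(self) -> None:
        count = len(self.suppressed_entities)
        if count > self.threshold:
            # Too many failing Entities: treat it as a possible system-wide problem.
            self.environment_suppressed = True
        elif count < self.threshold:
            # Back below the threshold: lift the Environment-wide suppression.
            self.environment_suppressed = False
        # count == threshold: keep the current state (assumption, see lead-in).

    def plan_failed(self, entity: str) -> None:
        """A failed Plan puts the Entity under an automated suppression window."""
        self.suppressed_entities.add(entity)
        self._refresh()

    def suppression_ended(self, entity: str) -> None:
        self.suppressed_entities.discard(entity)
        self._refresh()

    def can_issue_plan(self, entity: str) -> bool:
        """Ordinary Plans are blocked by either level of suppression."""
        return not self.environment_suppressed and entity not in self.suppressed_entities
```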

After a Plan has failed and an automatic suppression window has been applied, Apollo will permit a “rollback” Plan to restore the previous state despite the automatic suppression window. This does not apply to suppression windows created by a human; Apollo will always respect these.

What causes Apollo to invalidate and consider a new Plan for an Entity

Apollo will request a new Plan for an Entity when one of the following happens:

  • Changes to the Reported State of the relevant Entity's dependents or dependencies are observed.
  • Changes to the Entity's settings are observed. For example, the Release Channel that the Entity is tracking was changed.
  • A Product Release is added to the Entity's configured Release Channel.
  • A Product Release that is in the Entity’s configured Release Channel is recalled.
  • An existing Plan has been blocked by constraints for longer than the four-hour threshold.

Plan invalidation due to changes to the Reported State of the Environment is based on the dependency graph of Entities in the Environment. Modifying an Entity creates a new Plan request for the Entity itself as well as for its neighbors in the dependency graph, meaning its dependencies and dependents. Apollo requests Plans for an Entity’s dependencies and dependents because, under product dependency constraints, an Entity changing versions can change the versions to which its dependencies and dependents can upgrade.

Consider the following example dependency graph for Entities in an Environment:

[Figure: Plan Recomputation Example]

Service A depends on Service B, which depends on Service C. For now, assume that these are the only dependencies that these services declare.

A modification to the Service C Entity would request a new Plan for Service C itself and for Service B, because Service B declares a dependency on Service C.

A modification to the Service B Entity would request a new Plan for Service C as a dependency, the Service B Entity itself, and Service A as a dependent Entity.

A modification to the Service A Entity would request a new Plan for Service B as a dependency and Service A itself.
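
The three cases above follow one rule: a modified Entity triggers Plan requests for itself and its direct neighbors in the dependency graph. A minimal sketch of that rule, using a hypothetical `DEPENDS_ON` mapping and a hypothetical `plan_requests_for` function that are not part of Apollo:

```python
# Hypothetical encoding of the example graph: each Entity maps to its direct dependencies.
DEPENDS_ON = {
    "Service A": ["Service B"],
    "Service B": ["Service C"],
    "Service C": [],
}


def plan_requests_for(modified: str, depends_on: dict) -> set:
    """Return the Entities that get a new Plan request when `modified` changes."""
    # Direct dependencies: what the modified Entity declares it needs.
    dependencies = set(depends_on.get(modified, []))
    # Direct dependents: Entities that declare the modified Entity as a dependency.
    dependents = {entity for entity, deps in depends_on.items() if modified in deps}
    return {modified} | dependencies | dependents
```

Only direct neighbors are included, so modifying Service C does not request a Plan for Service A in this sketch, which matches the example above.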

Constraints

Constraints are conditions that Plans must satisfy before they can be executed by an agent. Apollo evaluates constraints to ensure that agents only execute Plans when it is safe and acceptable to do so.

Apollo’s constraint solving engine automates the validation that a human would have to do before performing an upgrade; for example, at the time of code review in a GitOps model.

Constraints are surfaced in the Apollo Control Center and help users understand why a Plan is blocked from starting (for example, when an upgrade is not happening). They also provide enough information so that a user can take an action to unblock the Plan from being carried out by an agent. Here is a screenshot of a "suppression window" constraint preventing a plan from being executed until the suppression window ends:

[Screenshot: Constraints]

Some example constraints are listed below.

Maintenance window constraint

Environment editors can define time windows during which it is acceptable for Apollo to make changes to Entities in their Environment, that is, install, upgrade, or change configuration. These are called Environment and Entity maintenance windows. Product editors can define maintenance windows for their Products that determine when Apollo can promote a Release to a Release Channel. These are called Product maintenance windows. Maintenance window constraints block Plans from making changes outside of the defined time window.

There are two types of maintenance windows in Apollo: downtime and no-downtime.

How Apollo calculates an Entity's resolved maintenance window

  • If the Product maintenance windows override is not set, which is the default behavior:
    • If the operation is no-downtime, use the Product maintenance window.
    • If the operation requires downtime, use the intersection of the Product maintenance window and the Entity’s downtime window.
  • If the Product maintenance windows override is set or no Product maintenance window is defined for the Product:
    • If the operation is no-downtime, use the Entity’s no-downtime window.
    • If the operation requires downtime, use the Entity’s downtime window.
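
The resolution rules above can be written as a single function. This is a sketch under simplifying assumptions: a window is modeled as a set of hours of the day (0 to 23) during which changes are allowed, which is not how Apollo represents schedules, and the function and parameter names are hypothetical.

```python
from typing import FrozenSet, Optional

# Hypothetical representation: the hours of the day during which changes are allowed.
Window = FrozenSet[int]


def resolve_maintenance_window(
    requires_downtime: bool,
    entity_downtime: Window,
    entity_no_downtime: Window,
    product_window: Optional[Window] = None,
    product_override: bool = False,
) -> Window:
    """Return the window in which a Plan for this Entity may run."""
    if product_override or product_window is None:
        # Override set, or no Product window defined: only Entity windows apply.
        return entity_downtime if requires_downtime else entity_no_downtime
    if requires_downtime:
        # Both the Product window and the Entity's downtime window must be open.
        return product_window & entity_downtime
    return product_window
```

For example, with a Product window of hours 0 to 5 and an Entity downtime window of hours 4 to 7, a downtime operation resolves to hours 4 and 5 only.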

Users may also request approval for a temporary maintenance window override to allow Plans to proceed outside of the normal maintenance window. See Managing Environments for more details.

By default, Apollo ignores no-downtime maintenance windows when rolling off a recalled Release. Roll-offs that require downtime still respect maintenance windows. To make no-downtime roll-offs respect maintenance windows as well, set the require-maintenance-windows-for-no-downtime-recalled-release-roll-offs field to true in the Environment settings.
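
As a rough decision table in code form, the recalled-Release behavior looks like this. The function name and parameters are hypothetical; only the setting name in the comment comes from the text.

```python
def maintenance_window_enforced(
    is_recall_roll_off: bool,
    requires_downtime: bool,
    require_mw_for_no_downtime_roll_offs: bool = False,
) -> bool:
    """Return True when the Plan must wait for a maintenance window."""
    if not is_recall_roll_off:
        # Ordinary Plans always respect maintenance windows.
        return True
    if requires_downtime:
        # Roll-offs that require downtime respect maintenance windows.
        return True
    # No-downtime roll-offs skip the window unless the Environment setting
    # require-maintenance-windows-for-no-downtime-recalled-release-roll-offs is true.
    return require_mw_for_no_downtime_roll_offs
```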

Users will often see that Plans on Entities are blocked by the constraint “entity is not in maintenance window”. This means that Apollo is waiting for the next maintenance window (MW) to apply the corresponding change to the Entity. The user can wait for the next MW, edit the MW schedule (if the user is an Environment editor), or create a maintenance window override (MWO). Refer to Managing Environments - Environment settings for information on how to perform these actions.

[Screenshot: Blocked maintenance window]

Product dependencies constraint

When publishing a Product to the Product Catalog, Product creators can declare in the manifest which other Products are required for it to function properly. Product creators are required to specify the permitted version range of each of those dependencies. For more information, see Managing Products - Publishing Helm Charts.

```yaml
extensions:
  product-dependencies:
    - product-group: com.palantir.example
      product-name: example
      minimum-version: 1.0.0
      maximum-version: 1.x.x
    - product-group: com.palantir.other
      product-name: other
      minimum-version: 2.53.1
      maximum-version: 2.x.x
```

The Product dependencies constraint ensures that Apollo will only execute a Plan when:

  1. None of these dependencies would be violated by the proposed Plan.
  2. All dependent Entities satisfy their version constraint with the target Product Release.
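
The two conditions above can be sketched as a version-range check. This is a simplified illustration, not Apollo's constraint solver: it assumes purely numeric `major.minor.patch` versions, treats `x` in a maximum version as a wildcard that matches any value, and ignores pre-release or snapshot suffixes. The function names and data shapes are hypothetical.

```python
import math


def parse(version: str) -> tuple:
    """Parse 'major.minor.patch'; an 'x' component is treated as unbounded."""
    return tuple(math.inf if part == "x" else int(part) for part in version.split("."))


def satisfies(version: str, minimum: str, maximum: str) -> bool:
    """Check that `version` lies within the inclusive [minimum, maximum] range."""
    return parse(minimum) <= parse(version) <= parse(maximum)


def dependencies_allow_plan(target_version, target_dependencies, deployed, dependent_ranges):
    """Check both conditions of the Product dependencies constraint.

    target_dependencies: (product-name, minimum, maximum) tuples declared by the target Release.
    deployed: product-name -> version currently deployed in the Environment.
    dependent_ranges: (minimum, maximum) ranges that dependents declare on the target Product.
    """
    # Condition 1: every dependency of the target Release is deployed at an allowed version.
    for name, minimum, maximum in target_dependencies:
        if name not in deployed or not satisfies(deployed[name], minimum, maximum):
            return False
    # Condition 2: every dependent Entity accepts the target version.
    return all(satisfies(target_version, lo, hi) for lo, hi in dependent_ranges)
```

With the manifest shown earlier, a deployed `other` at version 2.53.0 would block the Plan, because it is below the declared minimum of 2.53.1.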

Accurately defining Product dependencies and relying on this constraint allows Product creators to safely roll software across many disparate Environments using Apollo, without them needing to manually validate whether every upgrade is safe or coordinate releases with other teams.

You can override a Product dependency constraint for an Entity by adding ignored dependencies in the Entity advanced settings. This is useful when some capabilities provided by the ignored dependency are not required and thus not deployed. Only use this option if you are aware of the exact impact the ignored dependency will have on the Product.

Suppression window constraint

Suppression windows prevent Apollo from issuing Plans during the configured time and supersede maintenance windows. Suppression windows can be set for either an Entity or the whole Environment. Suppression windows can be created in three ways:

  • Manually by users. Environment editors can set suppression windows for the whole Environment or for specific Entities, while Environment or Entity operators can set suppression windows only for Entities on which they hold that role.
  • Automatically when a Plan fails for an Entity. These suppression windows are ignored when Apollo tries to roll back a failed Plan.
  • Automatically during release promotion. This is to avoid cases where newer releases continuously supersede previous releases, preventing a promotion from reaching the target soak time. These suppression windows are ignored when rolling off a recalled release.
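
The exceptions in the list above can be summarized as a lookup from the origin of a suppression window to the kinds of Plan that may bypass it. This is a sketch with hypothetical labels (`manual`, `plan-failure`, `release-promotion`, `rollback`, `recall-roll-off`); Apollo's internal names are not given in this document.

```python
# For each suppression origin, the Plan kinds that are allowed to ignore it.
IGNORED_BY = {
    "manual": set(),                           # human-created windows are always respected
    "plan-failure": {"rollback"},              # rollbacks may run after a failed Plan
    "release-promotion": {"recall-roll-off"},  # recalled Releases can still be rolled off
}


def is_blocked(plan_kind: str, active_suppressions: list) -> bool:
    """Return True if any active suppression window blocks this kind of Plan."""
    return any(plan_kind not in IGNORED_BY[origin] for origin in active_suppressions)
```

Note that a rollback is still blocked when a manual suppression window is active alongside the automatic one, which matches the statement that Apollo always respects human-created windows.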

Manual suppression windows should always be applied to an Entity before performing manual changes. Otherwise, Apollo could issue Plans that conflict with the changes and could cause unexpected behavior.

Learn more about managing suppression windows for configuration and cancelling Plans.

Creating an Entity-level suppression window is the only supported way to cancel an active Plan.