
29 - Additive Backing Dataset Changes: Part 1

This content is also available at learn.palantir.com and is presented here for accessibility purposes.

📖 Task Introduction

In this exercise, you'll practice addressing two primary scenarios involving changes to backing datasets and Ontology configurations.

  1. Additive changes to your backing dataset.
  2. Destructive changes to your backing dataset.

The title key for your passenger object type is simply the last name of the passenger. Let's create a new column in the backing dataset called full_name that we can swap in for the title key. In so doing, we'll witness what happens in the Ontology sync process when the backing dataset receives a new column.

🔨 Task Instructions

  1. Open your ontology_flight_alerts_logic pipeline artifact.

    • ⚠️ You'll typically want to branch from Main when making changes like this, but for convenience, you'll be making your changes directly to Main.
  2. Add a Concatenate strings transform to passengers_clean that combines first_name and last_name separated by a blank space and call the new column full_name.

    • Note that you can add a transform between two nodes either by clicking the + sign between the nodes (see image below) or by later changing the inputs and outputs of nodes via the white and grey circles at the ends of each node connection.
  3. Consider naming your transform node (e.g., "Concat Names").

  4. Apply and preview the change.

  5. Back on your pipeline graph, color your transform node and change the passengers output so that it uses the input schema, as shown in the image below.

  6. Deploy your pipeline.

  7. Once your dataset builds complete, open the output passengers dataset and proceed to the Syncs section of the Details tab as shown in the image below. Here, we can see that the sync between the dataset and the object storage service (aka "Phonograph") was successful despite the schema change.

While you're here, you could also access the Health tab and see that the Schema Check you put in place earlier passed. Because we set the check to COLUMN_ADDITIONS_ALLOWED_STRICT, the check permitted the new column.
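
As a rough illustration of what this exercise does to the data, the sketch below mimics its two ideas in plain Python: building full_name from first_name and last_name with a single space between them, and treating a schema change as additive when every existing column survives unchanged. This is not how Pipeline Builder or the Schema Check is implemented. The sample rows, the passenger_id column, and both function names are hypothetical, the sketch assumes both name parts are non-null strings, and the real semantics of COLUMN_ADDITIONS_ALLOWED_STRICT may differ from this simplified rule.

```python
from typing import Dict, List

Row = Dict[str, str]
Schema = Dict[str, str]  # column name -> type name


def add_full_name(rows: List[Row]) -> List[Row]:
    """Return copies of the rows with a full_name column appended.

    Assumes first_name and last_name are present and non-null in every row.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        new_row["full_name"] = f"{row['first_name']} {row['last_name']}"
        result.append(new_row)
    return result


def is_additive_change(old_schema: Schema, new_schema: Schema) -> bool:
    """Simplified rule: every old column must still exist with the same type.

    Extra columns in new_schema are allowed; removed or retyped columns are not.
    """
    return all(new_schema.get(name) == dtype for name, dtype in old_schema.items())


# Hypothetical sample data for illustration only.
passengers = [
    {"passenger_id": "p1", "first_name": "Ada", "last_name": "Lovelace"},
    {"passenger_id": "p2", "first_name": "Alan", "last_name": "Turing"},
]
with_full_name = add_full_name(passengers)

old_schema = {"passenger_id": "string", "first_name": "string", "last_name": "string"}
new_schema = {**old_schema, "full_name": "string"}
additive = is_additive_change(old_schema, new_schema)
```

Under this simplified rule, adding full_name counts as additive, whereas dropping or retyping last_name would not. That distinction is what separates the additive scenario in this exercise from the destructive one.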