12 - Create clean “passengers” output datasets, part 1

This content is also available at learn.palantir.com and is presented here for accessibility purposes.

📖 Task Introduction

In the previous tutorial (Working with Raw Files), you created a Datasource Project: Passengers project and an associated passengers_logic code repository, which so far has produced a set of raw and preprocessed passenger datasets. In this task, we’ll prepare to create a clean output, as we did for the flight alerts datasource project.

🔨 Task Instructions

  1. Open your /Datasource Project: Passengers folder and click on your passengers_logic repository to open it.
  2. Create a new branch from Master called yourName/feature/clean_data (e.g., jmeier/feature/clean_data).
  3. We’ll be using your new shared cleaning library. Following the steps from the prior exercises (in your flight_alerts_logic repository), add your cleaning library to this repository (hint: start by clicking the Libraries icon in the upper left of your screen).
  4. Return to your repository Files, right-click on the /datasets folder (e.g., /transforms-python/src/myproject/datasets/), and choose New folder.
  5. Name your new folder clean.
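
If it helps to picture the result, the naming and layout from the steps above can be sketched in plain Python. This is only an illustration run against a temporary local directory: in Foundry you create the branch and folder through the Code Repositories interface. The helper names branch_name and create_clean_folder are hypothetical, and myproject stands in for whatever your project folder is actually called.

```python
import tempfile
from pathlib import Path


def branch_name(user: str, feature: str = "clean_data") -> str:
    """Build a branch name following the yourName/feature/<feature> pattern."""
    return f"{user}/feature/{feature}"


def create_clean_folder(repo_root: Path, project: str = "myproject") -> Path:
    """Create datasets/clean under a transforms-python style layout.

    The layout mirrors the example path in the instructions; 'myproject'
    is a placeholder, not a name taken from your repository.
    """
    datasets_dir = repo_root / "transforms-python" / "src" / project / "datasets"
    clean_dir = datasets_dir / "clean"
    # parents=True also creates any missing intermediate folders.
    clean_dir.mkdir(parents=True, exist_ok=True)
    return clean_dir


if __name__ == "__main__":
    # Use a throwaway directory as a stand-in for the repository root.
    root = Path(tempfile.mkdtemp())
    print(branch_name("jmeier"))
    print(create_clean_folder(root).relative_to(root).as_posix())
```

After step 5, your repository's Files view should show a clean folder nested under datasets in the same way.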