9 - Exercise Summary

This content is also available at learn.palantir.com and is presented here for accessibility purposes.

✅ What you built

  • A code repository dedicated to the Datasource stage of your pipeline.
  • Transform files for three raw inputs: flight_alerts_raw, status_mapping_raw, and priority_mapping_raw.
  • Datasets built on your feature branch corresponding to each transform file.
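As a reminder of what one of those transform files might look like, here is a minimal sketch of an identity transform. It assumes the repository is a Python transforms repository that uses the `transforms.api` decorators, and that the transform simply passes the raw data through. Both dataset paths are hypothetical placeholders, not the paths from the exercise. Because `transforms.api` is only importable inside Foundry, the sketch falls back to tiny stand-in classes (not the real API) so that the file can also be executed locally.

```python
try:
    # Available inside a Foundry Python transforms repository.
    from transforms.api import transform_df, Input, Output
except ImportError:
    # Minimal stand-ins (NOT the real API) so the sketch runs outside Foundry.
    class Input:
        def __init__(self, path):
            self.path = path

    class Output(Input):
        pass

    def transform_df(output, **inputs):
        def decorator(fn):
            # Record the declared output and inputs on the function.
            fn.output, fn.inputs = output, inputs
            return fn
        return decorator


@transform_df(
    # Hypothetical paths: substitute the locations used in your own project.
    Output("/path/to/datasets/raw/flight_alerts_raw"),
    source_df=Input("/path/to/source/flight_alerts"),
)
def compute(source_df):
    # Assumed identity transform: return the input data unchanged.
    return source_df
```

Under the same assumptions, the files for status_mapping_raw and priority_mapping_raw would follow the same pattern with their own paths.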

✅ What you learned

  • Each stage in your pipeline should have a dedicated code repository where you develop and maintain the code-based data transformations in a structured setting.
  • Foundry repositories enable code branching and management with Git, which introduces structure and oversight to code changes.
  • If your transform uses a dataset from another Foundry project, you’ll need to explicitly make a project reference.
  • Code Assist runs in parallel with your code repository session and provides auto-completion, compilation errors, and other IDE-like features.
  • Each repository commit and/or build initiates a CI check that ensures pipeline hygiene.
  • When dataset builds are initiated from a code repository, they run on whichever code branch the repository has checked out at the time of the build.
  • You should use uniform branch names across all stages of your pipeline to ensure downstream branches read from the correct upstream branch.
  • Your repository uses the Shrinkwrap file to map input/output paths to the corresponding dataset resource identifiers (RIDs). You can, however, replace those paths with RIDs directly by using the prompt in the code editor.
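
The point above about uniform branch names can be illustrated with a small, self-contained sketch. It is a simplified model, not Foundry's actual resolution logic: the function name `resolve_input_branch` is hypothetical, and the sketch assumes that a build first looks for an upstream branch with the same name as the build branch and otherwise falls back to a default branch (assumed here to be `master`).

```python
from typing import Iterable, Optional, Sequence


def resolve_input_branch(
    build_branch: str,
    upstream_branches: Iterable[str],
    fallbacks: Sequence[str] = ("master",),  # assumed default fallback
) -> Optional[str]:
    """Pick the upstream branch a downstream build would read from (simplified model)."""
    available = set(upstream_branches)
    # Prefer the branch with the same name, then try each fallback in order.
    for candidate in (build_branch, *fallbacks):
        if candidate in available:
            return candidate
    return None  # no matching branch and no usable fallback


# Uniform naming: the downstream build reads the matching upstream branch.
print(resolve_input_branch("feature/flight-alerts", ["master", "feature/flight-alerts"]))
# Mismatched naming: the build silently reads the fallback branch instead.
print(resolve_input_branch("feature/flight-alerts", ["master", "feature/alerts"]))
```

In this model the first call returns `feature/flight-alerts` and the second returns `master`, which is the kind of silent mismatch that consistent branch names across pipeline stages are meant to avoid.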