Access requirements for platform resources are controlled by Markings. Markings restrict access in an all-or-nothing fashion: in order to access a resource, a user must be a member of all Markings applied to the resource. In addition, Markings are inherited along both the file hierarchy and direct dependencies. If you have the correct permissions, you can remove Markings directly from resources and along direct dependencies.
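The all-or-nothing rule can be modeled as a subset check. The following is a minimal sketch in plain Python, not platform code: the function name `can_access` and the Marking names are hypothetical, and the sketch covers Marking membership only, not any other access requirements.

```python
def can_access(user_markings, resource_markings):
    """Return True only if the user holds every Marking on the resource."""
    # All-or-nothing: a single missing Marking denies access.
    return set(resource_markings) <= set(user_markings)


# Hypothetical Marking names, for illustration only.
resource = {"PII", "Finance"}
print(can_access({"PII", "Finance", "Legal"}, resource))  # user holds both Markings
print(can_access({"PII"}, resource))                      # user is missing "Finance"
```

In this model, the first user is allowed and the second is denied, because membership in only some of the applied Markings is not enough.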
Markings are frequently used because they are legible throughout the platform and propagate along direct dependencies, protecting sensitive data. In some circumstances, a Marking may be applied early in a pipeline and need to be removed later in a pipeline. This page provides more information on how to remove Markings depending on your pipeline structure.
Below are three scenarios related to applying a Marking early in a pipeline and removing it later in a pipeline.
This scenario is for when the pipeline:
Therefore, you are migrating from using severing (a deprecated feature) to removing an inherited Marking.
In the old state shown in the example above, severing has been used to prevent the Marking from propagating. Assuming that severing is only being used to remove a Marking, we strongly recommend that you replace severing with Marking removal, as in the new state in the example above. When removing the Marking, it is useful to think about the approval mode configuration of the repository which contains the Marking removal transform.
In the case that propagate view requirements is enabled, read scenario 2 below.
This scenario involves applying Markings to a dataset in your pipeline in order to disable the project-level propagate view requirements settings.
New Projects have the Propagate View Requirements option disabled by default, as seen above. For these new Projects, view requirements will not be enforced for downstream derived datasets. Specifically, this means that users accessing a downstream version of the data in a separate Project would not also require access to the upstream data in the Projects where this configuration is disabled.
Markings always propagate. If data in a new Project has a Marking, that Marking will still propagate to all downstream datasets, regardless of the "propagate view requirements" setting.
If you have Projects with the Propagate View Requirements option enabled as in the image above, then view requirements have propagated for datasets in these Projects. This means that users accessing a downstream version of the data in a separate Project would additionally require access to the upstream Project(s) with this config enabled.
We highly recommend disabling view requirement propagation in favor of using Markings.
Before disabling view requirement propagation and introducing Markings to your pipeline, it is worth considering the original purpose of enabling propagating view requirements:
In the example below, our goal is to disable "propagate view requirements" on the Datasource Project. After following the steps above, we learned that "propagate view requirements" was enabled on the Project to protect the raw_dataset_1 dataset, which contains sensitive data.
In the old state, viewing the contents of Dataset A would require at least “viewer” access on both the Datasource Project and the Downstream Project. Subsequently, severing has been used to remove the view requirement propagation on Dataset B.
In the new state, viewing the contents of Dataset A requires at least “viewer” access on the Downstream Project and access to the Marking. Note that with "propagate view requirements" disabled, accessing Datasets C & D requires only “viewer” access on the Downstream Project.
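The difference between the old and new states can be sketched as a small plain-Python model. This is an illustration under simplifying assumptions, not platform code: the function `access_requirements`, its parameters, and the Marking name `sensitive` are hypothetical.

```python
def access_requirements(dataset_project, upstream_projects, inherited_markings):
    """List what a user needs to view a dataset in this simplified model.

    upstream_projects maps a Project name to its "propagate view
    requirements" setting (True means enabled).
    """
    requirements = [f"viewer on {dataset_project}"]
    for project, propagates in upstream_projects.items():
        if propagates:
            requirements.append(f"viewer on {project}")
    # Markings propagate regardless of the Project-level setting.
    for marking in sorted(inherited_markings):
        requirements.append(f"member of Marking {marking}")
    return requirements


# Old state: the setting is enabled on the Datasource Project, no Marking.
old_state = access_requirements(
    "Downstream Project", {"Datasource Project": True}, set()
)
# New state: the setting is disabled, and a Marking protects the data instead.
new_state = access_requirements(
    "Downstream Project", {"Datasource Project": False}, {"sensitive"}
)
print(old_state)
print(new_state)
```

In the new state, the upstream Project no longer appears in the requirements for Dataset A; the Marking takes over the job of protecting the sensitive data.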
This change allows disabling of "propagate view requirements" by using a security Marking. In the proposed solution below, we’ll apply a Marking on raw_dataset_1, which then immediately propagates to all downstream datasets that have any non-severed transaction of raw_dataset_1 as an input. We assume that severing was already in place and is only being replaced by Marking removal. If this is not true in your situation, see Scenario 3, where we discuss the implications of applying a Marking to a dataset in detail.
The following steps are recommended for introducing this change so that users don’t lose access to Dataset B when the Marking is added after disabling "propagate view requirements" in the Datasource Project:
raw_dataset_1.
This potentially complex scenario involves introducing a new Marking early in an existing pipeline without accidentally locking out users later in the pipeline.
It is critical to note that the Marking introduced on Dataset A will immediately propagate to all resources that are downstream of that dataset along the transaction lineage. Users will require the Marking to access anything derived from the marked dataset.
To understand this better, let’s extend the example above with a pipeline as follows: Dataset A → Dataset B → Downstream Datasets:
Our goal in this example is to ensure that Downstream Datasets never inherit the Marking from Dataset A.
The first thing we need to understand is that marking Dataset A (or a folder enclosing Dataset A) effectively marks all transactions in the entire history of Dataset A. As a consequence, Dataset B and Downstream Datasets inherit the Marking immediately.
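To make the immediate propagation concrete, the sketch below walks a lineage graph and collects every dataset that would inherit a Marking applied to Dataset A. It is a simplified plain-Python model, not platform code: the function `datasets_inheriting` and the dataset names in `lineage` are hypothetical, and the model ignores Marking removal and severing.

```python
from collections import deque


def datasets_inheriting(lineage, marked_dataset):
    """Return the marked dataset plus everything downstream of it."""
    seen = {marked_dataset}
    queue = deque([marked_dataset])
    while queue:
        current = queue.popleft()
        # Visit each direct dependent once; the Marking follows the lineage.
        for child in lineage.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Dataset A -> Dataset B -> Downstream Datasets (names are illustrative).
lineage = {
    "Dataset A": ["Dataset B"],
    "Dataset B": ["Downstream Dataset 1", "Downstream Dataset 2"],
}
print(sorted(datasets_inheriting(lineage, "Dataset A")))
```

Every dataset reachable from Dataset A appears in the result, which is why users further down the pipeline can be locked out as soon as the Marking is applied.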
If we perform the following steps:
... then the latest snapshot transaction on Dataset B will be Marking-free, but all older transactions on Dataset B will still be marked.
Note that while marking a dataset will mark all of its transactions, removing a Marking in transforms will only remove the Marking for new output transactions. Markings will not be removed from existing transactions. This behavior is not symmetrical.
This means that any data in the Downstream Datasets derived from an older transaction on Dataset B, such as Downstream datasets that are built incrementally, will still inherit the Marking.
To ensure each incremental Downstream dataset is unmarked, everything between the unmarking transform on Dataset B and the incremental Downstream dataset must be re-snapshotted after the unmarking transform is applied on Dataset B. This will ensure that each Downstream dataset depends only on the latest (unmarked) transaction on Dataset B, rather than earlier (marked) transactions. A simple rebuild is not sufficient.
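The reason a simple rebuild is not sufficient can be sketched with a toy transaction model. This is plain Python under stated assumptions, not platform code: the `Transaction` class, the `effective_markings` function, and the Marking name `sensitive` are hypothetical, and the model assumes a dataset's current view starts at its latest SNAPSHOT transaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    kind: str            # "SNAPSHOT" or "APPEND"
    markings: frozenset  # Markings inherited by this transaction


def effective_markings(transactions):
    """Union of Markings over the current view of a dataset.

    In this model the current view starts at the latest SNAPSHOT
    transaction and includes every APPEND after it.
    """
    start = 0
    for index, txn in enumerate(transactions):
        if txn.kind == "SNAPSHOT":
            start = index
    result = set()
    for txn in transactions[start:]:
        result |= txn.markings
    return result


marked = frozenset({"sensitive"})  # hypothetical Marking name
unmarked = frozenset()

# Incremental Downstream dataset built before the unmarking transform existed.
history = [Transaction("SNAPSHOT", marked), Transaction("APPEND", marked)]

# A simple incremental rebuild appends unmarked data to the marked view.
after_rebuild = history + [Transaction("APPEND", unmarked)]

# A re-snapshot replaces the view with a single unmarked transaction.
after_resnapshot = history + [Transaction("SNAPSHOT", unmarked)]

print(effective_markings(after_rebuild))
print(effective_markings(after_resnapshot))
```

In this model the incremental rebuild still reports the Marking, because the older marked transactions remain part of the view, while the re-snapshot reports none.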
If there are too many Downstream datasets to re-snapshot manually, we suggest the following steps:
Consider the performance effects of doing this swap, as it could trigger a SNAPSHOT build of Dataset B.
If you are making these changes in an important pipeline or are unsure about any of these steps, contact your Palantir representative for assistance.
# column-based
df.select("salary", "title", "department")

# row-based
states_to_keep = ["OH", "CA", "DE"]
df.filter(df.state.isin(states_to_keep))
# column-based
df.drop("firstname", "lastname")

# row-based
states_to_drop = ["FL", "TX", "IL"]
df.filter(~df.state.isin(states_to_drop))
This depends on whether the "require re-approval" or the "don't require re-approval" approval mode is set on the repository that contains your Marking removal transform. Learn more about approval modes.
Ideally, every time the logic of a Marking removal transform changes, it should undergo a security approval. To balance a good security posture against excessive friction in the approval process, we recommend that, if you can, you move all transforms with Marking removal logic (such as obfuscating data, removing columns, and so on) to a separate repository and set that repository to “Require re-approvals”.
No, these are input properties and cannot be added to outputs. If your goal is to remove a Marking from a certain output, you need to identify all inputs that carry the Marking and add a stop_propagating statement to each of them. For more details, refer to the input transform property documentation.
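The need to cover every marked input can be illustrated with a small plain-Python model. It does not use the platform's transform API (see the input transform property documentation for the actual stop_propagating syntax): the function `output_markings` and the Marking name `sensitive` are hypothetical.

```python
def output_markings(inputs):
    """Union of the Markings each input still propagates to the output.

    inputs is a list of (markings_on_input, markings_stopped_on_input) pairs.
    """
    result = set()
    for markings, stopped in inputs:
        # An input contributes every Marking it carries that was not stopped.
        result |= set(markings) - set(stopped)
    return result


# Two inputs carry the hypothetical Marking "sensitive".
only_one_stopped = [({"sensitive"}, {"sensitive"}), ({"sensitive"}, set())]
both_stopped = [({"sensitive"}, {"sensitive"}), ({"sensitive"}, {"sensitive"})]

print(output_markings(only_one_stopped))
print(output_markings(both_stopped))
```

In this model the output stays marked until propagation is stopped on both inputs, because each input propagates its Markings independently.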
The following languages support Marking removal:
Declassification should be carried out carefully and not scattered around Projects and repositories. Marking removal features in the platform provide granular control over permission propagation changes and ensure that such changes are appropriately reviewed.
Contact your Palantir representative to disallow adding severing on datasets that have not had severing enabled before.