Markings and Organizations restrict access to resources based on a user's eligibility.
When restricted content is removed or obfuscated while deriving a dependent resource, users may wish to remove the Marking and/or Organization on that derived resource. This process of removing inherited Markings and Organizations can be done by using the stop_propagating and stop_requiring input transform properties.
stop_propagating is used to remove inherited Markings (e.g. PII).
stop_requiring is used to remove inherited Organizations (e.g. Palantir).
The stop_propagating and stop_requiring key phrases only apply to Organizations and Markings and NOT Roles.
Specifying a branch that is not protected (for example, on_branches=[..., "not-protected-branch"]) will cause the build to fail.
Users need Remove marking permissions to remove Markings and Expand access permissions to remove Organizations.
The gray dataset boxes in Project C below highlight the fact that Project References must be added for all the inputs in the destination Project.
Create a new branch off of a protected branch (for example, main).
Add one or both of the stop_propagating and stop_requiring properties to the input transform.
Create a pull request to merge this code into a protected branch.
A user with either Remove marking permissions for Markings or Expand access permissions for Organizations can approve or reject the proposed changes. If multiple reviewers are added, rejection by any reviewer will result in rejection of the entire pull request.
If approved, the code editor merges the PR and builds the output dataset. After the output dataset is built, it will no longer have the propagated Markings and/or Organizations.
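The review rule in the steps above can be sketched as a small decision function. This is a plain-Python model rather than Foundry code: the function name, the status strings, and the assumption that every added reviewer must approve before the pull request counts as approved are all hypothetical. Only the "any rejection rejects the whole pull request" rule comes from the text.

```python
def pull_request_outcome(reviews):
    """Summarize reviewer decisions for one pull request.

    reviews maps a reviewer name to "approved", "rejected", or "pending"
    (hypothetical status values chosen for this sketch).
    """
    decisions = list(reviews.values())
    # Rejection by any reviewer rejects the entire pull request.
    if "rejected" in decisions:
        return "rejected"
    # Assumption: approval needs every added reviewer to approve.
    if decisions and all(d == "approved" for d in decisions):
        return "approved"
    return "pending"
```

For instance, a pull request with one approval and one rejection is treated as rejected, regardless of the order in which the reviews arrive.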
Internally, Organizations are represented as a slightly different kind of Marking, hence the transforms keyword following stop_requiring is called OrgMarkings.
To remove inherited Markings (e.g. PII), use the stop_propagating keyphrase.
To remove inherited Organizations (e.g. Palantir), use the stop_requiring keyphrase.
Each of these keyphrases must be specified on every input that requires removal of Markings or Organizations. For every removal, you must also specify the protected branches to which the removal should apply. Marking IDs, Organization IDs, and branches should always be specified as quoted strings.
You need to provide at least one upstream Organization, since users only need to satisfy at least one Organization. Approvals will be required for each listed Organization. The detailed workflow below provides an example illustrating this point.
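The requirements above (every removal names the IDs to remove and the protected branches it applies to, and all values are quoted strings) can be sketched as a validation check. This is a plain-Python model, not the Foundry API: `Removal`, `validate_removal`, and the set of protected branch names are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Removal:
    """Hypothetical stand-in for one stop_propagating / stop_requiring entry."""
    ids: List[str]        # Marking IDs or Organization IDs
    branches: List[str]   # branches the removal should apply to


def validate_removal(removal: Removal, protected_branches: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the entry looks valid."""
    problems = []
    if not removal.ids:
        problems.append("at least one Marking or Organization ID is required")
    if not removal.branches:
        problems.append("at least one protected branch is required")
    # IDs and branches should always be given as quoted strings.
    for value in list(removal.ids) + list(removal.branches):
        if not isinstance(value, str):
            problems.append(f"{value!r} must be a quoted string")
    # Removals only apply to protected branches.
    for branch in removal.branches:
        if isinstance(branch, str) and branch not in protected_branches:
            problems.append(f"branch {branch!r} is not protected")
    return problems
```

A check like this would run before a build, so that a removal pointing at an unprotected branch is reported early rather than discovered when the build fails.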
In Python, Marking removal is specified in the input constructor.
    @transform(
        input_1=Input("<input_id>",
            stop_propagating=Markings([markingId1, ...], [branch1, ...]),
            stop_requiring=OrgMarkings([orgMarking1, ...], [branch2, ...])),
        output=Output("<output_id>")
    )
The Markings class takes a list of Marking IDs and a list of protected branches on which to apply the Marking removal. Marking IDs can be found in the Markings list on the Settings page.
The OrgMarkings class takes a list of Organization IDs and a list of protected branches on which to apply the Organization removal. Organization IDs can be found in the Organizations list on the Settings page.
In Java, Marking removal is specified via annotations on the inputs for automatically registered transforms.
Syntax:
    @Compute
    public void myComputation(
            @StopPropagating(markings = {markingId1, ...}, onBranches = {branch1, ...})
            @StopRequiring(orgMarkings = {orgId1, ...}, onBranches = {branch2, ...})
            @Input("<input_id>") FoundryInput input,
            @Output("<output_id>") FoundryOutput output)
The @StopPropagating and @StopRequiring annotations take a set of Marking IDs and a set of protected branches on which to apply the Marking removal.
When only one Marking or branch is specified, you do not need to wrap it in {} (e.g. @StopPropagating(markings = marking1, onBranches = "my-branch")).
For manually registered Java transforms, we use the following syntax to specify unmarkings during registration in the MyPipelineDefiner.java file.

    @Override
    public void define(Pipeline pipeline) {
        HighLevelTransform highLevelManualTransform = HighLevelTransform.builder()
                .computeFunctionInstance(new HighLevelManualFunction())
                .putParameterToInputAlias("myInput", "/path/to/input/dataset")
                .returnedAlias("/path/to/output/dataset")
                .desiredUnmarkings(Set.of(
                        Unmarking.builder()
                                .branch("branch1")
                                .input(alias("/input1"))
                                .output(alias("/output"))
                                .markingId(MarkingId.valueOf("markingId"))
                                .build(),
                        Unmarking.builder()
                                .branch("branch1")
                                .input(alias("/input1"))
                                .output(alias("/output"))
                                .markingId(MarkingId.valueOf("orgId1"))
                                .build()
                ))
                .build();
        pipeline.register(highLevelManualTransform);
    }
In SQL, Marking removal is specified by using SparkSQL hint statements:
    CREATE TABLE <output_id> AS
    SELECT /*+ foundry_stop_propagating(markingId1, ...) foundry_stop_requiring(orgMarkingId1, ...) foundry_on_branches(branch1, ...) */ *
    FROM <input_id>
Marking and Organization removal in SQL can be added to any SELECT statement. For example:

    CREATE TABLE <output_id> AS SELECT * FROM <input_1_id>
    CROSS JOIN (SELECT /*+ foundry_stop_propagating(markingId1) foundry_on_branches("my-branch") */ * FROM <input_2_id>)
To be able to view code and approve pull requests in general, the approver must pass any Organizations and Markings on the Project and the repository itself, and must have a Role that includes the basic Stemma View Repository workflow (by default, included in the Viewer role). Users must also have permissions on each Organization and Marking to set approval modes or approve pull requests removing those Organizations and Markings. Users do not necessarily need to be members of the Organization or Marking.
For a Marking approval, the user approving needs to have the Remove marking role on the Marking.
For an Organization approval, the user approving needs to have the Expand access role on the Organization.
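The role requirements above can be sketched as a lookup from the kind of restriction being removed to the role an approver must hold on it. This is an illustrative model only: `REQUIRED_ROLE`, `can_approve`, `missing_approvals`, and the shape of the role data are hypothetical and not part of any Foundry API.

```python
# Role needed to approve a removal, keyed by the kind of restriction removed.
REQUIRED_ROLE = {
    "marking": "Remove marking",
    "organization": "Expand access",
}


def can_approve(user_roles, kind, restriction_id):
    """Return True if the user holds the needed role on this restriction.

    user_roles maps a restriction ID to the set of role names the user
    holds on it (a hypothetical shape chosen for this sketch).
    """
    needed = REQUIRED_ROLE[kind]
    return needed in user_roles.get(restriction_id, set())


def missing_approvals(removals, approvers):
    """List the (kind, id) removals that no listed approver may approve.

    removals: iterable of (kind, restriction_id) pairs.
    approvers: mapping of user name -> user_roles dict as described above.
    """
    return [
        (kind, rid)
        for kind, rid in removals
        if not any(can_approve(roles, kind, rid) for roles in approvers.values())
    ]
```

Note that the check looks only at roles on the Marking or Organization, which matches the statement that approvers do not need to be members of it.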
For each repository and each Organization and Marking, a data governance user can define which mode should be used to trigger a new approval:
Above is a transform in a repository with one Marking, PHI, that requires re-approval.
Given the above setup, the following will happen:
[...] Remove role on the PHI Marking.
Above is a transform in a repository with one Organization, PALANTIR, that does NOT require re-approval.
Given the above setup, the following will happen:
[...] PALANTIR Organization, they will be required to get approval from someone with the Expand access role on the PALANTIR Organization.
Transform 1: Above is a transform with one Marking, PII, and one Organization, PALANTIR. The PII Marking requires re-approval and the PALANTIR Organization does not require re-approval.
Transform 2: Above is a transform with one Marking, USA, that does NOT require re-approval.
Given the above setup, here's what will happen:
[...] Remove marking role on the PII Marking AND a user with Expand access role on the PALANTIR Organization.
[...] Remove marking role on the PII Marking.
[...] Remove marking role on the USA Marking AND a user with Remove marking role on the PII Marking.
[...] Remove marking role on the PII Marking.
[...] Remove marking role on the PII Marking.
In this example scenario, a code editor wants to use two datasets from a sensitive upstream project, remove certain information, and allow a wider audience to access the resulting dataset. The two datasets, which have two Markings each, have been added as references in the downstream Project. The code editor wants three of the four Markings to stop propagating, such that they do not appear on the output dataset. In addition, the upstream Project is restricted to users from OrgA or OrgB, and the intent is to distribute the downstream data to users from OrgC.
Before: The output dataset on the code editor's branch has inherited all four Markings and is still restricted to users from OrgA or OrgB.
After: The output dataset, once merged into a protected branch (for example, main), now has only one inherited Marking and does not require users to be members of OrgA or OrgB.
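The Before and After states above can be modelled as a set difference that only takes effect on the protected branch named in the removal. This is a minimal sketch under stated assumptions: `effective_markings` is a hypothetical helper, and the Marking and branch names are placeholders (in particular, "kiwi" is an invented name for the fourth Marking).

```python
def effective_markings(inherited, stop_propagating, on_branches, branch):
    """Markings left on the output when it is built on `branch`.

    The removal only applies on the branches listed in on_branches;
    on every other branch all inherited Markings are kept.
    """
    if branch not in on_branches:
        return set(inherited)
    return set(inherited) - set(stop_propagating)


# Placeholder names; "kiwi" is a hypothetical fourth Marking.
inherited = {"lemon", "apple", "cherry", "kiwi"}
to_remove = {"lemon", "apple", "cherry"}

# Before: built on the code editor's own branch, nothing is removed.
before = effective_markings(inherited, to_remove, ["main"], "feature/clean-data")
# After: built on the protected branch, three of the four Markings are gone.
after = effective_markings(inherited, to_remove, ["main"], "main")
```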
[...] feature/clean-data branch of the repository of the downstream Project.
Since all the Marking changes are being requested for the main branch, no approvals are needed to work on feature/clean-data. In other words, when the output dataset is built on the feature/clean-data branch, all the upstream Markings will still be inherited.
The code editor creates a pull request to the main branch and requests approvals from the data governance users, who manage data restricted by the lemon, apple, and cherry Markings. The code editor also requests an approval from an Expand access Organization administrator from OrgA, who can approve when OrgA data needs to be shared with other Organizations.
Expanding access by removing an inherited Organization has the effect of removing all inherited Organizations, but approval is only required from users with the appropriate permissions on the Organizations listed in the transform. If you want to require approval from all of the Organizations, then you need to list all of the IDs in the `stop_requiring` component. In this example, OrgA is primarily responsible for the data in the upstream Project, so the editor chose OrgA for the cross-Organization approval process. As such, the editor only needs approval from an OrgA admin to remove inherited Organizations. Depending on which Organization approvals the editor wants to request, the editor can choose to `stop_requiring` either: (1) OrgA (with approval by an OrgA admin), (2) OrgB (with approval by an OrgB admin), or (3) both OrgA and OrgB (with approval from admins from both Organizations).
The end result is that the output dataset will not inherit any Organizations from the inputs and will only respect the Organizations from the Project in which it is located.
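The Organization behavior described above (listing any inherited Organization drops all of them, while approval is only requested from the ones listed) can be sketched as follows. `plan_org_removal` is a hypothetical helper, and the branch where nothing relevant is listed is an assumption of this sketch, not documented behavior.

```python
def plan_org_removal(inherited_orgs, stop_requiring):
    """Model the outcome of a stop_requiring entry on one input.

    Returns (remaining_inherited_orgs, orgs_whose_approval_is_needed).
    """
    listed = [org for org in stop_requiring if org in inherited_orgs]
    if not listed:
        # Assumption: with no inherited Organization listed, nothing changes.
        return set(inherited_orgs), []
    # Listing at least one inherited Organization drops all of them,
    # but approval is only requested from the ones that were listed.
    return set(), listed
```

With inherited Organizations OrgA and OrgB, listing only OrgA yields no remaining inherited Organizations and a single approval request to OrgA. Listing both yields the same output restrictions but requires approval from both.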
The data governance users and Organization administrators receive Foundry notifications that their approval has been requested.
Assuming the PR is approved, the code editor merges it and builds the output dataset as shown in the After image above.
The following week, another code editor makes a change to a different code file and opens a PR to merge to main.
If none of the Markings require re-approval, the PR can be approved without going through a security review. If any of the Markings do require re-approval, this new PR will require a security review by the data governance user or Organization administrator who manages that Marking.
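The re-approval rule above reduces to an "any" check over the restrictions already removed in the repository. The sketch below is a simplified model: the function names and the dictionary of per-restriction settings are hypothetical, and the example settings are invented for illustration.

```python
def restrictions_needing_review(removed_ids, requires_reapproval):
    """Return the removed Markings/Organizations that demand re-approval.

    removed_ids: IDs removed by transforms in the repository.
    requires_reapproval: mapping of ID -> bool, as configured by a data
    governance user (hypothetical representation; every ID must be present).
    """
    return [rid for rid in removed_ids if requires_reapproval[rid]]


def needs_security_review(removed_ids, requires_reapproval):
    """True if a new pull request must go through a security review."""
    return bool(restrictions_needing_review(removed_ids, requires_reapproval))
```

The list returned by `restrictions_needing_review` also indicates whose approval is needed: the users who manage each restriction it contains.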