Audit logs are the primary way for an auditor to understand what actions have been taken by Foundry users.
Audit logs in Foundry contain enough information to infer:
In some cases, audit logs will contain contextual information about users, including Personally Identifiable Information (PII) such as names and email addresses. As such, audit log contents should be considered sensitive and viewed only by persons with the necessary security qualifications.
Audit logs are typically consumed into a purpose-built system for security monitoring (a "security information and event management", or SIEM solution) owned by the customer.
This guide will explain the process of extracting and using audit logs in two sections:
Customers are strongly encouraged to capture and monitor their own audit logs via the mechanisms presented below. See Monitoring Security Audit Logs for additional guidance.
Audit logs are delivered for downstream consumption via several mechanisms, depending on a customer's security infrastructure and SIEM requirements. Batches of audit logs produced by Foundry services are compiled, compressed, and moved to a log bucket within 24 hours (often S3, although this is environment-dependent). From here, Foundry can deliver logs directly to the customer for consumption via Audit Export To Foundry.
Audit logs can be exported, per-organization, directly into a Foundry dataset. As part of the configuration setup, an organization admin chooses where within the Foundry file system this audit log dataset will be generated.
Once log data has landed in a dataset, a customer may choose to export the audit data to an external SIEM via Foundry's Data Connection application.
To export audit logs, a user will need the `audit-export:orchestrate-v2` operation on the target organization(s). This can be granted via the Organization administrator role in Control Panel, under the Organization permissions tab. See Organization Permissions for more details.
To set up Audit Export to Foundry:
Note that for larger stacks, builds in the first several hours may produce empty append transactions. This is expected behavior as the pipeline processes a backlog of audit logs.
Due to the sensitivity of audit logs, it is highly recommended that the created dataset is restricted on a need-to-know basis and is only accessible by persons with necessary qualifications. Use markings to restrict your audit log dataset and to specify the set of platform administrators who can view potentially sensitive usage details like identifying information or search queries.
To disable an export, move the audit log dataset to the trash or to another Project.
Moving an audit log dataset will stop any further builds of that dataset. There is no way to restart these builds, even if the dataset is subsequently restored from the trash or moved back to the original Project.
On build, audit log datasets follow a specific set of conditions to append new logs as they become available (subject to change):
Audit log datasets can contain very high volumes of data, so we recommend filtering this dataset down using the `time` column before performing any aggregations or visualizations. For any filtering, we recommend using Pipeline Builder or Transforms, as audit datasets may be too large to analyze effectively in Contour without filtering them first.
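The time-window filter recommended above can be sketched in plain Python. This is an illustration of the concept only (in Foundry you would express the same filter in Pipeline Builder or Transforms), and the log rows here are fabricated:

```python
from datetime import datetime, timezone

# Fabricated audit rows; only the "time" format matches the schema below.
rows = [
    {"time": "2023-03-13T23:20:24.180Z", "name": "PUT_FILE"},
    {"time": "2023-06-01T10:00:00.000Z", "name": "GET_FILE"},
]

def parse_time(value):
    """Parse an RFC3339Nano-style UTC timestamp into a datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)

# Restrict to a window before doing any aggregation.
start = datetime(2023, 6, 1, tzinfo=timezone.utc)
recent = [r for r in rows if parse_time(r["time"]) >= start]
print([r["name"] for r in recent])  # → ['GET_FILE']
```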
All logs that Palantir products produce are structured logs. This means that they have a specific schema that they follow, which can be relied on by downstream systems.
Palantir audit logs are currently delivered in the `audit.2` schema, also commonly referred to as "Audit V2". An updated schema, `audit.3` or "Audit V3", is in development but is not yet generally available.
Within both the `audit.2` and `audit.3` schemas, audit logs may vary depending on the service that produces them. This is because each service reasons about a different domain, and thus has different concerns to describe. This variance is more noticeable in `audit.2`, as explained below.
Service-specific information is primarily captured within the `request_params` and `result_params` fields. The contents of these fields will change shape depending on both the service doing the logging and the event being logged.
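To make this variance concrete, here are two fabricated events in an `audit.2`-like shape from two hypothetical services. Because `request_params` has no common shape across services, defensive access with `.get()` avoids assuming any particular key exists:

```python
# Fabricated events; the field names "name" and "request_params" come from
# the audit.2 schema, but the parameter keys are invented for illustration.
events = [
    {"name": "PUT_FILE", "request_params": {"path": "/data/report.csv"}},
    {"name": "RUN_QUERY", "request_params": {"query": "SELECT 1", "limit": 10}},
]

# Keys differ per service, so look them up defensively rather than indexing.
summaries = [
    (e["name"], e["request_params"].get("path"), e["request_params"].get("query"))
    for e in events
]
print(summaries)
```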
Audit logs can be thought of as a distilled record of all actions taken by users in the platform. This is often a compromise between verbosity and precision, where overly verbose logs may contain more information but be more difficult to reason about.
Palantir logs include a concept called audit categories to make logs easier to reason about with little service-specific knowledge.
With audit categories, audit logs are described as a union of auditable events. Audit categories are based on a set of core concepts, such as `data` versus `metaData` versus `logic`, and divided into categories that describe actions on those concepts, such as `dataLoad` (loading data from the system), `metaDataCreate` (creating a new piece of metadata that describes some data), and `logicDelete` (deleting some logic within the system, where the logic describes a transformation between two pieces of data).
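The concept-plus-action structure of category names can be sketched with a small parser. The category names below come from this document; the helper itself, and the assumption that every category ends in one of a fixed set of action suffixes, are illustrative only and not part of any Palantir API:

```python
# Assumed action suffixes; "Update" is an assumption beyond the examples
# named in the text (dataLoad, metaDataCreate, logicDelete).
KNOWN_ACTIONS = ("Load", "Create", "Update", "Delete")

def split_category(category):
    """Split a category name into its (concept, action) parts."""
    for action in KNOWN_ACTIONS:
        if category.endswith(action):
            return category[: -len(action)], action
    raise ValueError(f"Unrecognized audit category: {category}")

print(split_category("dataLoad"))        # → ('data', 'Load')
print(split_category("metaDataCreate"))  # → ('metaData', 'Create')
```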
Audit categories have also gone through a versioning change, from a looser form within `audit.2` logs to a stricter and richer form within `audit.3` logs. See below for more detail.
Refer to Audit log categories for a detailed list of available `audit.2` and `audit.3` categories.
Audit logs are written to a single log archive per environment. When audit logs are processed via the delivery pipeline, the user ID fields (`uid` and `otherUids` in the schema below) are extracted, and the users are mapped to their corresponding organizations.
An Audit Export orchestrated for a given organization is limited to audit logs attributed to that organization. Actions taken solely by service (non-human) users will not typically be attributed to any organization, as these users are not organization members. The exception is service users for Third Party Applications that use Client Credentials Grants and are used only by the registering organization; their actions will generate audit logs attributed to that organization.
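The attribution rule above can be sketched as follows. This is not Palantir code: the user-to-organization mapping is hypothetical sample data, and only the `uid`/`otherUids` field names come from the schema:

```python
# Hypothetical membership data: service users have no organization.
USER_ORGS = {
    "alice": "org-a",
    "bob": "org-b",
    # "svc-etl" is a service (non-human) user with no membership.
}

def attribute_log(log):
    """Return the organization a log is attributed to, if any."""
    for user_id in [log.get("uid"), *log.get("otherUids", [])]:
        if user_id in USER_ORGS:
            return USER_ORGS[user_id]
    return None  # service-only actions remain unattributed

print(attribute_log({"uid": "alice"}))    # → org-a
print(attribute_log({"uid": "svc-etl"}))  # → None
```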
`audit.2` logs have no inter-service guarantees about the shape of the request and response parameters. As such, reasoning about audit logs must typically be performed on a service-by-service basis.
`audit.2` logs may present an audit category within them that can be useful for narrowing a search. However, this category does not contain further information or prescribe the rest of the contents of the audit log. Additionally, `audit.2` logs are not guaranteed to contain an audit category. If present, categories will be included in either the `_category` or `_categories` field within `request_params`.
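A consumer reading categories from `audit.2` logs therefore has to tolerate all three cases: a `_category` field, a `_categories` field, or no category at all. A minimal sketch, using fabricated log records:

```python
def extract_categories(log):
    """Best-effort read of audit categories from an audit.2 log."""
    params = log.get("request_params", {})
    if "_categories" in params:
        return list(params["_categories"])
    if "_category" in params:
        return [params["_category"]]
    return []  # audit.2 logs are not guaranteed to carry a category

print(extract_categories({"request_params": {"_category": "dataLoad"}}))
print(extract_categories({"request_params": {}}))  # → []
```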
The schema of `audit.2` log export datasets is provided below.
| Field | Type | Description |
|---|---|---|
| `filename` | `.log.gz` | Name of the compressed file from the log archive |
| `type` | `string` | Specifies the audit schema version: `"audit.2"` |
| `time` | `datetime` | RFC3339Nano UTC datetime string, e.g. `2023-03-13T23:20:24.180Z` |
| `uid` | `optional<UserId>` | User ID (if available); this is the most downstream caller |
| `sid` | `optional<SessionId>` | Session ID (if available) |
| `token_id` | `optional<TokenId>` | API token ID (if available) |
| `ip` | `string` | Best-effort identifier of the originating IP address |
| `trace_id` | `optional<TraceId>` | Zipkin trace ID (if available) |
| `name` | `string` | Name of the audit event, such as `PUT_FILE` |
| `result` | `AuditResult` | The result of the event (success, failure, etc.) |
| `request_params` | `map<string, any>` | The parameters known at method invocation time |
| `result_params` | `map<string, any>` | Information derived within a method, commonly parts of the return value |
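As a sketch of what one exported row might look like, here is a fabricated `audit.2` log line parsed from JSON. The field names match the schema table above; every value, including the `result` string, is invented for the example:

```python
import json

# Fabricated audit.2 log line (values are illustrative, not real output).
raw_line = json.dumps({
    "type": "audit.2",
    "time": "2023-03-13T23:20:24.180Z",
    "uid": "alice",
    "ip": "203.0.113.7",
    "name": "PUT_FILE",
    "result": "SUCCESS",
    "request_params": {"_category": "dataLoad", "path": "/demo/file.csv"},
    "result_params": {},
})

log = json.loads(raw_line)
print(log["type"], log["name"])  # → audit.2 PUT_FILE
```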
Audit V3 is under development and is not yet generally available.
`audit.3` logs establish stricter usage of audit categories to reduce the need to understand the particular service when reasoning about log contents. `audit.3` logs are produced with the following guarantees in mind:

- Audit categories fully describe a log's contents; for example, `dataLoad` describes the precise resources that are loaded.
- Fields appear in consistent locations within the `audit.3` schema. For example, all named resources are present at the top level, as well as within the request and response parameters.

These guarantees mean that for any particular log it is possible to tell (1) what auditable event created it and (2) exactly what fields it contains. These guarantees are service-agnostic.
The `audit.3` schema is provided below. This information is non-exhaustive and subject to change:
| Field | Type | Description |
|---|---|---|
| `environment` | `optional<string>` | The environment that produced this log. |
| `stack` | `optional<string>` | The stack on which this log was generated. |
| `service` | `optional<string>` | The service that produced this log. |
| `product` | `string` | The product that produced this log. |
| `productVersion` | `string` | The version of the product that produced this log. |
| `host` | `string` | The host that produced this log. |
| `producerType` | `AuditProducer` | How this audit log was produced; for example, from a backend (`SERVER`) or frontend (`CLIENT`). |
| `time` | `datetime` | RFC3339Nano UTC datetime string, for example `2023-03-13T23:20:24.180Z`. |
| `name` | `string` | The name of the audit event, such as `PUT_FILE`. |
| `result` | `AuditResult` | Indicates whether the request was successful or the type of failure; for example, `ERROR` or `UNAUTHORIZED`. |
| `categories` | `set<string>` | All audit categories produced by this audit event. |
| `entities` | `list<any>` | All entities (for example, resources) present in the request and response params of this log. |
| `users` | `set<ContextualizedUser>` | All users present in this audit log, contextualized. |
| `requestFields` | `map<string, any>` | The parameters known at method invocation time. Entries in the request and response fields will be dependent on the `categories` field defined above. |
| `resultFields` | `map<string, any>` | Information derived within a method, commonly parts of the return value. |
| `origins` | `list<string>` | All addresses attached to the request. This value can be spoofed. |
| `sourceOrigin` | `optional<string>` | The origin of the network request, with the value verified through the TCP stack. |
| `origin` | `optional<string>` | The best-effort identifier of the originating machine; for example, an IP address, a Kubernetes node identifier, or similar. This value can be spoofed. |
| `orgId` | `optional<string>` | The organization to which the `uid` belongs, if available. |
| `userAgent` | `optional<string>` | The user agent of the user that originated this log. |
| `uid` | `optional<UserId>` | The user ID, if available. This is the most downstream caller. |
| `sid` | `optional<SessionId>` | The session ID, if available. |
| `eventId` | `uuid` | The unique identifier for an auditable event. This can be used to group log lines that are part of the same event. For example, the same `eventId` will be logged in lines emitted at the start and end of a large binary response streamed to the consumer. |
| `logEntryId` | `uuid` | The unique identifier for this audit log line, not repeated across any other log line in the system. Note that some log lines may be duplicated during ingestion into Foundry, so there may be several rows with the same `logEntryId`. Rows with the same `logEntryId` are duplicates and can be ignored. |
| `sequenceId` | `uuid` | A best-effort ordering field for events that share the same `eventId`. |
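The `logEntryId` and `eventId` fields suggest two common post-processing steps: dropping duplicate rows introduced during ingestion, then grouping the remaining rows by event. A sketch over fabricated rows (only the field names come from the schema above):

```python
# Fabricated audit.3 rows, including one ingest-time duplicate.
rows = [
    {"logEntryId": "a1", "eventId": "e1", "sequenceId": "s1"},
    {"logEntryId": "a1", "eventId": "e1", "sequenceId": "s1"},  # duplicate row
    {"logEntryId": "a2", "eventId": "e1", "sequenceId": "s2"},
    {"logEntryId": "a3", "eventId": "e2", "sequenceId": "s1"},
]

# Rows sharing a logEntryId are duplicates and can be ignored.
seen = set()
deduped = []
for r in rows:
    if r["logEntryId"] not in seen:
        seen.add(r["logEntryId"])
        deduped.append(r)

# Rows sharing an eventId belong to the same auditable event.
events = {}
for r in deduped:
    events.setdefault(r["eventId"], []).append(r)

print(len(deduped), sorted(events))  # → 3 ['e1', 'e2']
```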