To connect to systems within your network that cannot accept inbound network traffic from Foundry, you can use an intermediary agent either as an agent worker or agent proxy. This page describes the configuration options available for the agent worker runtime and assumes you are already familiar with data connection agents.
You should only use an agent worker runtime if your target system cannot accept inbound network traffic from Foundry, and the connector does not support the agent proxy runtime.
You must set up an agent to use the agent worker runtime. When configuring a source, you can select the Agent worker option and choose at least one available agent. Capabilities run over that source connection will be executed as jobs on the agent worker host.
When using an agent worker runtime, capabilities are executed by running Java code on the agent host directly. These Java processes may pull data from your systems and push data up to Foundry (as with batch syncs), or pull data from Foundry and push to your systems (as with exports).
This execution model comes with some downsides including:
Agent memory is one of the key factors determining the performance of capabilities executed using an agent worker runtime.
The primary settings available for agent memory are:
Actual memory usage observed on the host will vary based on the workload currently being executed by the agent worker, including other processes running on the same host.
When observing and monitoring memory usage for an agent used as an agent worker, there are two primary metrics:
For information on monitoring memory usage on agents, review agent metrics and health monitoring.
When using an agent worker runtime, multiple agents may be assigned to a single source connection. An agent is assigned jobs to execute specific capabilities configured on assigned source connections. Jobs are executed on one of the available agents at the time when the job is started.
Jobs are assigned to the agent with the largest available bandwidth. The bandwidth is calculated as:
(Maximum concurrent syncs) - (currently running batch syncs) = bandwidth
Maximum concurrent syncs defaults to 16 and is configurable under Agent settings. The Maximum concurrent syncs quota is enforced across all capabilities and all assigned sources: any run of any capability on any source, including legacy data connection tasks, consumes one unit of the quota. If the bandwidth is zero on all available agents, or if the assigned agent reports positive bandwidth but already has more than the maximum number of concurrent syncs running, jobs will be queued.
Only batch syncs are considered when calculating bandwidth. This means that other capabilities running on the agent will be ignored for the purposes of allocating additional jobs. If your primary agent workloads are streaming syncs, change data capture syncs, exports, or other capabilities, you may see unexpected behavior when allocating jobs in a multi-agent setup.
There are no guarantees that jobs will be distributed evenly across multiple available agents with the same bandwidth value.
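As a sketch, the assignment rule above can be modeled in a few lines. This is an illustrative model only; names such as `Agent` and `pick_agent` are hypothetical and not part of the Data Connection implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Agent:
    name: str
    max_concurrent_syncs: int = 16  # default quota, configurable under Agent settings
    running_batch_syncs: int = 0    # only batch syncs count toward bandwidth

    @property
    def bandwidth(self) -> int:
        # (Maximum concurrent syncs) - (currently running batch syncs)
        return self.max_concurrent_syncs - self.running_batch_syncs

def pick_agent(agents: List[Agent]) -> Optional[Agent]:
    """Return the agent with the largest bandwidth, or None if every
    agent is saturated (in which case the job would be queued)."""
    best = max(agents, key=lambda a: a.bandwidth)
    return best if best.bandwidth > 0 else None

agents = [Agent("agent-a", running_batch_syncs=16),
          Agent("agent-b", running_batch_syncs=5)]
chosen = pick_agent(agents)
print(chosen.name if chosen else "queued")  # agent-b (bandwidth 11 vs. 0)
```

Note that `max` here arbitrarily prefers the first agent it sees on ties, which mirrors the lack of any even-distribution guarantee described above.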
In general, we do not recommend using multiple agents as a way to load balance a larger workload than could be successfully run on a single agent. The primary intended use of multiple agents is to allow for agents being taken offline for maintenance. For optimal performance and reliability, we recommend that each agent in a multi-agent setup should be able to handle the full set of capabilities configured on the assigned source connection(s).
Agent worker runtimes support two options to specify how data from batch syncs should be uploaded to the Palantir platform:
In data proxy mode, data is uploaded through the data proxy service using the public Foundry API. This is the same API gateway used when calling the Foundry API to read and write datasets.
Agents configured to use data proxy mode will contain the following in the agent configuration YAML:
```yaml
destinations:
  foundry:
    uploadStrategy:
      type: data-proxy
```
Direct mode is not available on new agents or on enrollments set up after June 2024. Data proxy mode is the default and only option supported for new agents. Agents previously configured to use direct mode will continue to be supported as long as the public IPs of the host where the agent is installed do not change.
In direct mode, data is uploaded directly to the underlying storage buckets in the Foundry data catalog. While providing performance improvements, this is only possible with custom network configuration by Palantir support, and is not available on our latest cloud infrastructure.
Agents configured to use direct mode will contain the following in the agent configuration YAML:
```yaml
destinations:
  foundry:
    uploadStrategy:
      type: direct
```
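As a quick illustration of the two strategies, the configured upload mode can be read out of the parsed configuration. The helper below is a hypothetical sketch, not part of any Palantir tooling; it operates on the configuration as nested dictionaries mirroring the YAML above.

```python
# Hypothetical helper: inspect the uploadStrategy type in a parsed
# agent configuration (nested dicts mirroring the YAML structure).
ALLOWED_STRATEGIES = {"data-proxy", "direct"}

def upload_strategy(config: dict) -> str:
    strategy = (
        config.get("destinations", {})
              .get("foundry", {})
              .get("uploadStrategy", {})
              .get("type")
    )
    if strategy not in ALLOWED_STRATEGIES:
        raise ValueError(f"unknown uploadStrategy type: {strategy!r}")
    return strategy

config = {"destinations": {"foundry": {"uploadStrategy": {"type": "data-proxy"}}}}
print(upload_strategy(config))  # data-proxy
```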
Information on how to add custom JDBC drivers to an agent can be found in the documentation for the JDBC (Custom) connector. Drivers must be signed by Palantir and added directly to the agent to work with the agent worker runtime.
For agent proxy and direct connection runtimes, custom drivers are added directly in the JDBC (Custom) connector user interface as explained in our documentation. These drivers do not need to be signed by Palantir.
One unique aspect of the agent worker runtime is that credentials are never stored in Foundry. Instead, at the time when credentials are input in the Data Connection user interface, they are encrypted with the public key of each agent assigned to the source. The encrypted credentials are stored on each respective agent.
This means that the following caveats and restrictions apply to credential configuration when using the agent worker runtime:
More information on moving agents between directories and hosts is covered in the agent configuration reference documentation, including instructions for retaining encrypted credentials when moving an existing agent directory.
Agents communicate with both Foundry and your internal network. This means that agents need to have the correct certificates in their truststores for these connections to be established.
There are two situations that may require additional certificates to be configured on an agent:
Certificate requirements for agents to communicate with Foundry are covered in the agent configuration documentation and are required whether the agent will be used as an agent proxy or agent worker.
When an agent is used as an agent worker, additional certificates may be required for Java processes running on the agent to successfully communicate with your systems. New certificates may need to be added for each new source connection, and these certificates should be updated if they expire or are rotated.
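Since expired certificates produce the same class of handshake failures as missing ones, it can be useful to check how long a peer certificate remains valid. The sketch below uses Python's standard `ssl` module and the `notAfter` timestamp format returned by `SSLSocket.getpeercert()`; the `days_until_expiry` helper is a hypothetical name, not part of any agent tooling.

```python
import ssl
import time
from typing import Optional

def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a certificate's notAfter timestamp.

    not_after uses the format returned by SSLSocket.getpeercert(),
    for example "Dec 15 08:00:00 2032 GMT".
    """
    expiry = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expiry - current) / 86400.0

# Flag certificates expiring within 30 days so they can be rotated
# on the agent before source capabilities start failing.
if days_until_expiry("Dec 15 08:00:00 2032 GMT") < 30:
    print("certificate needs rotation soon")
```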
When required certificates are missing, you will see errors like the following when attempting to use a source capability such as exploration:
```
Wrapped by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException:
PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException:
unable to find valid certification path to requested target
```
Follow these instructions to add additional certificates for connecting to specific source systems.
If an agent-based source reaches out to a system that is accessible over the Internet, it should be migrated to a direct connection runtime. Follow the steps below to perform this migration.
To create new egress policies, you must have access to the workflow titled Manage network egress configuration in Control Panel, which is granted to the Information Security Officer role.
This section describes situations that may occur during the migration, as well as suggested resolution steps. The migration is reversible.
Could not resolve type id as a subtype of 'com.palantir.magritte.api.Source'
Suggested resolution: This error occurs when a dependency required for the source cannot be found. Ensure that you have configured all certificates, proxies, and drivers required for the source, then retry the migration.
UnknownHostException
Suggested resolution:
Driver class not found
Suggested resolution: Confirm that the correct driver is uploaded to the JDBC source.
PKIX path building failed
Suggested resolution: Ensure that the correct certificates are added to the source.
If the system you are connecting to requires mutual TLS (mTLS), you must manually add a private key to the agent.
The default bootstrapper keystore and truststore are regenerated any time the agent is restarted, so any changes made to the default keystore will be overwritten on restart. The instructions below explain how to point the agent at a custom keystore in a different location on the agent host, and how to add your private key to that custom keystore.
Copy the default bootstrapper keystore and store it in a separate location on the agent host. Run the following commands as the same user that runs the agent on the host. You may name the folder `security` or according to your preferences.
```
$ mkdir /home/<username>/security
$ cp <bootvisor_root>/var/data/processes/<bootstrapper_dir>/var/conf/keyStore.jks /home/<username>/security/
```
Import the keys from the customer-provided keystore into the copied agent keystore using the Java `keytool` command line tool. If this tool is not already installed, find it in the `bin` directory of the JDK that is bundled with the agent.
```
$ keytool -importkeystore -srckeystore <CUSTOM_KEYSTORE.jks> -destkeystore /home/<username>/security/keyStore.jks
Importing keystore CUSTOM_KEYSTORE.jks to keyStore.jks...
Enter destination keystore password: keystore
Enter source keystore password:
```
You can verify that the key or keys were added to the copied keystore using the `keytool -list` command:
```
$ keytool -list -keystore /home/<username>/security/keyStore.jks
Enter keystore password:
Keystore type: jks
Keystore provider: SUN

Your keystore contains 2 entries

<CUSTOM_KEY>, 15-Dec-2022, PrivateKeyEntry,
Certificate fingerprint (SHA-256): A5:B5:2F:1B:39:D3:DA:47:8B:6E:6A:DA:72:4B:0B:43:C7:2C:89:CD:0D:9D:03:B2:3F:35:7A:D4:7C:D3:3D:51
server, 15-Dec-2022, PrivateKeyEntry,
Certificate fingerprint (SHA-256): DB:82:66:E8:09:43:30:9D:EF:0A:41:63:72:0C:2A:8D:F0:8A:C1:25:F7:89:B1:A3:6E:6F:C6:C5:2C:17:CB:B2
```
Use the `keytool -keypasswd` command to update the imported key password. The agent keystore requires that the key and keystore passwords match.
```
$ keytool -keypasswd -alias <CUSTOM_KEY> -new keystore -keystore /home/<username>/security/keyStore.jks
Enter keystore password:
```
In Data Connection, navigate to the agent, then open the Agent settings tab. In the Manage Configuration section, select Advanced, choose the Agent tab, and update `keyStore` to point to the newly copied keystore. Then, add `keyStorePassword` and set it to the appropriate value (`keystore` by default).
```yaml
security:
  keyStore: /home/<username>/security/keyStore.jks
  keyStorePassword: keystore
  trustStore: var/conf/trustStore.jks
  ...
```
Finally, choose the Explorer tab and update both `keyStorePath` and `keyStorePassword`. Save the new configuration.
```yaml
security:
  keyStorePath: /home/<username>/security/keyStore.jks
  keyStorePassword: keystore
  trustStorePath: var/conf/trustStore.jks
  ...
```
Restart the agent.
Note that the field is named `keyStore` when configuring in the Agent tab and `keyStorePath` in the Explorer tab. No changes are required to the Bootstrapper configuration.