Below are some frequently asked questions about Modeling Objective live deployments, which are distinct from direct deployments configured from the model page. Learn more about creating and setting up a live deployment and the differences between live and direct deployments.
Models are packaged with the conda environment configured in the model adapter of the published model. Palantir also adds the lightweight packages necessary to serve your model in production.
By default, a live deployment accepts up to 50 MB in a single request. This upper limit is configurable; contact your Palantir representative for more details.
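Since requests above the limit are rejected, it can help to check the payload size client-side and split large batches before sending them. The following is a minimal sketch, not a definitive client: the endpoint URL, token, request body shape, and feature names are placeholder assumptions, and the real URL is shown on your live deployment's page.

```python
import json
import requests

# Default per-request limit for live deployments (50 MB).
MAX_REQUEST_BYTES = 50 * 1024 * 1024

# Hypothetical values for illustration; substitute the endpoint shown on
# your live deployment's page and a valid Foundry bearer token.
ENDPOINT = "https://<stack>.palantirfoundry.com/foundry-ml-live/api/inference/transform/<deployment-rid>"
TOKEN = "<bearer-token>"

# Example request body; "feature_1" and "feature_2" are placeholder features.
payload = json.dumps(
    {"requestData": [{"feature_1": 1.0, "feature_2": 2.0}], "requestParams": {}}
).encode("utf-8")

# Fail fast locally instead of sending a request the server will reject.
if len(payload) > MAX_REQUEST_BYTES:
    raise ValueError(f"Payload is {len(payload)} bytes; split it into smaller batches.")

response = requests.post(
    ENDPOINT,
    data=payload,
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
)
response.raise_for_status()
print(response.json())
```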
Yes. Modeling Objective live deployment uptime can be monitored through the Data Health application's monitoring view.
We support importing both private and public libraries into a submission environment.
Yes. The ability to create live deployments can be permissioned separately from the ability to create batch deployments. Contact your Palantir representative for guidance.
Yes. Foundry currently provides traffic scaling for live deployments running within Palantir's container infrastructure.
By default, each deployment is configured with two replicas, ensuring there is no downtime during upgrades. The default CPU and memory footprint is also low, resulting in a low default cost profile.
These defaults can be overridden for individual deployments to support larger models or higher expected load. The default profile can also be overridden for all live deployments via Control Panel.
Yes. GPU support for Python models is in beta and may not be available on your enrollment; functionality may change during active development.
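Because GPU availability can vary by enrollment while the feature is in beta, it is safest to write model code that falls back to CPU when no GPU is present. Below is a minimal sketch assuming a PyTorch model; the linear model and input shapes are placeholders for illustration.

```python
import torch

# Use a GPU when the deployment provides one; otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model for illustration; substitute your published model.
model = torch.nn.Linear(4, 1).to(device).eval()

# Inputs must live on the same device as the model.
features = torch.rand(8, 4, device=device)
with torch.no_grad():
    predictions = model(features)
print(predictions.cpu().numpy())
```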
Yes, you can disable your live deployment via the individual deployment page.
When you are ready to start using it again, you can re-enable it via the UI as well. Note that the deployment will upgrade to the latest release once it has been re-enabled.
Alternatively, you can delete a deployment; however, this action cannot be reversed, and you will no longer retain the same Target RID.
Yes. However, you (or an authorized administrator) must configure network egress for live deployments.
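For example, a model that calls an external service at inference time will fail unless egress to that service's domain has been configured for the deployment. The sketch below illustrates the pattern; the external URL, response field, and row schema are hypothetical.

```python
import requests

# Hypothetical external service; egress to this domain must be allowed in
# the live deployment's network configuration before this call can succeed.
EXCHANGE_RATE_URL = "https://api.example.com/rates/latest"

def enrich_with_exchange_rate(rows):
    """Attach a live exchange rate to each input row at inference time."""
    rate = requests.get(EXCHANGE_RATE_URL, timeout=10).json()["usd_to_eur"]
    return [{**row, "price_eur": row["price_usd"] * rate} for row in rows]
```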
A five-minute timeout will occur due to the default dialog read timeout, since inference runs as a synchronous process; requests that take longer than this will fail.
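Given this ceiling, clients should set their own read timeout to match and keep individual requests small enough to finish within it. A minimal sketch, assuming a hypothetical endpoint URL, token, and feature name:

```python
import requests

# Hypothetical endpoint; see your live deployment's page for the real URL.
ENDPOINT = "https://<stack>.palantirfoundry.com/foundry-ml-live/api/inference/transform/<deployment-rid>"

try:
    response = requests.post(
        ENDPOINT,
        json={"requestData": [{"feature_1": 1.0}], "requestParams": {}},
        headers={"Authorization": "Bearer <token>"},
        # Match the server's five-minute ceiling so the client fails at the
        # same time rather than hanging indefinitely: (connect, read) seconds.
        timeout=(10, 300),
    )
    response.raise_for_status()
except requests.exceptions.ReadTimeout:
    # Long-running inference: split the batch or reduce model latency.
    print("Inference exceeded the five-minute limit; try a smaller request.")
```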