Version 2.0

Create Live Deployment

POST /api/v2/models/liveDeployments
Warning

This endpoint is in preview and may be modified or removed at any time. To use this endpoint, add preview=true to the request query parameters.

Creates a new live deployment for a model version with the specified runtime configuration. The deployment will begin provisioning compute resources and deploying the target model version.

Third-party applications using this endpoint via OAuth2 must request the following operation scope: api:models-write.

Query parameters

preview
boolean, optional

Enables the use of preview functionality.

Request body

CreateLiveDeploymentRequest
object

deploymentType
union

The target model source for the live deployment. Determines which model and version selection strategy to use when creating the deployment.

runtimeConfiguration
object

The compute resource configuration for the deployment.

Response body

LiveDeployment
object

The created LiveDeployment.

rid
string

The Resource Identifier (RID) of a Live Deployment.

modelVersion
object

The currently deployed model version.

branch
string, optional

The model branch this deployment tracks. Present for direct deployments that follow the latest model version on a branch; absent for deployment types that are not branch-scoped.

runtimeConfiguration
object

The compute resource configuration for the deployment.

status
object

The current operational status of the deployment.

Examples

Request

curl -X POST \
    -H "Content-type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
    "https://$HOSTNAME/api/v2/models/liveDeployments?preview=true" \
    -d '{"runtimeConfiguration":{"minReplicas":1,"maxReplicas":3,"cpu":1.0,"memory":"256MiB","threadCount":32}}'
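The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper name build_create_request and the placeholder hostname and token are assumptions made for the example, and the body mirrors the curl example above (runtimeConfiguration only). The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_create_request(
    hostname: str, token: str, runtime_configuration: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a live deployment.

    Hypothetical helper for illustration; it mirrors the curl example.
    """
    # The endpoint is in preview, so preview=true goes in the query string.
    query = urllib.parse.urlencode({"preview": "true"})
    url = f"https://{hostname}/api/v2/models/liveDeployments?{query}"
    body = json.dumps({"runtimeConfiguration": runtime_configuration}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


request = build_create_request(
    "example.invalid",  # placeholder hostname (assumption)
    "TOKEN",            # placeholder bearer token (assumption)
    {"minReplicas": 1, "maxReplicas": 3, "cpu": 1.0, "memory": "256MiB", "threadCount": 32},
)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.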

Response

{
  "runtimeConfiguration": {
    "minReplicas": 1,
    "maxReplicas": 3,
    "cpu": 1,
    "memory": "256MiB",
    "threadCount": 32
  },
  "modelVersion": {
    "modelRid": "ri.models.main.model.f351c142-0e4c-4b12-adc2-6e1539737ae9",
    "modelVersionRid": "ri.models.main.model-version.adf94926-c3ac-41ea-beb2-4946699d08ee"
  },
  "rid": "ri.foundry-ml-live.main.live-deployment.f351c142-0e4c-4b12-adc2-6e1539737ae9",
  "branch": "master",
  "status": {
    "state": "ACTIVE",
    "isReady": true
  }
}
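A caller will typically want to pull the RID, the optional branch, and the readiness flag out of this response. The sketch below parses the example response into a small dataclass. The names LiveDeploymentSummary and parse_live_deployment are hypothetical, and the status fields (state, isReady) are taken from the example response rather than from a documented schema, so treat them as assumptions.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class LiveDeploymentSummary:
    rid: str
    model_version_rid: str
    branch: Optional[str]  # absent for deployment types that are not branch-scoped
    state: str
    is_ready: bool


def parse_live_deployment(payload: str) -> LiveDeploymentSummary:
    """Extract the fields shown in the example response (hypothetical helper)."""
    data = json.loads(payload)
    status = data["status"]
    return LiveDeploymentSummary(
        rid=data["rid"],
        model_version_rid=data["modelVersion"]["modelVersionRid"],
        branch=data.get("branch"),  # optional field, so no KeyError if missing
        state=status["state"],
        is_ready=status["isReady"],
    )


# The example response from this page, trimmed to the fields used above.
sample = json.dumps({
    "modelVersion": {
        "modelRid": "ri.models.main.model.f351c142-0e4c-4b12-adc2-6e1539737ae9",
        "modelVersionRid": "ri.models.main.model-version.adf94926-c3ac-41ea-beb2-4946699d08ee",
    },
    "rid": "ri.foundry-ml-live.main.live-deployment.f351c142-0e4c-4b12-adc2-6e1539737ae9",
    "branch": "master",
    "status": {"state": "ACTIVE", "isReady": True},
})
summary = parse_live_deployment(sample)
```

Reading branch with data.get reflects that the field is optional in the response body.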

Error responses

Error Name: ThreadCountTooHigh
Error Code: INVALID_ARGUMENT
Status Code: 400
Description: The specified thread count exceeds the maximum allowed value.
Parameters: maxThreadCount, providedThreadCount

Error Name: InvalidGpuCount
Error Code: INVALID_ARGUMENT
Status Code: 400
Description: The GPU count is invalid. The GPU count must be between 1 and the maximum allowed for the requested GPU type.
Parameters: providedGpuCount, maxGpuCount

Error Name: GpuTypeNotAvailable
Error Code: INVALID_ARGUMENT
Status Code: 400
Description: The requested GPU type is not available. Use a GPU type that is available in the deployment's resource queue.
Parameters: requestedGpuType, availableGpuTypes

Error Name: CreateLiveDeploymentPermissionDenied
Error Code: PERMISSION_DENIED
Status Code: 403
Description: Could not create the LiveDeployment.
Parameters: none

Error Name: ModelNotFound
Error Code: NOT_FOUND
Status Code: 404
Description: The given Model could not be found.
Parameters: modelRid
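
The error names and parameters listed above can be turned into actionable messages on the client side. The sketch below is illustrative only. It assumes the error response body is a JSON object with errorName, errorCode, and parameters keys, which this page does not confirm; the function name summarize_error and the sample values (16 and 32) are hypothetical.

```python
import json


def summarize_error(body: str) -> str:
    """Turn an error response body into a short message (hypothetical helper).

    Assumes a JSON body with "errorName" and "parameters" keys.
    """
    error = json.loads(body)
    name = error.get("errorName", "Unknown")
    params = error.get("parameters", {})

    if name == "ThreadCountTooHigh":
        return (
            f"threadCount {params.get('providedThreadCount')} exceeds "
            f"the maximum of {params.get('maxThreadCount')}"
        )
    if name == "InvalidGpuCount":
        return (
            f"GPU count {params.get('providedGpuCount')} must be between 1 "
            f"and {params.get('maxGpuCount')}"
        )
    if name == "GpuTypeNotAvailable":
        return (
            f"GPU type {params.get('requestedGpuType')} is unavailable; "
            f"available types: {params.get('availableGpuTypes')}"
        )
    if name == "CreateLiveDeploymentPermissionDenied":
        return "permission denied: could not create the live deployment"
    if name == "ModelNotFound":
        return f"model {params.get('modelRid')} could not be found"
    return f"unrecognized error: {name}"


# Hypothetical error body, shaped according to the assumption above.
example_body = json.dumps({
    "errorCode": "INVALID_ARGUMENT",
    "errorName": "ThreadCountTooHigh",
    "parameters": {"maxThreadCount": 16, "providedThreadCount": 32},
})
message = summarize_error(example_body)
```

The three INVALID_ARGUMENT errors all point at the runtime configuration, so a caller could correct the offending value and retry, whereas the 403 and 404 cases need a change of permissions or model RID.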