Version 2.0

Create Live Deployment

POST /api/v2/models/liveDeployments

Warning

This endpoint is in preview and may be modified or removed at any time. To use this endpoint, add preview=true to the request query parameters.

Creates a new live deployment for a model version with the specified runtime configuration. The deployment will begin provisioning compute resources and deploying the target model version.

Third-party applications using this endpoint via OAuth2 must request the following operation scope: api:models-write.

Query parameters

preview

boolean, optional

Enables the use of preview functionality.
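
As a minimal sketch of how the preview flag attaches to the endpoint path, the snippet below builds the request URL with the Python standard library. The function name `live_deployments_url` and the host `example.invalid` are hypothetical placeholders, not part of the API.

```python
from urllib.parse import urlencode, urlunsplit

# Path taken from the endpoint definition on this page.
ENDPOINT_PATH = "/api/v2/models/liveDeployments"


def live_deployments_url(hostname: str, preview: bool = True) -> str:
    """Build the Create Live Deployment URL (hypothetical helper).

    When preview is True the query string is "preview=true", which this
    page says is required while the endpoint is in preview.
    """
    query = urlencode({"preview": "true"}) if preview else ""
    return urlunsplit(("https", hostname, ENDPOINT_PATH, query, ""))


# "example.invalid" is a placeholder host, not a real deployment.
print(live_deployments_url("example.invalid"))
```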

Request body

CreateLiveDeploymentRequest

object

deploymentType

union

The target model source for the live deployment. Determines which model and version selection strategy to use when creating the deployment.

runtimeConfiguration

object

The compute resource configuration for the deployment.
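
The request body can be assembled roughly as follows. This is a sketch under stated assumptions: the `runtimeConfiguration` field names and types are inferred from the example request on this page, the `minReplicas <= maxReplicas` check is an assumed client-side sanity check rather than a documented server rule, and `deploymentType` is passed through as an opaque value because its child attributes are not listed here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class RuntimeConfiguration:
    # Field names mirror the example request; types are inferred from it.
    minReplicas: int
    maxReplicas: int
    cpu: float
    memory: str
    threadCount: int
    environmentVariables: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed sanity check; the server's own validation is not documented here.
        if self.minReplicas > self.maxReplicas:
            raise ValueError("minReplicas must not exceed maxReplicas")


def build_request_body(runtime: RuntimeConfiguration,
                       deployment_type: Optional[dict] = None) -> str:
    """Serialize a CreateLiveDeploymentRequest body to JSON (hypothetical helper)."""
    body = {"runtimeConfiguration": asdict(runtime)}
    if deployment_type is not None:
        # Caller supplies the union value; its shape is not specified on this page.
        body["deploymentType"] = deployment_type
    return json.dumps(body)


config = RuntimeConfiguration(
    minReplicas=1,
    maxReplicas=3,
    cpu=1.0,
    memory="256MiB",
    threadCount=32,
    environmentVariables={"LOG_LEVEL": "INFO"},
)
print(build_request_body(config))
```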

Response body

LiveDeployment

object

The created LiveDeployment.

rid

string

The Resource Identifier (RID) of a Live Deployment.

modelVersion

object

The currently deployed model version.

branch

string, optional

The model branch this deployment tracks. Present for direct deployments that follow the latest model version on a branch; absent for deployment types that are not branch-scoped.

runtimeConfiguration

object

The compute resource configuration for the deployment.

status

object

The current operational status of the deployment.

Examples

Request

curl -X POST \
	-H "Content-Type: application/json" \
	-H "Authorization: Bearer $TOKEN" \
	"https://$HOSTNAME/api/v2/models/liveDeployments?preview=true" \
	-d '{"runtimeConfiguration":{"minReplicas":1,"maxReplicas":3,"cpu":1.0,"memory":"256MiB","threadCount":32,"environmentVariables":{"LOG_LEVEL":"INFO"}}}'
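
For callers working in Python rather than a shell, the same request might be prepared as sketched below. The helper name `build_create_request`, the host, and the token value are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_create_request(hostname: str, token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request from the curl example."""
    return urllib.request.Request(
        url=f"https://{hostname}/api/v2/models/liveDeployments?preview=true",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Same payload as the curl example; host and token are placeholders.
request = build_create_request(
    "example.invalid",
    "PLACEHOLDER_TOKEN",
    {
        "runtimeConfiguration": {
            "minReplicas": 1,
            "maxReplicas": 3,
            "cpu": 1.0,
            "memory": "256MiB",
            "threadCount": 32,
            "environmentVariables": {"LOG_LEVEL": "INFO"},
        }
    },
)
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen(request)` would send it. An HTTP error status then surfaces as `urllib.error.HTTPError`.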

Response

{
  "runtimeConfiguration": {
    "minReplicas": 1,
    "maxReplicas": 3,
    "cpu": 1,
    "memory": "256MiB",
    "threadCount": 32,
    "environmentVariables": {
      "LOG_LEVEL": "INFO"
    }
  },
  "modelVersion": {
    "modelRid": "ri.models.main.model.f351c142-0e4c-4b12-adc2-6e1539737ae9",
    "modelVersionRid": "ri.models.main.model-version.adf94926-c3ac-41ea-beb2-4946699d08ee"
  },
  "rid": "ri.foundry-ml-live.main.live-deployment.f351c142-0e4c-4b12-adc2-6e1539737ae9",
  "branch": "master",
  "status": {
    "state": "ACTIVE",
    "isReady": true
  }
}
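
A client will usually want a few fields out of this response. The sketch below parses the example response above and reads `branch` as optional, as described in the response body section. The helper name `summarize_deployment` is hypothetical, and treating `isReady` as the signal that the deployment is ready for use is an assumption based on the field name.

```python
import json

# The example response from this page, copied verbatim.
SAMPLE_RESPONSE = """
{
  "runtimeConfiguration": {
    "minReplicas": 1,
    "maxReplicas": 3,
    "cpu": 1,
    "memory": "256MiB",
    "threadCount": 32,
    "environmentVariables": {"LOG_LEVEL": "INFO"}
  },
  "modelVersion": {
    "modelRid": "ri.models.main.model.f351c142-0e4c-4b12-adc2-6e1539737ae9",
    "modelVersionRid": "ri.models.main.model-version.adf94926-c3ac-41ea-beb2-4946699d08ee"
  },
  "rid": "ri.foundry-ml-live.main.live-deployment.f351c142-0e4c-4b12-adc2-6e1539737ae9",
  "branch": "master",
  "status": {"state": "ACTIVE", "isReady": true}
}
"""


def summarize_deployment(deployment: dict) -> dict:
    """Pull out the fields a caller typically checks after creation."""
    status = deployment["status"]
    return {
        "rid": deployment["rid"],
        "modelVersionRid": deployment["modelVersion"]["modelVersionRid"],
        # "branch" is optional, so use .get() instead of indexing.
        "branch": deployment.get("branch"),
        "state": status["state"],
        "ready": bool(status["isReady"]),
    }


summary = summarize_deployment(json.loads(SAMPLE_RESPONSE))
print(summary)
```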

Error responses

| Error Name | Error Code | Status Code | Description | Parameters |
| --- | --- | --- | --- | --- |
| `ThreadCountTooHigh` | `INVALID_ARGUMENT` | 400 | The specified thread count exceeds the maximum allowed value. | `maxThreadCount`, `providedThreadCount` |
| `InvalidGpuCount` | `INVALID_ARGUMENT` | 400 | The GPU count is invalid. The GPU count must be between 1 and the maximum allowed for the requested GPU type. | `providedGpuCount`, `maxGpuCount` |
| `GpuTypeNotAvailable` | `INVALID_ARGUMENT` | 400 | The requested GPU type is not available. Use a GPU type that is available in the deployment's resource queue. | `requestedGpuType`, `availableGpuTypes` |
| `CreateLiveDeploymentPermissionDenied` | `PERMISSION_DENIED` | 403 | Could not create the LiveDeployment. | |
| `ModelNotFound` | `NOT_FOUND` | 404 | The given Model could not be found. | `modelRid` |

See Errors for a general overview of errors in the platform.
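
The error names listed above can be turned into structured exceptions on the client side. The sketch below assumes the error body is a JSON object with `errorName`, `errorCode`, and `parameters` keys. That envelope shape is an assumption and is not specified on this page, so check it against the general Errors overview before relying on it. The class and function names are hypothetical.

```python
import json

# Error names and status codes copied from the error table on this page.
DOCUMENTED_ERRORS = {
    "ThreadCountTooHigh": 400,
    "InvalidGpuCount": 400,
    "GpuTypeNotAvailable": 400,
    "CreateLiveDeploymentPermissionDenied": 403,
    "ModelNotFound": 404,
}


class LiveDeploymentError(Exception):
    """Structured view of an error response (hypothetical client-side type)."""

    def __init__(self, name: str, code: str, status: int, parameters: dict):
        super().__init__(f"{name} ({code}, HTTP {status}): {parameters}")
        self.name = name
        self.code = code
        self.status = status
        self.parameters = parameters

    def is_documented(self) -> bool:
        # True only when both the name and the status match the table.
        return DOCUMENTED_ERRORS.get(self.name) == self.status


def parse_error(status: int, raw_body: str) -> LiveDeploymentError:
    # ASSUMPTION: the body is JSON with "errorName", "errorCode", "parameters".
    try:
        payload = json.loads(raw_body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return LiveDeploymentError(
        name=payload.get("errorName", "Unknown"),
        code=payload.get("errorCode", "UNKNOWN"),
        status=status,
        parameters=payload.get("parameters") or {},
    )


# Illustrative body only; the RID value is a placeholder, not real output.
error = parse_error(
    404,
    '{"errorName": "ModelNotFound", "errorCode": "NOT_FOUND",'
    ' "parameters": {"modelRid": "PLACEHOLDER_RID"}}',
)
print(error, error.is_documented())
```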