@Generated public interface ServingEndpointsService
You can use a serving endpoint to serve models from the Databricks Model Registry or from Unity Catalog. Endpoints expose the underlying models as scalable REST API endpoints using serverless compute. This means the endpoints and associated compute resources are fully managed by Databricks and will not appear in your cloud account. A serving endpoint can consist of one or more MLflow models from the Databricks Model Registry, called served entities. A serving endpoint can have at most ten served entities. You can configure traffic settings to define how requests should be routed to your served entities behind an endpoint. Additionally, you can configure the scale of resources that should be applied to each served entity.
This is the high-level interface that contains generated methods.
Evolving: this interface is under development. Method signatures may change.
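The traffic settings and the ten-entity limit described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the service's routing logic: the `TrafficSketch` class, its method names, the entity names, and the rule that percentages must total 100 are all assumptions made for this example and are not part of this interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Databricks SDK.
final class TrafficSketch {
    // Limit stated in the interface description: at most ten served entities.
    static final int MAX_SERVED_ENTITIES = 10;

    // Checks the entity count and (as an assumption) that percentages total 100.
    static void validate(Map<String, Integer> routes) {
        if (routes.isEmpty() || routes.size() > MAX_SERVED_ENTITIES) {
            throw new IllegalArgumentException("expected 1 to 10 served entities");
        }
        int total = 0;
        for (int percentage : routes.values()) {
            total += percentage;
        }
        if (total != 100) {
            throw new IllegalArgumentException("traffic percentages must sum to 100, got " + total);
        }
    }

    // Maps a bucket in [0, 100) to a served entity by walking cumulative percentages.
    static String pick(Map<String, Integer> routes, int bucket) {
        if (bucket < 0) {
            throw new IllegalArgumentException("bucket out of range: " + bucket);
        }
        int cumulative = 0;
        for (Map.Entry<String, Integer> entry : routes.entrySet()) {
            cumulative += entry.getValue();
            if (bucket < cumulative) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("bucket out of range: " + bucket);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the cumulative ranges are stable.
        Map<String, Integer> routes = new LinkedHashMap<>();
        routes.put("model-a", 90);
        routes.put("model-b", 10);
        validate(routes);
        System.out.println(pick(routes, 42)); // bucket 42 falls in model-a's range 0..89
        System.out.println(pick(routes, 95)); // bucket 95 falls in model-b's range 90..99
    }
}
```

In practice the bucket would come from a random draw or a hash of the request; the sketch takes it as a parameter so the mapping is deterministic and easy to check.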
| Modifier and Type | Method and Description |
|---|---|
| `BuildLogsResponse` | `buildLogs(BuildLogsRequest buildLogsRequest)` Get build logs for a served model. |
| `ServingEndpointDetailed` | `create(CreateServingEndpoint createServingEndpoint)` Create a new serving endpoint. |
| `void` | `delete(DeleteServingEndpointRequest deleteServingEndpointRequest)` Delete a serving endpoint. |
| `ExportMetricsResponse` | `exportMetrics(ExportMetricsRequest exportMetricsRequest)` Get metrics of a serving endpoint. |
| `ServingEndpointDetailed` | `get(GetServingEndpointRequest getServingEndpointRequest)` Get a single serving endpoint. |
| `void` | `getOpenApi(GetOpenApiRequest getOpenApiRequest)` Get the schema for a serving endpoint. |
| `GetServingEndpointPermissionLevelsResponse` | `getPermissionLevels(GetServingEndpointPermissionLevelsRequest getServingEndpointPermissionLevelsRequest)` Get serving endpoint permission levels. |
| `ServingEndpointPermissions` | `getPermissions(GetServingEndpointPermissionsRequest getServingEndpointPermissionsRequest)` Get serving endpoint permissions. |
| `ListEndpointsResponse` | `list()` Get all serving endpoints. |
| `ServerLogsResponse` | `logs(LogsRequest logsRequest)` Get the latest logs for a served model. |
| `Collection<EndpointTag>` | `patch(PatchServingEndpointTags patchServingEndpointTags)` Update tags of a serving endpoint. |
| `PutResponse` | `put(PutRequest putRequest)` Update rate limits of a serving endpoint. |
| `QueryEndpointResponse` | `query(QueryEndpointInput queryEndpointInput)` Query a serving endpoint. |
| `ServingEndpointPermissions` | `setPermissions(ServingEndpointPermissionsRequest servingEndpointPermissionsRequest)` Set serving endpoint permissions. |
| `ServingEndpointDetailed` | `updateConfig(EndpointCoreConfigInput endpointCoreConfigInput)` Update config of a serving endpoint. |
| `ServingEndpointPermissions` | `updatePermissions(ServingEndpointPermissionsRequest servingEndpointPermissionsRequest)` Update serving endpoint permissions. |
BuildLogsResponse buildLogs(BuildLogsRequest buildLogsRequest)
Retrieves the build logs associated with the provided served model.
ServingEndpointDetailed create(CreateServingEndpoint createServingEndpoint)
void delete(DeleteServingEndpointRequest deleteServingEndpointRequest)
ExportMetricsResponse exportMetrics(ExportMetricsRequest exportMetricsRequest)
Retrieves the metrics associated with the provided serving endpoint in either Prometheus or OpenMetrics exposition format.
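Because the metrics come back as Prometheus or OpenMetrics text, a caller will typically need to parse them. The following is a minimal sketch of reading sample lines from such text; it is not tied to `ExportMetricsResponse`, whose accessors are not shown here. The `MetricsTextSketch` class and the metric names in `main` are hypothetical, and the parser assumes space-separated fields and plain numeric values (it does not handle `+Inf`, exemplars, or escaped braces inside label values).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Databricks SDK.
final class MetricsTextSketch {
    // Parses sample lines into "series -> value", skipping blank lines and '#' comment lines.
    static Map<String, Double> parse(String text) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // HELP, TYPE, and EOF markers all start with '#'
            }
            // The series is everything up to the closing brace of the label set,
            // or up to the first space when the sample has no labels.
            int closeBrace = line.lastIndexOf('}');
            int split = closeBrace >= 0 ? closeBrace + 1 : line.indexOf(' ');
            if (split <= 0) {
                continue; // no value on this line
            }
            String series = line.substring(0, split).trim();
            String[] rest = line.substring(split).trim().split("\\s+");
            if (rest[0].isEmpty()) {
                continue;
            }
            // The first field after the series is the value; a trailing timestamp is ignored.
            samples.put(series, Double.parseDouble(rest[0]));
        }
        return samples;
    }

    public static void main(String[] args) {
        String text = "# TYPE request_count_total counter\n"
                + "request_count_total{endpoint=\"demo\"} 12\n"
                + "cpu_usage_percentage 37.5 1700000000000\n";
        parse(text).forEach((series, value) -> System.out.println(series + " = " + value));
    }
}
```

For production use, a dedicated Prometheus or OpenMetrics parsing library is likely a safer choice than hand-rolled string handling.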
ServingEndpointDetailed get(GetServingEndpointRequest getServingEndpointRequest)
Retrieves the details for a single serving endpoint.
void getOpenApi(GetOpenApiRequest getOpenApiRequest)
Gets the query schema of the serving endpoint in OpenAPI format. The schema describes the supported paths, the input and output formats, and the data types.
GetServingEndpointPermissionLevelsResponse getPermissionLevels(GetServingEndpointPermissionLevelsRequest getServingEndpointPermissionLevelsRequest)
Gets the permission levels that a user can have on an object.
ServingEndpointPermissions getPermissions(GetServingEndpointPermissionsRequest getServingEndpointPermissionsRequest)
Gets the permissions of a serving endpoint. Serving endpoints can inherit permissions from their root object.
ListEndpointsResponse list()
ServerLogsResponse logs(LogsRequest logsRequest)
Retrieves the service logs associated with the provided served model.
Collection<EndpointTag> patch(PatchServingEndpointTags patchServingEndpointTags)
Adds and deletes tags on a serving endpoint in a single API call.
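The effect of a combined add-and-delete request on an endpoint's tags can be pictured with the following sketch. It models tags as a plain key-to-value map; the `TagPatchSketch` class and its parameter names are hypothetical, and the order used here (additions first, then deletions) is an assumption of the example rather than documented behavior of `patch`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Databricks SDK.
final class TagPatchSketch {
    // Returns a new tag map: additions overwrite existing keys, then listed keys are removed.
    static Map<String, String> apply(Map<String, String> current,
                                     Map<String, String> addTags,
                                     List<String> deleteTags) {
        Map<String, String> result = new LinkedHashMap<>(current); // leave the input untouched
        result.putAll(addTags);
        for (String key : deleteTags) {
            result.remove(key); // removing an absent key is a no-op
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> current = new LinkedHashMap<>();
        current.put("team", "ml");
        current.put("env", "dev");

        Map<String, String> addTags = new LinkedHashMap<>();
        addTags.put("env", "prod");
        addTags.put("owner", "alice");

        System.out.println(apply(current, addTags, Arrays.asList("team")));
    }
}
```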
PutResponse put(PutRequest putRequest)
Updates the rate limits of a serving endpoint. NOTE: Only external and foundation model endpoints are currently supported.
QueryEndpointResponse query(QueryEndpointInput queryEndpointInput)
ServingEndpointPermissions setPermissions(ServingEndpointPermissionsRequest servingEndpointPermissionsRequest)
Sets permissions on a serving endpoint. Serving endpoints can inherit permissions from their root object.
ServingEndpointDetailed updateConfig(EndpointCoreConfigInput endpointCoreConfigInput)
Updates any combination of the serving endpoint's served entities, the compute configuration of those served entities, and the endpoint's traffic config. An endpoint that already has an update in progress cannot be updated until the current update completes or fails.
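Since an endpoint with an update in progress rejects further updates, callers may want to wait for it to become idle before submitting a new config. The sketch below shows one way to do that with bounded polling. The `UpdateGateSketch` class is hypothetical, and the `updateInProgress` check is supplied by the caller; how that state is read from an actual endpoint is not shown and is left as an assumption.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of the Databricks SDK.
final class UpdateGateSketch {
    // Polls until no update is in progress or the attempts run out.
    // Returns true when it is safe to submit a new update, false otherwise.
    static boolean awaitIdle(BooleanSupplier updateInProgress, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (!updateInProgress.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated endpoint: reports "in progress" for the first two checks.
        int[] checks = {0};
        boolean ready = awaitIdle(() -> checks[0]++ < 2, 5, 10L);
        System.out.println(ready ? "safe to call updateConfig" : "still updating, giving up");
    }
}
```

Bounding the number of attempts keeps the caller from waiting forever if an update stalls; a real client would likely also distinguish an update that completed from one that failed.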
ServingEndpointPermissions updatePermissions(ServingEndpointPermissionsRequest servingEndpointPermissionsRequest)
Updates the permissions on a serving endpoint. Serving endpoints can inherit permissions from their root object.
Copyright © 2024. All rights reserved.