public static interface ScalingPolicyMetric.Builder extends SdkPojo, CopyableBuilder<ScalingPolicyMetric.Builder,ScalingPolicyMetric>
| Modifier and Type | Method and Description |
|---|---|
| ScalingPolicyMetric.Builder | invocationsPerInstance(Integer invocationsPerInstance) - The number of invocations sent to a model, normalized by InstanceCount in each ProductionVariant. |
| ScalingPolicyMetric.Builder | modelLatency(Integer modelLatency) - The interval of time taken by a model to respond as viewed from SageMaker. |
Inherited methods: equalsBySdkFields, sdkFields, copy, applyMutation, build

ScalingPolicyMetric.Builder invocationsPerInstance(Integer invocationsPerInstance)
The number of invocations sent to a model, normalized by InstanceCount in each
ProductionVariant. 1/numberOfInstances is sent as the value on each request, where
numberOfInstances is the number of active instances for the ProductionVariant behind the
endpoint at the time of the request.
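
The normalization described above can be illustrated with a small, self-contained sketch. It does not use the SDK: the class and method names (`InvocationsPerInstanceSketch`, `perRequestValue`, `summedMetric`) are hypothetical. It also assumes that the per-request values are summed over a period to obtain invocations per instance, which this page does not state.

```java
// Hypothetical sketch, not part of the AWS SDK: models the per-request value
// 1/numberOfInstances described for invocationsPerInstance.
final class InvocationsPerInstanceSketch {

    /** Value contributed by one request when numberOfInstances are active. */
    static double perRequestValue(int numberOfInstances) {
        if (numberOfInstances <= 0) {
            throw new IllegalArgumentException("numberOfInstances must be positive");
        }
        return 1.0 / numberOfInstances;
    }

    /**
     * Sums the per-request values. Each array element is the number of active
     * instances at the time of one request. Summing over a period is an
     * assumption made for this illustration.
     */
    static double summedMetric(int[] activeInstancesAtEachRequest) {
        double total = 0.0;
        for (int n : activeInstancesAtEachRequest) {
            total += perRequestValue(n);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two requests served while 2 instances were active, then two while 4 were active.
        int[] instanceCounts = {2, 2, 4, 4};
        System.out.println(summedMetric(instanceCounts)); // prints 1.5
    }
}
```

In this example four requests contribute 0.5 + 0.5 + 0.25 + 0.25 = 1.5, so the normalized count reflects that the later requests were spread over more instances.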
Parameters:
invocationsPerInstance - The number of invocations sent to a model, normalized by InstanceCount in each
ProductionVariant. 1/numberOfInstances is sent as the value on each request, where
numberOfInstances is the number of active instances for the ProductionVariant behind the
endpoint at the time of the request.

ScalingPolicyMetric.Builder modelLatency(Integer modelLatency)
The interval of time taken by a model to respond as viewed from SageMaker. This interval includes the local communication times taken to send the request and to fetch the response from the container of a model and the time taken to complete the inference in the container.
Parameters:
modelLatency - The interval of time taken by a model to respond as viewed from SageMaker. This interval includes the
local communication times taken to send the request and to fetch the response from the container of a
model and the time taken to complete the inference in the container.

Copyright © 2023. All rights reserved.