public final class ModelManager
extends java.lang.Object
| Modifier and Type | Method | Description |
|---|---|---|
| boolean | addJob(Job job) | Adds an inference job to the job queue. |
| DescribeModelResponse | describeModel(java.lang.String modelName, java.lang.String version) | Returns a list of worker information for the specified model. |
| java.util.Map<java.lang.String,Endpoint> | getEndpoints() | Returns the registry of all endpoints. |
| static ModelManager | getInstance() | Returns the singleton ModelManager instance. |
| ModelInfo | getModel(java.lang.String modelName, java.lang.String version, boolean predict) | Returns a version of the model. |
| java.util.Set<java.lang.String> | getStartupModels() | Returns the set of models that were loaded at startup. |
| static void | init(ConfigManager configManager) | Initializes the global ModelManager instance. |
| java.util.concurrent.CompletableFuture<ModelInfo> | registerModel(java.lang.String modelName, java.lang.String version, java.lang.String modelUrl, java.lang.String engineName, int gpuId, int batchSize, int maxBatchDelay, int maxIdleTime) | Registers and loads a model. |
| ModelInfo | triggerModelUpdated(ModelInfo modelInfo) | Signals that a ModelInfo has been updated. |
| boolean | unregisterModel(java.lang.String modelName, java.lang.String version) | Unregisters a model by its name and version. |
| java.util.concurrent.CompletableFuture<java.lang.String> | workerStatus() | Sends the model server health status to the client. |
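
The summary lists a static init(ConfigManager) next to getInstance(), which suggests an initialize-once, look-up-later lifecycle. The following is a minimal sketch of that pattern only. SketchManager and SketchConfig are hypothetical stand-ins, not the real ModelManager and ConfigManager, and returning null before init is a choice made for this sketch; this page does not say how the real class behaves in that case.

```java
// Hypothetical stand-in for ModelManager; illustrates the init/getInstance pattern only.
final class SketchManager {
    private static SketchManager instance;
    private final SketchConfig config;

    private SketchManager(SketchConfig config) {
        this.config = config;
    }

    // Creates the global instance from the supplied configuration.
    static void init(SketchConfig config) {
        instance = new SketchManager(config);
    }

    // Returns the instance created by init, or null if init was never called
    // (an assumption of this sketch, not documented behavior).
    static SketchManager getInstance() {
        return instance;
    }

    int defaultBatchSize() {
        return config.defaultBatchSize;
    }

    public static void main(String[] args) {
        SketchManager.init(new SketchConfig(4));
        SketchManager first = SketchManager.getInstance();
        SketchManager second = SketchManager.getInstance();
        System.out.println(first == second);          // same object on every lookup
        System.out.println(first.defaultBatchSize()); // value taken from the config
    }
}

// Hypothetical stand-in for ConfigManager.
final class SketchConfig {
    final int defaultBatchSize;

    SketchConfig(int defaultBatchSize) {
        this.defaultBatchSize = defaultBatchSize;
    }
}
```

In this arrangement, startup code calls init once and every later caller goes through getInstance(), so no caller needs to hold or pass around the configuration.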
public static void init(ConfigManager configManager)
Initializes the global ModelManager instance.
Parameters:
configManager - the configuration

public static ModelManager getInstance()
Returns the singleton ModelManager instance.
Returns:
the singleton ModelManager instance

public java.util.concurrent.CompletableFuture<ModelInfo> registerModel(java.lang.String modelName, java.lang.String version, java.lang.String modelUrl, java.lang.String engineName, int gpuId, int batchSize, int maxBatchDelay, int maxIdleTime)
Registers and loads a model.
Parameters:
modelName - the name of the model for the HTTP endpoint
version - the model version
modelUrl - the model URL
engineName - the engine used to load the model
gpuId - the GPU device id, -1 for auto selection
batchSize - the batch size
maxBatchDelay - the maximum delay for batching
maxIdleTime - the maximum idle time of the worker threads before scaling down
Returns:
a CompletableFuture instance

public boolean unregisterModel(java.lang.String modelName, java.lang.String version)
Unregisters a model by its name and version.
Parameters:
modelName - the name of the model to be unregistered
version - the model version
Returns:
true if the model was unregistered successfully

public ModelInfo triggerModelUpdated(ModelInfo modelInfo)
Signals that a ModelInfo has been updated.
Parameters:
modelInfo - the model that has been updated

public java.util.Map<java.lang.String,Endpoint> getEndpoints()
Returns the registry of all endpoints.
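
Because registerModel above returns a CompletableFuture, the caller receives the result asynchronously. The sketch below shows one way a caller might chain on such a future and turn a failure into a message. fakeRegister is a hypothetical stand-in that yields a String instead of a ModelInfo, and its batchSize check exists only to produce a failure for the demonstration; neither reflects the real implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

final class RegisterSketch {
    // Hypothetical stand-in for registerModel(...): completes asynchronously with
    // a "name:version" string rather than a ModelInfo.
    static CompletableFuture<String> fakeRegister(String modelName, String version, int batchSize) {
        return CompletableFuture.supplyAsync(() -> {
            if (batchSize < 1) {
                // Sketch-only validation, used to trigger the failure path.
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            return modelName + ":" + version;
        });
    }

    // Waits for the future (failing it after five seconds) and maps both outcomes to text.
    static String registerAndWait(String modelName, String version, int batchSize) {
        return fakeRegister(modelName, version, batchSize)
                .orTimeout(5, TimeUnit.SECONDS)
                .thenApply(info -> "loaded " + info)
                .exceptionally(t -> "failed: " + rootMessage(t))
                .join();
    }

    // Unwraps CompletionException layers to reach the original error message.
    static String rootMessage(Throwable t) {
        Throwable cause = t;
        while (cause.getCause() != null) {
            cause = cause.getCause();
        }
        return cause.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(registerAndWait("resnet", "1.0", 1));
        System.out.println(registerAndWait("resnet", "1.0", 0));
    }
}
```

A caller that must not block could drop the join() and attach further stages (for example thenAccept) instead.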

public ModelInfo getModel(java.lang.String modelName, java.lang.String version, boolean predict)
Returns a version of the model.
Parameters:
modelName - the model name
version - the model version
predict - true to select a model in a load-balanced fashion

public java.util.Set<java.lang.String> getStartupModels()
Returns the set of models that were loaded at startup.

public boolean addJob(Job job) throws ai.djl.repository.zoo.ModelNotFoundException
Adds an inference job to the job queue.
Parameters:
job - an inference job to be executed
Returns:
true if the job was submitted successfully
Throws:
ai.djl.repository.zoo.ModelNotFoundException - if the model is not registered

public DescribeModelResponse describeModel(java.lang.String modelName, java.lang.String version) throws ai.djl.repository.zoo.ModelNotFoundException
Returns a list of worker information for the specified model.
Parameters:
modelName - the name of the model to be queried
version - the version of the model to be queried
Throws:
ai.djl.repository.zoo.ModelNotFoundException - if the specified model is not found

public java.util.concurrent.CompletableFuture<java.lang.String> workerStatus()
Sends the model server health status to the client.
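
addJob and describeModel both declare a checked ModelNotFoundException, so callers have to handle the unregistered-model case explicitly and separately from a false return value. The sketch below illustrates that calling pattern with hypothetical stand-ins: JobSketch replaces ModelManager, MissingModelException replaces ai.djl.repository.zoo.ModelNotFoundException, and a job is reduced to a model name. The stand-in addJob always returns true for a registered model; this page does not describe when the real method returns false.

```java
import java.util.HashSet;
import java.util.Set;

final class JobSketch {
    // Hypothetical stand-in for the checked ModelNotFoundException.
    static final class MissingModelException extends Exception {
        MissingModelException(String message) {
            super(message);
        }
    }

    private final Set<String> registered = new HashSet<>();

    void register(String modelName) {
        registered.add(modelName);
    }

    // Follows the documented shape of addJob: a boolean result on success,
    // a checked exception when the model is not registered.
    boolean addJob(String modelName) throws MissingModelException {
        if (!registered.contains(modelName)) {
            throw new MissingModelException("Model not found: " + modelName);
        }
        return true;
    }

    // Maps the three possible outcomes (true, false, exception) to a status string.
    String submit(String modelName) {
        try {
            return addJob(modelName) ? "accepted" : "rejected";
        } catch (MissingModelException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        JobSketch sketch = new JobSketch();
        sketch.register("resnet");
        System.out.println(sketch.submit("resnet"));
        System.out.println(sketch.submit("bert"));
    }
}
```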