class PredictionService[T] extends AnyRef
Thread-safe prediction service for concurrent calls.
In this service, concurrency is capped at numThreads by a BlockingQueue that holds the available model instances. During initialization, numThreads model instances sharing weights/bias are put into the BlockingQueue.
When the predict method is called, the service tries to take an instance from the BlockingQueue; if all instances are busy serving, the prediction request blocks until an instance is released.
If an exception is caught during prediction, a scalar Tensor[String] containing the thrown exception's message is returned.
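The take/release pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not BigDL's actual implementation: the Model trait and PooledService class are hypothetical stand-ins for Module[T] and PredictionService[T].

```scala
import java.util.concurrent.LinkedBlockingQueue

// Hypothetical stand-in for BigDL's Module[T].
trait Model { def forward(x: Double): Double }

// Minimal sketch of the pooling pattern: capacity caps concurrency.
class PooledService(newInstance: () => Model, numThreads: Int) {
  // Holds the currently available model instances.
  protected val instQueue = new LinkedBlockingQueue[Model](numThreads)
  (1 to numThreads).foreach(_ => instQueue.put(newInstance()))

  def predict(x: Double): Double = {
    val inst = instQueue.take() // blocks while all instances are serving
    try inst.forward(x)
    finally instQueue.put(inst) // release the instance back into the pool
  }
}
```

Because take() blocks on an empty queue, at most numThreads predictions run concurrently, and the finally block guarantees an instance is returned to the pool even if forward throws.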
Inheritance: PredictionService → AnyRef → Any
Value Members
- final def !=(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def ##(): Int
  Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  Definition Classes: Any
- def clone(): AnyRef
  Attributes: protected[java.lang]
  Definition Classes: AnyRef
  Annotations: @native() @throws( ... )
- final def eq(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- def finalize(): Unit
  Attributes: protected[java.lang]
  Definition Classes: AnyRef
  Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  Definition Classes: AnyRef → Any
  Annotations: @native()
- def hashCode(): Int
  Definition Classes: AnyRef → Any
  Annotations: @native()
- val instQueue: LinkedBlockingQueue[Module[T]]
  Attributes: protected
- final def isInstanceOf[T0]: Boolean
  Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- final def notify(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- final def notifyAll(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- def predict(request: Array[Byte]): Array[Byte]
  Thread-safe single-sample prediction.
  First, the input Array[Byte] is deserialized according to BigDL.proto. Then model prediction runs with the deserialized inputs as soon as a vacant instance exists (the pool holds numThreads instances in total); otherwise the call blocks until an instance is released. Finally, the prediction results are serialized to Array[Byte] according to BigDL.proto.
  - request: input bytes, which will be deserialized according to BigDL.proto
  - returns: output bytes, serialized according to BigDL.proto
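The deserialize → predict → serialize flow, including the documented fallback of returning the thrown message instead of propagating the exception, can be sketched roughly as below. The deserialize, serialize, serializeError, and runModel functions are hypothetical stand-ins for the BigDL.proto codecs and the pooled model call, not real BigDL APIs.

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical codec stand-ins for the BigDL.proto (de)serialization steps.
def deserialize(bytes: Array[Byte]): Double = new String(bytes, "UTF-8").toDouble
def serialize(value: Double): Array[Byte] = value.toString.getBytes("UTF-8")
def serializeError(message: String): Array[Byte] =
  s"error: $message".getBytes("UTF-8")

// Stand-in for running a pooled model instance's forward pass.
def runModel(input: Double): Double = math.sqrt(input)

// Mirrors the documented behavior: on failure the service returns the thrown
// message (the real service wraps it in a scalar Tensor[String]) rather than
// letting the exception escape to the caller.
def predictBytes(request: Array[Byte]): Array[Byte] =
  Try(serialize(runModel(deserialize(request)))) match {
    case Success(bytes) => bytes
    case Failure(e)     => serializeError(e.getMessage)
  }
```

Keeping the error path inside the byte protocol means remote callers always receive a well-formed payload, whether prediction succeeded or not.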
- def predict(request: Activity): Activity
  Thread-safe single-sample prediction.
  Runs model prediction with the input Activity as soon as a vacant instance exists (the pool size is numThreads); otherwise the call blocks until an instance is released. Outputs are deeply copied after model prediction, so they are not affected by subsequent predictions.
  - request: input Activity, which can be a Tensor or a Table(key, Tensor)
  - returns: output Activity, which can be a Tensor or a Table(key, Tensor)
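The deep-copy guarantee above matters because a model instance typically reuses its output buffer across calls. A rough sketch of why the copy is needed, assuming a hypothetical Tensor class and buffer-reusing model (neither is BigDL's real type):

```scala
// Hypothetical tensor whose storage a model instance reuses between calls.
final class Tensor(val storage: Array[Double]) {
  def deepCopy(): Tensor = new Tensor(storage.clone())
}

// Stand-in model that writes every result into the same output buffer.
class ReusingModel {
  private val out = new Tensor(Array(0.0))
  def forward(x: Double): Tensor = { out.storage(0) = x * 2; out }
}

val model = new ReusingModel
val copied  = model.forward(1.0).deepCopy() // copied result: stays at 2.0
val aliased = model.forward(1.0)            // no copy: aliases the buffer
model.forward(5.0)                          // next request overwrites the buffer
```

Without the deep copy, the aliased result silently changes when the next request arrives, which is exactly the hazard the service's copy-after-predict step avoids.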
- final def synchronized[T0](arg0: ⇒ T0): T0
  Definition Classes: AnyRef
- def toString(): String
  Definition Classes: AnyRef → Any
- final def wait(): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  Definition Classes: AnyRef
  Annotations: @native() @throws( ... )