package nn
Type Members
-
class
Abs[T] extends TensorModule[T]
an element-wise abs operation
- Annotations
- @SerialVersionUID()
-
class
AbsCriterion[T] extends TensorCriterion[T]
measures the mean absolute value of the element-wise difference between input and target
- Annotations
- @SerialVersionUID()
- class ActivityRegularization[T] extends TensorModule[T]
-
class
Add[T] extends TensorModule[T] with Initializable
Adds a bias term to the input data.
- Annotations
- @SerialVersionUID()
-
class
AddConstant[T] extends TensorModule[T]
Adds a constant to the input.
- Annotations
- @SerialVersionUID()
-
class
Anchor extends Serializable
Generates a regular grid of multi-scale, multi-aspect anchor boxes.
-
class
Attention[T] extends AbstractModule[Activity, Activity, T]
Implementation of multiheaded attention and self-attention layers.
-
class
BCECriterion[T] extends TensorCriterion[T]
This loss function measures the Binary Cross Entropy between the target and the output:
loss(o, t) = - 1/n sum_i (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))
or, if the weights argument is specified:
loss(o, t) = - 1/n sum_i weights[i] * (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))
By default, the losses are averaged for each mini-batch over observations as well as over dimensions. However, if the field sizeAverage is set to false, the losses are instead summed.
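As a rough illustration of the formula above, the following plain-Scala sketch computes the loss on arrays. It does not use the BigDL API: the names BceSketch and bce are hypothetical, and the handling of weights and sizeAverage simply follows the description.

```scala
object BceSketch {
  // loss(o, t) = -1/n * sum_i w[i] * (t[i] * log(o[i]) + (1 - t[i]) * log(1 - o[i]))
  // With sizeAverage = false the sum is returned without dividing by n.
  def bce(output: Array[Double], target: Array[Double],
          weights: Option[Array[Double]] = None,
          sizeAverage: Boolean = true): Double = {
    require(output.length == target.length, "output and target must have the same size")
    val n = output.length
    val terms = (0 until n).map { i =>
      val w = weights.map(_(i)).getOrElse(1.0) // unweighted case uses w = 1
      w * (target(i) * math.log(output(i)) + (1 - target(i)) * math.log(1 - output(i)))
    }
    val total = -terms.sum
    if (sizeAverage) total / n else total
  }
}
```

Outputs are assumed to lie strictly between 0 and 1; the sketch does no clamping.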
- T
numeric type
- Annotations
- @SerialVersionUID()
- case class BatchNormParams[T](eps: Double = 1e-5, momentum: Double = 0.1, initWeight: Tensor[T] = null, initBias: Tensor[T] = null, initGradWeight: Tensor[T] = null, initGradBias: Tensor[T] = null, affine: Boolean = true)(implicit evidence$8: ClassTag[T], ev: TensorNumeric[T]) extends Product with Serializable
-
class
BatchNormalization[T] extends TensorModule[T] with Initializable with MklInt8Convertible
This layer implements Batch Normalization as described in the paper: "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" by Sergey Ioffe, Christian Szegedy https://arxiv.org/abs/1502.03167
This implementation is useful for inputs NOT coming from convolution layers. For convolution layers, use nn.SpatialBatchNormalization.
The operation implemented is:
y = ( x - mean(x) ) / standard-deviation(x) * gamma + beta
where gamma and beta are learnable parameters. The learning of gamma and beta is optional.
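The operation can be sketched in plain Scala for a single feature column. This is only an illustration, not the BigDL implementation: the name BatchNormSketch, the use of the biased (population) variance, and adding eps inside the square root are assumptions of this sketch.

```scala
object BatchNormSketch {
  // y = (x - mean(x)) / standard-deviation(x) * gamma + beta
  // eps is added to the variance for numerical stability (assumption).
  def normalize(x: Array[Double], gamma: Double = 1.0, beta: Double = 0.0,
                eps: Double = 1e-5): Array[Double] = {
    require(x.nonEmpty, "input must not be empty")
    val mean = x.sum / x.length
    val variance = x.map(v => (v - mean) * (v - mean)).sum / x.length
    val std = math.sqrt(variance + eps)
    x.map(v => (v - mean) / std * gamma + beta)
  }
}
```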
- T
numeric type
- Annotations
- @SerialVersionUID()
-
class
BiRecurrent[T] extends DynamicContainer[Tensor[T], Tensor[T], T]
This layer implements a bidirectional recurrent neural network.
- T
numeric type
-
class
BifurcateSplitTable[T] extends AbstractModule[Tensor[T], Table, T]
Creates a module that takes a Tensor as input and outputs two tables, splitting the Tensor along the dimension given by dimension.
The input to this layer is expected to be a tensor, or a batch of tensors.
- T
Numeric type. Only float and double are supported for now.
-
class
Bilinear[T] extends AbstractModule[Table, Tensor[T], T] with Initializable
A bilinear transformation with sparse inputs. The input given in forward(input) is a table containing both inputs x_1 and x_2, which are tensors of size N x inputDimension1 and N x inputDimension2, respectively.
- Annotations
- @SerialVersionUID()
-
class
BinaryThreshold[T] extends TensorModule[T]
Thresholds the input Tensor. If a value in the Tensor is smaller than th, it is replaced with v.
- Annotations
- @SerialVersionUID()
-
class
BinaryTreeLSTM[T] extends TreeLSTM[T]
This class is an implementation of Binary TreeLSTM (Constituency Tree LSTM).
-
class
Bottle[T] extends DynamicContainer[Tensor[T], Tensor[T], T]
Bottle allows varying dimensionality input to be forwarded through any module that accepts input of nInputDim dimensions, and generates output of nOutputDim dimensions.
- Annotations
- @SerialVersionUID()
- class BoxHead extends BaseModule[Float]
-
class
CAdd[T] extends TensorModule[T] with Initializable
This layer has a bias tensor with the given size. The bias is added element-wise to the input tensor. If the number of elements in the bias tensor matches the input tensor, a simple element-wise addition is done. Otherwise the bias is expanded to the same size as the input. Expanding means repeating along unmatched singleton dimensions (if an unmatched dimension isn't a singleton dimension, an error is reported). If the input is a batch, a singleton dimension is added as the first dimension before the expansion.
- T
numeric type
- Annotations
- @SerialVersionUID()
-
class
CAddTable[T, D] extends AbstractModule[Table, Tensor[D], T] with MklInt8Convertible
Merges the input tensors in the input table by adding them together element-wise. The input table is actually an array of tensors with the same size.
- T
Numeric type. Only float and double are supported for now.
- Annotations
- @SerialVersionUID()
-
class
CAveTable[T] extends AbstractModule[Table, Tensor[T], T]
Merges the input tensors in the input table by taking their element-wise average. The input table is actually an array of tensors with the same size.
- T
Numeric type. Only float and double are supported for now.
- Annotations
- @SerialVersionUID()
-
class
CDivTable[T] extends AbstractModule[Table, Tensor[_], T]
Takes a table with two Tensors and returns the component-wise division between them.
- Annotations
- @SerialVersionUID()
-
class
CMaxTable[T] extends AbstractModule[Table, Tensor[T], T]
Takes a table of Tensors and outputs the max of all of them.
- Annotations
- @SerialVersionUID()
-
class
CMinTable[T] extends AbstractModule[Table, Tensor[T], T]
Takes a table of Tensors and outputs the min of all of them.
- Annotations
- @SerialVersionUID()
-
class
CMul[T] extends TensorModule[T] with Initializable
This layer has a weight tensor with the given size. The weight is multiplied element-wise with the input tensor. If the number of elements in the weight tensor matches the input tensor, a simple element-wise multiplication is done. Otherwise the weight is expanded to the same size as the input. Expanding means repeating along unmatched singleton dimensions (if an unmatched dimension isn't a singleton dimension, an error is reported). If the input is a batch, a singleton dimension is added as the first dimension before the expansion.
- T
numeric type
- Annotations
- @SerialVersionUID()
-
class
CMulTable[T] extends AbstractModule[Table, Tensor[T], T]
Takes a table of Tensors and outputs the multiplication of all of them.
- Annotations
- @SerialVersionUID()
-
class
CSubTable[T] extends AbstractModule[Table, Tensor[_], T]
Takes a table with two Tensors and returns the component-wise subtraction between them.
- Annotations
- @SerialVersionUID()
-
class
CategoricalCrossEntropy[T] extends AbstractCriterion[Tensor[T], Tensor[T], T]
This is the same as the cross entropy criterion, except that the target tensor is a one-hot tensor.
- T
The numeric type in the criterion, usually Float or Double
-
abstract
class
Cell[T] extends AbstractModule[Table, Table, T]
The Cell class is a super class of any recurrent kernels, such as RnnCell, LSTM and GRU.
-
class
Clamp[T] extends HardTanh[T]
A kind of hard tanh activation function with integer min and max.
- T
numeric type
- Annotations
- @SerialVersionUID()
-
class
ClassNLLCriterion[T] extends TensorCriterion[T]
The negative log likelihood criterion. It is useful for training a classification problem with n classes. If provided, the optional argument weights should be a 1D Tensor assigning a weight to each of the classes. This is particularly useful when you have an unbalanced training set.
The input given through a forward() is expected to contain log-probabilities/probabilities of each class: input has to be a 1D Tensor of size n. Obtaining log-probabilities/probabilities in a neural network is easily achieved by adding a LogSoftMax/SoftMax layer as the last layer of your neural network. You may use CrossEntropyCriterion instead if you prefer not to add an extra layer to your network. This criterion expects a class index (1 to the number of classes) as target when calling forward(input, target) and backward(input, target).
In the log-probabilities case, the loss can be described as:
loss(x, class) = -x[class]
or, if the weights argument is specified:
loss(x, class) = -weights[class] * x[class]
Due to the behaviour of the backend code, it is necessary to set sizeAverage to false when calculating losses in non-batch mode.
Note that if the target is paddingValue, the training process will skip this sample. In other words, the forward process will return zero output and the backward process will also return zero gradInput.
By default, the losses are averaged over observations for each minibatch. However, if the field sizeAverage is set to false, the losses are instead summed for each minibatch.
In particular, when weights=None, size_average=True and logProbAsInput=False, this is the same as the sparse_categorical_crossentropy loss in keras.
- T
numeric type
- Annotations
- @SerialVersionUID()
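For the ClassNLLCriterion entry above, the log-probability case can be sketched in plain Scala as follows. This is not the BigDL implementation: the names are hypothetical, padding values are not handled, and averaging by the number of samples (rather than, say, by the summed weights) is an assumption taken from the wording "averaged over observations".

```scala
object ClassNllSketch {
  // logProbs: one row of log-probabilities per sample.
  // targets: 1-based class indices, as described in the entry.
  // loss(x, class) = -weights[class] * x[class]
  def nll(logProbs: Array[Array[Double]], targets: Array[Int],
          weights: Option[Array[Double]] = None,
          sizeAverage: Boolean = true): Double = {
    require(logProbs.length == targets.length, "one target per sample is required")
    val losses = logProbs.zip(targets).map { case (row, cls) =>
      val w = weights.map(_(cls - 1)).getOrElse(1.0)
      -w * row(cls - 1)
    }
    // Assumption: average over the number of samples in the mini-batch.
    if (sizeAverage) losses.sum / losses.length else losses.sum
  }
}
```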
-
class
ClassSimplexCriterion[T] extends MSECriterion[T]
ClassSimplexCriterion implements a criterion for classification. It learns an embedding per class, where each class' embedding is a point on an (N-1)-dimensional simplex, where N is the number of classes.
- Annotations
- @SerialVersionUID()
-
class
Concat[T] extends DynamicContainer[Tensor[T], Tensor[T], T]
Concat concatenates the output of one layer of "parallel" modules along the provided dimension: they take the same inputs, and their output is concatenated.
                 +-----------+
            +---->  module1  -----+
            |    |           |    |
 input -----+---->  module2  -----+----> output
            |    |           |    |
            +---->  module3  -----+
                 +-----------+
- Annotations
- @SerialVersionUID()
-
class
ConcatTable[T] extends DynamicContainer[Activity, Table, T] with MklInt8Convertible
ConcatTable is a container module like Concat. It applies an input to each member module; the input can be a tensor or a table.
ConcatTable usually works with CAddTable and CMulTable to implement element-wise add/multiply on the outputs of two modules.
- Annotations
- @SerialVersionUID()
-
case class
ConstInitMethod(value: Double) extends InitializationMethod with Product with Serializable
Initializer that generates tensors with certain constant double.
-
abstract
class
Container[A <: Activity, B <: Activity, T] extends AbstractModule[A, B, T]
Container is an abstract AbstractModule class which declares methods defined in all containers. A container usually contains some other modules in the modules variable. It overrides many module methods such that calls are propagated to the contained modules.
- A
Input data type
- B
Output data type
- T
Numeric type. Only float and double are supported for now.
- Annotations
- @SerialVersionUID()
-
class
Contiguous[T] extends TensorModule[T]
Makes both input and gradOutput contiguous.
- Annotations
- @SerialVersionUID()
-
class
ConvLSTMPeephole[T] extends Cell[T]
Convolution Long Short Term Memory architecture with peephole.
Ref.
A. https://arxiv.org/abs/1506.04214 (blueprint for this module)
B. https://github.com/viorik/ConvLSTM
-
class
ConvLSTMPeephole3D[T] extends Cell[T]
Convolution Long Short Term Memory architecture with peephole.
Ref.
A. https://arxiv.org/abs/1506.04214 (blueprint for this module)
B. https://github.com/viorik/ConvLSTM
-
class
Cosine[T] extends TensorModule[T] with Initializable
Cosine calculates the cosine similarity of the input to k mean centers. The input given in forward(input) must be either a vector (1D tensor) or a matrix (2D tensor). If the input is a vector, it must have the size of inputSize. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows is the batch size and the number of columns should be equal to inputSize).
- Annotations
- @SerialVersionUID()
-
class
CosineDistance[T] extends AbstractModule[Table, Tensor[T], T]
Outputs the cosine distance between inputs.
- Annotations
- @SerialVersionUID()
-
class
CosineDistanceCriterion[T] extends TensorCriterion[T]
Creates a criterion that measures the loss given an input tensor and target tensor.
The input and target are two tensors with same size. For instance:
x = Tensor[Double](Storage(Array(0.1, 0.2, 0.3)))
y = Tensor[Double](Storage(Array(0.15, 0.25, 0.35)))
loss(x, y) = 1 - cos(x, y)
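The formula loss(x, y) = 1 - cos(x, y) can be worked through on plain arrays. The sketch below is independent of BigDL's Tensor type; the object and method names are hypothetical.

```scala
object CosineDistanceSketch {
  // cos(x, y) = <x, y> / (||x|| * ||y||)
  def cosine(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "x and y must have the same size")
    val dot = x.zip(y).map { case (a, b) => a * b }.sum
    val normX = math.sqrt(x.map(a => a * a).sum)
    val normY = math.sqrt(y.map(b => b * b).sum)
    dot / (normX * normY)
  }

  // loss(x, y) = 1 - cos(x, y)
  def loss(x: Array[Double], y: Array[Double]): Double = 1.0 - cosine(x, y)
}
```

Calling CosineDistanceSketch.loss(Array(0.1, 0.2, 0.3), Array(0.15, 0.25, 0.35)) evaluates the example vectors from the entry; since they point in nearly the same direction, the loss is close to zero.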
- Annotations
- @SerialVersionUID()
-
class
CosineEmbeddingCriterion[T] extends AbstractCriterion[Table, Table, T]
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a Tensor label y with values 1 or -1.
- Annotations
- @SerialVersionUID()
-
class
CosineProximityCriterion[T] extends TensorCriterion[T]
The negative of the mean cosine proximity between predictions and targets. The cosine proximity is defined as below:
x'(i) = x(i) / sqrt(max(sum(x(i)^2), 1e-12))
y'(i) = y(i) / sqrt(max(sum(y(i)^2), 1e-12))
cosine_proximity(x, y) = mean(-1 * x'(i) * y'(i))
Both batch and un-batched inputs are supported
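A minimal plain-Scala sketch of the un-batched case follows. It assumes that each vector is normalized by its own L2 norm (floored at 1e-12) and that the mean is taken over the elements; the names are hypothetical and this is not the BigDL code.

```scala
object CosineProximitySketch {
  // v'(i) = v(i) / sqrt(max(sum(v(i)^2), 1e-12))
  private def l2Normalize(v: Array[Double]): Array[Double] = {
    val norm = math.sqrt(math.max(v.map(a => a * a).sum, 1e-12))
    v.map(_ / norm)
  }

  // cosine_proximity(x, y) = mean(-1 * x'(i) * y'(i))
  def loss(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length && x.nonEmpty, "x and y must be non-empty and equally sized")
    val xn = l2Normalize(x)
    val yn = l2Normalize(y)
    xn.zip(yn).map { case (a, b) => -1.0 * a * b }.sum / x.length
  }
}
```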
-
class
Cropping2D[T] extends TensorModule[T]
Cropping layer for 2D input (e.g. picture). It crops along spatial dimensions, i.e. width and height.
# Input shape
4D tensor with shape: (batchSize, channels, first_axis_to_crop, second_axis_to_crop)
# Output shape
4D tensor with shape: (batchSize, channels, first_cropped_axis, second_cropped_axis)
- Annotations
- @SerialVersionUID()
-
class
Cropping3D[T] extends TensorModule[T]
Cropping layer for 3D data (e.g. spatial or spatio-temporal).
# Input shape
5D tensor with shape: (batchSize, channels, first_axis_to_crop, second_axis_to_crop, third_axis_to_crop)
# Output shape
5D tensor with shape: (batchSize, channels, first_cropped_axis, second_cropped_axis, third_cropped_axis)
-
class
CrossEntropyCriterion[T] extends TensorCriterion[T]
This criterion combines LogSoftMax and ClassNLLCriterion in one single class.
- Annotations
- @SerialVersionUID()
-
class
CrossProduct[T] extends AbstractModule[Table, Tensor[T], T]
A layer which takes a table of multiple tensors (n >= 2) as input and calculates the dot product for all combinations of pairs among the input tensors.
Dot-product outputs are ordered according to the order of pairs in the input Table. For instance, if the input (Table) is T(A, B, C), the output (Tensor) will be [A.*B, A.*C, B.*C].
Input Tensors can have one or two dimensions; if two, the first dimension is batchSize. For convenience, the output is a 2-dim Tensor regardless of the input dims.
Table size checking and Tensor size checking will be executed before each forward when numTensor and embeddingSize are set to values greater than zero.
-
class
DenseToSparse[T] extends TensorModule[T]
Convert DenseTensor to SparseTensor.
- T
The numeric type in the criterion, usually Float or Double
-
class
DetectionOutputFrcnn extends AbstractModule[Table, Activity, Float]
Post process Faster-RCNN models
- Annotations
- @SerialVersionUID()
- case class DetectionOutputParam(nClasses: Int = 21, shareLocation: Boolean = true, bgLabel: Int = 0, nmsThresh: Float = 0.45f, nmsTopk: Int = 400, keepTopK: Int = 200, confThresh: Float = 0.01f, varianceEncodedInTarget: Boolean = false) extends Product with Serializable
-
class
DetectionOutputSSD[T] extends AbstractModule[Table, Activity, T]
Layer to post-process SSD output.
- T
Numeric type of the parameters (e.g. weight, bias). Only float and double are supported for now.
- Annotations
- @SerialVersionUID()
-
class
DiceCoefficientCriterion[T] extends TensorCriterion[T]
The Dice-Coefficient criterion.
input: Tensor, target: Tensor
return: 1 - 2 * (input intersection target) / (input union target)
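One common way to evaluate this expression on real-valued tensors is sketched below in plain Scala. It rests on assumptions that the entry does not state: the intersection is taken as the sum of element-wise products, the union as the sum of both tensors, and epsilon is a hypothetical smoothing term that avoids division by zero.

```scala
object DiceSketch {
  // loss = 1 - (2 * intersection + epsilon) / (union + epsilon)
  // intersection = sum(input * target), union = sum(input) + sum(target)  (assumptions)
  def loss(input: Array[Double], target: Array[Double], epsilon: Double = 1.0): Double = {
    require(input.length == target.length, "input and target must have the same size")
    val intersection = input.zip(target).map { case (a, b) => a * b }.sum
    val union = input.sum + target.sum
    1.0 - (2.0 * intersection + epsilon) / (union + epsilon)
  }
}
```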
- T
The numeric type in the criterion, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
DistKLDivCriterion[T] extends TensorCriterion[T]
The Kullback–Leibler divergence criterion
- Annotations
- @SerialVersionUID()
-
class
DotProduct[T] extends AbstractModule[Table, Tensor[T], T]
This is a simple table layer which takes a table of two tensors as input and calculates the dot product between them as output.
- Annotations
- @SerialVersionUID()
-
class
DotProductCriterion[T] extends TensorCriterion[T]
Compute the dot product of input and target tensor. Input and target are required to have the same size.
- Annotations
- @SerialVersionUID()
-
class
Dropout[T] extends TensorModule[T]
Dropout masks (sets to zero) parts of the input using a Bernoulli distribution. Each input element has a probability initP of being dropped. If scale is true (true by default), the outputs are scaled by a factor of 1/(1-initP) during training. During evaluation, the output is the same as the input.
It has been proven an effective approach for regularization and preventing co-adaptation of feature detectors. For more details, please see [Improving neural networks by preventing co-adaptation of feature detectors] (https://arxiv.org/abs/1207.0580)
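The training and evaluation behaviour described above can be sketched on a plain array. The object name, the explicit training flag, and the seeded random generator are assumptions of this sketch, not part of the documented API.

```scala
import scala.util.Random

object DropoutSketch {
  // Each element is dropped (set to zero) with probability initP.
  // When scale is true, surviving elements are multiplied by 1 / (1 - initP).
  def forward(input: Array[Double], initP: Double, scale: Boolean = true,
              training: Boolean = true, rng: Random = new Random(42)): Array[Double] = {
    require(initP >= 0.0 && initP < 1.0, "initP must be in [0, 1)")
    if (!training) {
      input.clone() // evaluation: output equals input
    } else {
      val factor = if (scale) 1.0 / (1.0 - initP) else 1.0
      input.map(v => if (rng.nextDouble() < initP) 0.0 else v * factor)
    }
  }
}
```

The scaling keeps the expected value of each output element equal to the input element, which is why no rescaling is needed at evaluation time.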
- Annotations
- @SerialVersionUID()
-
abstract
class
DynamicContainer[A <: Activity, B <: Activity, T] extends Container[A, B, T]
DynamicContainer allows the user to change its submodules after creating it.
- A
Input data type
- B
Output data type
- T
Numeric type. Only float and double are supported for now.
-
class
ELU[T] extends TensorModule[T]
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) [http://arxiv.org/pdf/1511.07289.pdf]
- Annotations
- @SerialVersionUID()
-
class
Echo[T] extends TensorModule[T]
This module is for debugging purposes; it can print the activation and gradient in your model topology.
Users can pass in a customized function to inspect more information from the activation. This is very useful when debugging.
Please note that the passed-in customized function will not be persisted in serialization.
- Annotations
- @SerialVersionUID()
-
class
Euclidean[T] extends TensorModule[T] with Initializable
Outputs the Euclidean distance of the input to outputSize centers.
- T
Numeric type. Only float and double are supported for now.
- Annotations
- @SerialVersionUID()
-
class
Exp[T] extends TensorModule[T]
Applies element-wise exp to input tensor.
- Annotations
- @SerialVersionUID()
-
class
ExpandSize[T] extends AbstractModule[Tensor[T], Tensor[T], T]
Expand tensor to configured size
- T
Numeric type of the parameters (e.g. weight, bias). Only float and double are supported for now.
-
class
FPN[T] extends BaseModule[T]
Feature Pyramid Network.
-
class
FeedForwardNetwork[T] extends BaseModule[T]
Implementation of FeedForwardNetwork, constructed with a fully connected network.
Input with shape (batch_size, length, hidden_size)
Output with shape (batch_size, length, hidden_size)
-
class
FlattenTable[T] extends AbstractModule[Table, Table, T]
This is a table layer which takes an arbitrarily deep table of Tensors (potentially nested) as input and produces a table of Tensors without any nested table.
- Annotations
- @SerialVersionUID()
-
class
FrameManager[T] extends Serializable
Manages frames in the scheduler. When the scheduler executes nodes, it may enter a frame. Before the scheduler leaves a frame, it must make sure all nodes in that frame have been run.
-
class
GRU[T] extends Cell[T]
Gated Recurrent Units architecture. The first input in the sequence uses zero value for the cell and hidden state.
Ref.
1. http://www.wildml.com/2015/10/recurrent-neural-network-tutorial-part-4-implementing-a-grulstm-rnn-with-python-and-theano/
2. https://github.com/Element-Research/rnn/blob/master/GRU.lua
- Annotations
- @SerialVersionUID()
-
class
GaussianCriterion[T] extends AbstractCriterion[Table, Tensor[T], T]
Computes the log-likelihood of a sample x given a Gaussian distribution p.
-
class
GaussianDropout[T] extends TensorModule[T]
Apply multiplicative 1-centered Gaussian noise. The multiplicative noise will have standard deviation sqrt(rate / (1 - rate)).
As it is a regularization layer, it is only active at training time.
Output shape is the same as input.
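A small plain-Scala sketch of this behaviour is given below. The names, the explicit training flag, and the seeded generator are assumptions made for illustration; the noise is drawn as 1 + stddev * N(0, 1).

```scala
import scala.util.Random

object GaussianDropoutSketch {
  // Standard deviation of the multiplicative noise: sqrt(rate / (1 - rate)).
  def stddev(rate: Double): Double = {
    require(rate >= 0.0 && rate < 1.0, "rate must be in [0, 1)")
    math.sqrt(rate / (1.0 - rate))
  }

  // Training: multiply each element by noise centered at 1.
  // Evaluation: return the input unchanged.
  def forward(input: Array[Double], rate: Double, training: Boolean = true,
              rng: Random = new Random(0)): Array[Double] =
    if (!training) input.clone()
    else {
      val sd = stddev(rate)
      input.map(v => v * (1.0 + sd * rng.nextGaussian()))
    }
}
```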
- Annotations
- @SerialVersionUID()
-
class
GaussianNoise[T] extends TensorModule[T]
Apply additive zero-centered Gaussian noise. This is useful to mitigate overfitting (you could see it as a form of random data augmentation). Gaussian Noise (GS) is a natural choice as corruption process for real valued inputs. As it is a regularization layer, it is only active at training time.
Output shape is the same as input.
- Annotations
- @SerialVersionUID()
-
class
GaussianSampler[T] extends AbstractModule[Table, Tensor[T], T]
Takes {mean, log_variance} as input and samples from the Gaussian distribution
-
class
GradientReversal[T] extends TensorModule[T]
It is a simple module that preserves the input, but takes the gradient from the subsequent layer, multiplies it by -lambda and passes it to the preceding layer. This can be used to maximise an objective function whilst using gradient descent, as described in ["Domain-Adversarial Training of Neural Networks" (http://arxiv.org/abs/1505.07818)]
- Annotations
- @SerialVersionUID()
-
abstract
class
Graph[T] extends Container[Activity, Activity, T] with MklInt8Convertible
A graph container. The modules in the container are connected as a directed graph. Each module can output one tensor or multiple tensors (as a table). The edges between modules in the graph define how these tensors are passed. For example, if a module outputs two tensors, you can pass these two tensors together to its following module, or pass only one of them to its following module. If a tensor in the module output is connected to multiple modules, the gradients from the multiple connections will be accumulated in the back propagation. If multiple edges point to one module, the tensors from these edges will be stacked as a table and then passed to that module. In the back propagation, the gradients will be split based on how the input tensors were stacked.
The graph container has multiple inputs and multiple outputs. The order of the input tensors should be the same as the order of the input nodes when you construct the graph container. In the back propagation, the order of the gradient tensors should be the same as the order of the output nodes.
If there's one output, the module output is a tensor. If there are multiple outputs, the module output is a table, which is actually a sequence of tensors. The order of the output tensors is the same as the order of the output modules.
All inputs should be able to connect to outputs through some paths in the graph. It is allowed that some successors of the input nodes are not connected to outputs. If so, these nodes will be excluded from the computation.
- T
Numeric type. Only float and double are supported for now.
- Annotations
- @SerialVersionUID()
- trait GraphSerializable extends ContainerSerializable
-
class
HardShrink[T] extends TensorModule[T]
This is a transfer layer which applies the hard shrinkage function element-wise to the input Tensor. The parameter lambda is set to 0.5 by default.
       ⎧ x, if x > lambda
f(x) = ⎨ x, if x < -lambda
       ⎩ 0, otherwise
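The piecewise definition of hard shrinkage translates directly into a scalar function. The sketch below is plain Scala with a hypothetical name, applied per element.

```scala
object HardShrinkSketch {
  // f(x) = x if x > lambda or x < -lambda, otherwise 0
  def hardShrink(x: Double, lambda: Double = 0.5): Double =
    if (x > lambda || x < -lambda) x else 0.0
}
```

Applying it to a whole array is then a matter of input.map(v => HardShrinkSketch.hardShrink(v)).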
- Annotations
- @SerialVersionUID()
-
class
HardSigmoid[T] extends TensorModule[T]
Apply segment-wise linear approximation of sigmoid. Faster than sigmoid.
       ⎧ 0,             if x < -2.5
f(x) = ⎨ 1,             if x > 2.5
       ⎩ 0.2 * x + 0.5, otherwise
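The three segments of the hard sigmoid can be written as a scalar function, sketched here in plain Scala under a hypothetical name.

```scala
object HardSigmoidSketch {
  // f(x) = 0 if x < -2.5, 1 if x > 2.5, otherwise 0.2 * x + 0.5
  def hardSigmoid(x: Double): Double =
    if (x < -2.5) 0.0
    else if (x > 2.5) 1.0
    else 0.2 * x + 0.5
}
```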
-
class
HardTanh[T] extends TensorModule[T]
Applies HardTanh to each element of input. HardTanh is defined as:
       ⎧ maxValue, if x > maxValue
f(x) = ⎨ minValue, if x < minValue
       ⎩ x,        otherwise
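As a scalar function, HardTanh is a clamp between minValue and maxValue. In the plain-Scala sketch below the default bounds of -1 and 1 are an assumption of the sketch, and the object name is hypothetical.

```scala
object HardTanhSketch {
  // f(x) = maxValue if x > maxValue, minValue if x < minValue, otherwise x
  def hardTanh(x: Double, minValue: Double = -1.0, maxValue: Double = 1.0): Double = {
    require(minValue <= maxValue, "minValue must not exceed maxValue")
    if (x > maxValue) maxValue
    else if (x < minValue) minValue
    else x
  }
}
```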
- Annotations
- @SerialVersionUID()
-
class
HingeEmbeddingCriterion[T] extends TensorCriterion[T]
Creates a criterion that measures the loss given an input x which is a 1-dimensional vector and a label y (1 or -1). This is usually used for measuring whether two inputs are similar or dissimilar, e.g. using the L1 pairwise distance, and is typically used for learning nonlinear embeddings or semi-supervised learning.
                 ⎧ x_i,                  if y_i ==  1
loss(x, y) = 1/n ⎨
                 ⎩ max(0, margin - x_i), if y_i == -1
If x and y are n-dimensional Tensors, the sum operation still operates over all the elements, and divides by n (this can be avoided if one sets the internal variable sizeAverage to false). The margin has a default value of 1, or can be set in the constructor.
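The loss can be sketched on plain arrays as follows. The object name is hypothetical, and labels are represented as Int for simplicity, which is an assumption of this sketch rather than the criterion's actual input type.

```scala
object HingeEmbeddingSketch {
  // Per element: x_i if y_i == 1, max(0, margin - x_i) if y_i == -1.
  // The sum is divided by n unless sizeAverage is false.
  def loss(x: Array[Double], y: Array[Int], margin: Double = 1.0,
           sizeAverage: Boolean = true): Double = {
    require(x.length == y.length && x.nonEmpty, "x and y must be non-empty and equally sized")
    val terms = x.zip(y).map {
      case (xi, 1)  => xi
      case (xi, -1) => math.max(0.0, margin - xi)
      case (_, other) =>
        throw new IllegalArgumentException(s"label must be 1 or -1, got $other")
    }
    if (sizeAverage) terms.sum / x.length else terms.sum
  }
}
```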
- Annotations
- @SerialVersionUID()
-
class
Identity[T] extends AbstractModule[Activity, Activity, T]
Identity just returns the input as output. It's useful in some parallel containers to get the original input.
- Annotations
- @SerialVersionUID()
-
class
Index[T] extends AbstractModule[Table, Tensor[T], T]
Applies the Tensor index operation along the given dimension.
- Annotations
- @SerialVersionUID()
-
class
InferReshape[T] extends TensorModule[T]
Reshape the input tensor with automatic size inference support. Positive numbers in the size argument are used to reshape the input to the corresponding dimension size. There are also two special values allowed in size:
a. 0 means keep the corresponding dimension size of the input unchanged, i.e., if the 1st dimension size of the input is 2, the 1st dimension size of the output will be set to 2 as well.
b. -1 means infer this dimension size from the other dimensions. This dimension size is calculated by keeping the number of output elements consistent with the input. Only one -1 is allowed in size.
For example,
Input tensor with size: (4, 5, 6, 7)
-> InferReshape(Array(4, 0, 3, -1))
Output tensor with size: (4, 5, 3, 14)
The 1st and 3rd dims are set to the given sizes, the 2nd dim is kept unchanged, and the last dim is inferred as 14.
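The size inference rules can be sketched as a small function on dimension arrays. This is an illustration in plain Scala with hypothetical names; it only computes the output size and does not touch any tensor data.

```scala
object InferReshapeSketch {
  // inputSize: dimensions of the input tensor; size: requested sizes,
  // where 0 keeps the input dimension and -1 is inferred (at most once).
  def inferSize(inputSize: Array[Int], size: Array[Int]): Array[Int] = {
    require(size.count(_ == -1) <= 1, "only one -1 is allowed in size")
    val kept = size.zipWithIndex.map { case (s, i) =>
      if (s == 0) {
        require(i < inputSize.length, s"no input dimension at position $i to keep")
        inputSize(i)
      } else s
    }
    val total = inputSize.product
    val known = kept.filter(_ != -1).product
    require(known > 0 && total % known == 0, "requested size is incompatible with the input")
    val result = kept.map(s => if (s == -1) total / known else s)
    require(result.product == total, "element count must stay the same")
    result
  }
}
```

For the example in the entry, InferReshapeSketch.inferSize(Array(4, 5, 6, 7), Array(4, 0, 3, -1)) yields the size (4, 5, 3, 14), since 4 * 5 * 6 * 7 = 840 and 840 / (4 * 5 * 3) = 14.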
- T
Numeric type (Float and Double are allowed)
-
trait
InitializationMethod extends AnyRef
Initialization method to initialize bias and weight. The init method will be called in Module.reset()
-
class
Input[T] extends AbstractModule[Activity, Activity, T]
Input layer does nothing to the input tensors; it just passes them through. It should be used as an input node when the first layer of your module accepts multiple tensors as inputs.
Each input node of the graph container should accept one tensor as input. If you want a module accepting multiple tensors as input, you should add some Input modules before it and connect the outputs of the Input nodes to it.
Please note that the return value is not a layer but a Node containing the input layer.
- T
The numeric type in the criterion, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
JoinTable[T] extends AbstractModule[Table, Tensor[_], T]
It is a table module which takes a table of Tensors as input and outputs a Tensor by joining them together along the dimension given by dimension.
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.
- Annotations
- @SerialVersionUID()
-
class
KLDCriterion[T] extends AbstractCriterion[Table, Tensor[T], T]
Computes the KL-divergence of the input normal distribution to a standard normal distribution. The input has to be a table. The first element of the input is the mean of the distribution, and the second element is the log_variance of the distribution. The input distribution is assumed to be diagonal.
The mean and log_variance are both assumed to be two-dimensional tensors. The first dimension is interpreted as the batch. The output is the average/sum over the observations.
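For a diagonal Gaussian, the divergence to a standard normal has the closed form 0.5 * sum(mean^2 + exp(log_variance) - 1 - log_variance). The plain-Scala sketch below evaluates that formula per observation; the names `KldSketch`, `klToStandardNormal` and `batchKl` are hypothetical, and whether the real criterion averages or sums over the batch is not modelled here.

```scala
object KldSketch {
  /** KL(N(mean, diag(exp(logVar))) || N(0, I)) for a single observation. */
  def klToStandardNormal(mean: Array[Double], logVar: Array[Double]): Double = {
    require(mean.length == logVar.length, "mean and logVar must have the same length")
    mean.zip(logVar).map { case (m, lv) =>
      0.5 * (m * m + math.exp(lv) - 1.0 - lv)
    }.sum
  }

  /** Per-observation divergences for a batch (the first dimension is the batch). */
  def batchKl(means: Array[Array[Double]], logVars: Array[Array[Double]]): Array[Double] = {
    require(means.length == logVars.length, "batch sizes must match")
    means.zip(logVars).map { case (m, lv) => klToStandardNormal(m, lv) }
  }
}
```

The caller can then take the mean or the sum of the `batchKl` result, depending on the reduction wanted.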
-
class
KullbackLeiblerDivergenceCriterion[T] extends TensorCriterion[T]
This criterion is the same as the `kullback_leibler_divergence` loss in Keras. The loss is calculated as: y_true = K.clip(y_true, K.epsilon(), 1) y_pred = K.clip(y_pred, K.epsilon(), 1) and the output is K.sum(y_true * K.log(y_true / y_pred), axis=-1)
- T
The numeric type in the criterion, usually Float or Double
-
class
L1Cost[T] extends TensorCriterion[T]
Computes the L1 norm of the input, and the sign of the input.
- Annotations
- @SerialVersionUID()
-
class
L1HingeEmbeddingCriterion[T] extends AbstractCriterion[Table, Tensor[T], T]
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors, and a label y (1 or -1):
- Annotations
- @SerialVersionUID()
-
class
L1Penalty[T] extends TensorModule[T]
Adds an L1 penalty to an input (for sparsity). L1Penalty is an inline module that, in its forward propagation, copies the input Tensor directly to the output, computes an L1 loss of the latent state (input), and stores it in the module's loss field. During backward propagation: gradInput = gradOutput + gradLoss.
- Annotations
- @SerialVersionUID()
-
class
LSTM[T] extends Cell[T]
Long Short Term Memory architecture. References:
A. http://arxiv.org/pdf/1303.5778v1 (blueprint for this module)
B. http://web.eecs.utk.edu/~itamar/courses/ECE-692/Bobby_paper1.pdf
C. http://arxiv.org/pdf/1503.04069v1.pdf
D. https://github.com/wojzaremba/lstm
- Annotations
- @SerialVersionUID()
-
class
LSTMPeephole[T] extends Cell[T]
Long Short Term Memory architecture with peephole. References:
A. http://arxiv.org/pdf/1303.5778v1 (blueprint for this module)
B. http://web.eecs.utk.edu/~itamar/courses/ECE-692/Bobby_paper1.pdf
C. http://arxiv.org/pdf/1503.04069v1.pdf
D. https://github.com/wojzaremba/lstm
- Annotations
- @SerialVersionUID()
-
class
LayerNormalization[T] extends BaseModule[T]
Applies layer normalization.
-
class
LeakyReLU[T] extends TensorModule[T]
It is a transfer module that applies LeakyReLU, whose parameter negval sets the slope of the negative part. LeakyReLU is defined as: f(x) = max(0, x) + negval * min(0, x)
- Annotations
- @SerialVersionUID()
-
class
Linear[T] extends TensorModule[T] with Initializable with MklInt8Convertible
The `Linear` module applies a linear transformation to the input data, i.e. `y = Wx + b`. The `input` given in `forward(input)` must be either a vector (1D tensor) or a matrix (2D tensor). If the input is a vector, it must have the size of `inputSize`. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows is the batch size and the number of columns should be equal to `inputSize`).
- Annotations
- @SerialVersionUID()
- class LocallyConnected1D[T] extends TensorModule[T] with Initializable
-
class
LocallyConnected2D[T] extends TensorModule[T] with Initializable
The LocallyConnected2D layer works similarly to the SpatialConvolution layer, except that weights are unshared, that is, a different set of filters is applied at each different patch of the input.
- T
The numeric type in the criterion, usually Float or Double
-
class
Log[T] extends TensorModule[T]
The Log module applies a log transformation to the input data
- Annotations
- @SerialVersionUID()
-
class
LogSigmoid[T] extends TensorModule[T]
This class is a transform layer corresponding to the log-sigmoid function: f(x) = log(1 / (1 + e^(-x)))
- Annotations
- @SerialVersionUID()
-
class
LogSoftMax[T] extends TensorModule[T]
The LogSoftMax module applies a LogSoftMax transformation to the input data, which is defined as: f_i(x) = log(exp(x_i) / a) where a = sum_j[exp(x_j)].
The input given in `forward(input)` must be either a vector (1D tensor) or a matrix (2D tensor).
- Annotations
- @SerialVersionUID()
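The LogSoftMax definition, f_i(x) = log(exp(x_i) / sum_j exp(x_j)), can be evaluated as x_i minus the log-sum-exp of x. The sketch below does this for a single vector in plain Scala and subtracts the maximum first to avoid overflow, which is a common numerical trick and not something the description above states about the library. The names `LogSoftMaxSketch` and `logSoftMax` are hypothetical.

```scala
object LogSoftMaxSketch {
  /** f_i(x) = x_i - log(sum_j exp(x_j)), computed with the max subtracted
    * before exponentiation so large inputs do not overflow. */
  def logSoftMax(x: Array[Double]): Array[Double] = {
    require(x.nonEmpty, "input must not be empty")
    val m = x.max
    val logSumExp = m + math.log(x.map(v => math.exp(v - m)).sum)
    x.map(_ - logSumExp)
  }
}
```

Exponentiating the result gives a vector that sums to 1, which is a quick sanity check on any implementation.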
-
class
LookupTable[T] extends TensorModule[T] with Initializable
This layer is a particular case of a convolution, where the width of the convolution would be 1. The input should be a 1D or 2D tensor filled with indices. The indices correspond to positions in the weight. For each index element of the input, it outputs the selected part of the weight. Elements of the input should be in the range 1 to nIndex. This layer is often used for word embedding.
- T
The numeric type in the criterion, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
LookupTableSparse[T] extends AbstractModule[Activity, Tensor[T], T] with Initializable
LookupTable for multi-values. Also called embedding_lookup_sparse in TensorFlow.
The input of LookupTableSparse should be a 2D SparseTensor or two 2D SparseTensors. If the input is a single SparseTensor, its values are positive integer ids, and the values in each row of this SparseTensor will be turned into a dense vector. If the input is two SparseTensors, the first tensor should hold the integer ids, just like the single SparseTensor input, and the second tensor holds the corresponding weights of the integer ids.
-
class
MM[T] extends AbstractModule[Table, Tensor[T], T]
Module to perform matrix multiplication on two mini-batch inputs, producing a mini-batch.
- Annotations
- @SerialVersionUID()
-
class
MSECriterion[T] extends TensorCriterion[T]
The mean squared error criterion, e.g. for input a, target b and n total elements: loss(a, b) = 1/n * sum_i |a_i - b_i|^2. sizeAverage is true by default, which divides the sum of squared errors by n.
- Annotations
- @SerialVersionUID()
-
class
MV[T] extends AbstractModule[Table, Tensor[T], T]
It is a module to perform matrix vector multiplication on two mini-batch inputs, producing a mini-batch.
- Annotations
- @SerialVersionUID()
-
class
MapTable[T] extends DynamicContainer[Table, Table, T]
This class is a container for a single module which will be applied to all input elements. The member module is cloned as necessary to process all input elements.
- Annotations
- @SerialVersionUID()
-
class
MarginCriterion[T] extends TensorCriterion[T]
Creates a criterion that optimizes a two-class classification (squared) hinge loss (margin-based loss) between input x (a Tensor of dimension 1) and output y.
When margin = 1, sizeAverage = True and squared = False, this is the same as hinge loss in keras; When margin = 1, sizeAverage = False and squared = True, this is the same as squared_hinge loss in keras.
- Annotations
- @SerialVersionUID()
-
class
MarginRankingCriterion[T] extends AbstractCriterion[Table, Table, T]
Creates a criterion that measures the loss given an input x = {x1, x2}, a table of two Tensors of size 1 (they contain only scalars), and a label y (1 or -1). In batch mode, x is a table of two Tensors of size batchsize, and y is a Tensor of size batchsize containing 1 or -1 for each corresponding pair of elements in the input Tensor. If y == 1, it is assumed that the first input should be ranked higher (have a larger value) than the second input, and vice versa for y == -1.
- Annotations
- @SerialVersionUID()
- class MaskHead extends BaseModule[Float]
-
class
MaskedSelect[T] extends AbstractModule[Table, Tensor[T], T]
Performs a torch.MaskedSelect on a Tensor. The mask is supplied as a tabular argument with the input on the forward and backward passes.
- Annotations
- @SerialVersionUID()
-
class
Masking[T] extends TensorModule[T]
Uses a mask value to skip timesteps for a sequence.
-
class
Max[T] extends TensorModule[T]
Applies a max operation over dimension `dim`.
- Annotations
- @SerialVersionUID()
-
class
Maxout[T] extends TensorModule[T]
A linear maxout layer. The Maxout layer selects the element-wise maximum value of maxoutNumber Linear(inputSize, outputSize) layers.
-
class
Mean[T] extends Sum[T]
It is a simple layer which applies a mean operation over the given dimension. When nInputDims is provided, the input will be considered as batches. Then the mean operation will be applied in (dimension + 1).
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using `nInputDims`.
- Annotations
- @SerialVersionUID()
-
class
MeanAbsolutePercentageCriterion[T] extends TensorCriterion[T]
This criterion is the same as the `mean_absolute_percentage_error` loss in Keras. It calculates diff = K.abs((y - x) / K.clip(K.abs(y), K.epsilon(), Double.MaxValue)) and returns 100 * K.mean(diff) as the output. Here, x and y may or may not have a batch dimension.
- T
The numeric type in the criterion, usually Float or Double
-
class
MeanSquaredLogarithmicCriterion[T] extends TensorCriterion[T]
This criterion is the same as the `mean_squared_logarithmic_error` loss in Keras. It calculates: first_log = K.log(K.clip(y, K.epsilon(), Double.MaxValue) + 1.) second_log = K.log(K.clip(x, K.epsilon(), Double.MaxValue) + 1.) and outputs K.mean(K.square(first_log - second_log)). Here, x and y may or may not have a batch dimension.
- T
The numeric type in the criterion, usually Float or Double
-
class
Min[T] extends TensorModule[T]
Applies a min operation over dimension `dim`.
- Annotations
- @SerialVersionUID()
-
class
MixtureTable[T] extends AbstractModule[Table, Tensor[T], T]
Creates a module that takes a table {gater, experts} as input and outputs the mixture of experts (a Tensor or table of Tensors) using a gater Tensor. When dim is provided, it specifies the dimension of the experts Tensor that will be interpolated (or mixed). Otherwise, the experts should take the form of a table of Tensors. This module works for experts of dimension 1D or more, and for a 1D or 2D gater, i.e. for single examples or mini-batches.
- T
Numeric type. Only Float and Double are supported now.
- Annotations
- @SerialVersionUID()
-
trait
MklInt8Convertible extends AnyRef
Trait which provides MKL-DNN functionality to convert from FP32 to INT8
-
case class
MsraFiller(varianceNormAverage: Boolean = true) extends InitializationMethod with Product with Serializable
A Filler based on the paper [He, Zhang, Ren and Sun 2015]: Specifically accounts for ReLU nonlinearities.
Aside: for another perspective on the scaling factor, see the derivation of [Saxe, McClelland, and Ganguli 2013 (v3)].
It fills the incoming matrix by randomly sampling Gaussian data with std = sqrt(2 / n) where n is the fanIn, fanOut, or their average, depending on the varianceNormAverage parameter.
- varianceNormAverage
Whether VarianceNorm uses the average of fanIn and fanOut, or just fanOut.
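As a rough illustration of std = sqrt(2 / n), the plain-Scala sketch below computes the standard deviation and fills a matrix with Gaussian samples. It follows the varianceNormAverage parameter description (average of fanIn and fanOut when true, fanOut otherwise); that mapping, the names `MsraSketch`, `msraStd` and `fill`, and the fixed random seed are assumptions, not the library's code.

```scala
import scala.util.Random

object MsraSketch {
  /** std = sqrt(2 / n); n is the average of fanIn and fanOut when
    * varianceNormAverage is true, and fanOut otherwise (assumed mapping). */
  def msraStd(fanIn: Int, fanOut: Int, varianceNormAverage: Boolean = true): Double = {
    val n = if (varianceNormAverage) (fanIn + fanOut) / 2.0 else fanOut.toDouble
    require(n > 0, "fan sizes must be positive")
    math.sqrt(2.0 / n)
  }

  /** Fills a rows x cols matrix with zero-mean Gaussian samples of that std. */
  def fill(rows: Int, cols: Int, fanIn: Int, fanOut: Int,
           varianceNormAverage: Boolean = true,
           rng: Random = new Random(0)): Array[Array[Double]] = {
    val std = msraStd(fanIn, fanOut, varianceNormAverage)
    Array.fill(rows, cols)(rng.nextGaussian() * std)
  }
}
```

With fanIn = 100 and fanOut = 300 the average is 200, so the standard deviation is sqrt(2 / 200) = 0.1.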
-
class
Mul[T] extends TensorModule[T] with Initializable
Multiplies the incoming data by a single scalar factor.
- Annotations
- @SerialVersionUID()
-
class
MulConstant[T] extends TensorModule[T]
Multiplies input Tensor by a (non-learnable) scalar constant. This module is sometimes useful for debugging purposes.
- Annotations
- @SerialVersionUID()
-
class
MultiCriterion[T] extends AbstractCriterion[Activity, Activity, T]
A weighted sum of other criterions, each applied to the same input and target.
- Annotations
- @SerialVersionUID()
-
class
MultiLabelMarginCriterion[T] extends TensorCriterion[T]
Creates a criterion that optimizes a multi-class multi-classification hinge loss (margin-based loss) between input x and output y (which is a Tensor of target class indices).
- Annotations
- @SerialVersionUID()
-
class
MultiLabelSoftMarginCriterion[T] extends TensorCriterion[T]
A MultiLabel multiclass criterion based on sigmoid:
the loss is: l(x,y) = - sum_i (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i])) where p[i] = exp(x[i]) / (1 + exp(x[i]))
and with weights: l(x,y) = - sum_i weights[i] * (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))
- Annotations
- @SerialVersionUID()
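The two MultiLabelSoftMarginCriterion formulas can be written out directly in plain Scala. The sketch below evaluates them exactly as stated for one sample; any averaging the real class may apply is not modelled, and the names `MultiLabelSoftMarginSketch` and `loss` are hypothetical.

```scala
object MultiLabelSoftMarginSketch {
  /** l(x, y) = - sum_i w[i] * (y[i] * log(p[i]) + (1 - y[i]) * log(1 - p[i]))
    * with p[i] = exp(x[i]) / (1 + exp(x[i])); weights default to 1. */
  def loss(x: Array[Double], y: Array[Double],
           weights: Option[Array[Double]] = None): Double = {
    require(x.length == y.length, "input and target must have the same length")
    weights.foreach(w => require(w.length == x.length, "one weight per class"))
    val terms = x.zip(y).map { case (xi, yi) =>
      val p = 1.0 / (1.0 + math.exp(-xi)) // same value as exp(x) / (1 + exp(x))
      yi * math.log(p) + (1.0 - yi) * math.log(1.0 - p)
    }
    val weighted = weights match {
      case Some(w) => terms.zip(w).map { case (t, wi) => t * wi }
      case None    => terms
    }
    -weighted.sum
  }
}
```

For x = (0, 0) every p[i] is 0.5, so each class contributes log(2) to the unweighted loss regardless of its label.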
-
class
MultiMarginCriterion[T] extends TensorCriterion[T]
Creates a criterion that optimizes a multi-class classification hinge loss (margin-based loss) between input x and output y (which is a target class index).
- Annotations
- @SerialVersionUID()
-
class
MultiRNNCell[T] extends Cell[T]
Enables the user to stack multiple simple cells.
-
class
Narrow[T] extends TensorModule[T]
Narrow is an application of the narrow operation in a module. The module further supports a negative length in order to handle inputs with an unknown size.
- Annotations
- @SerialVersionUID()
-
class
NarrowTable[T] extends AbstractModule[Table, Table, T]
Creates a module that takes a table as input and outputs the subtable starting at index offset having length elements (defaults to 1 element). The elements can be either a table or a Tensor. If `length` is negative, it selects the elements from the offset to the abs(length)-th element from the end of the input.
- Annotations
- @SerialVersionUID()
-
class
Negative[T] extends AbstractModule[Tensor[_], Tensor[_], T]
Computes the negative value of each element of the input tensor.
- T
Numeric type of the parameters (e.g. weight, bias). Only Float and Double are supported now.
-
class
NegativeEntropyPenalty[T] extends TensorModule[T]
Penalizes the input multinomial distribution if it has low entropy. The input to this layer should be a batch of vectors, each representing a multinomial distribution. The input is typically the output of a softmax layer.
For forward, the output is the same as input and a NegativeEntropy loss of the latent state will be calculated each time. For backward, gradInput = gradOutput + gradLoss
This can be used in reinforcement learning to discourage the policy from collapsing to a single action for a given state, which improves exploration. See the A3C paper for more detail (https://arxiv.org/pdf/1602.01783.pdf).
- Annotations
- @SerialVersionUID()
-
class
Nms extends Serializable
Non-Maximum Suppression (NMS) for object detection. The goal of NMS is to solve the problem of groups of several detections clustering near the real location, ideally leaving only one detection per object.
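One common way to reduce clustered detections to one per object is greedy suppression by intersection-over-union (IoU): keep the highest-scoring box and drop every lower-scoring box that overlaps a kept one by more than a threshold. The plain-Scala sketch below shows that standard greedy scheme; it is not taken from this class, and `NmsSketch`, `Box`, `iou` and `nms` are hypothetical names.

```scala
object NmsSketch {
  /** Axis-aligned box given by its top-left and bottom-right corners. */
  final case class Box(x1: Double, y1: Double, x2: Double, y2: Double) {
    def area: Double = math.max(0.0, x2 - x1) * math.max(0.0, y2 - y1)
  }

  /** Intersection over union of two boxes; 0 when they do not overlap. */
  def iou(a: Box, b: Box): Double = {
    val w = math.max(0.0, math.min(a.x2, b.x2) - math.max(a.x1, b.x1))
    val h = math.max(0.0, math.min(a.y2, b.y2) - math.max(a.y1, b.y1))
    val inter = w * h
    val union = a.area + b.area - inter
    if (union <= 0.0) 0.0 else inter / union
  }

  /** Returns the indices of the kept boxes, highest score first. */
  def nms(boxes: Seq[Box], scores: Seq[Double], threshold: Double): Seq[Int] = {
    require(boxes.length == scores.length, "one score per box")
    val order = scores.indices.sortBy(i => -scores(i))
    order.foldLeft(Vector.empty[Int]) { (kept, i) =>
      // Drop box i if it overlaps an already kept box too much.
      if (kept.exists(k => iou(boxes(k), boxes(i)) > threshold)) kept else kept :+ i
    }
  }
}
```

The threshold trades recall for duplicates: a lower value suppresses more aggressively, a higher value keeps more overlapping boxes.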
-
class
Normalize[T] extends TensorModule[T]
Normalizes the input Tensor to have unit L_p norm. The smoothing parameter eps prevents division by zero when the input contains all zero elements (default = 1e-10). The input can be 1D, 2D or 4D. If the input is 4D, it should follow the format (n, c, h, w), where n is the batch number, c is the channel number, h is the height and w is the width.
- T
The numeric type in the criterion, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
NormalizeScale[T] extends TensorModule[T]
NormalizeScale is composed of normalize and scale; this is equivalent to the Caffe Normalize layer.
- T
The numeric type in the criterion, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
PGCriterion[T] extends TensorCriterion[T]
The Criterion to compute the negative policy gradient given a multinomial distribution and the sampled action and reward.
The input to this criterion should be a 2-D tensor representing a batch of multinomial distributions. The target should also be a 2-D tensor with the same size as the input, representing the sampled action and reward/advantage: the index of the non-zero element in each vector represents the sampled action, and the non-zero element itself represents the reward. If the action space is large, you should consider using a SparseTensor for the target.
The loss computed is simply the standard policy gradient,
loss = - 1/n * sum(R_{n} dot_product log(P_{n}))
where R_{n} is the reward vector, and P_{n} is the input distribution.
- Annotations
- @SerialVersionUID()
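The PGCriterion loss, - 1/n * sum(R_{n} dot_product log(P_{n})), can be evaluated with plain arrays as in the sketch below. Entries with zero reward are skipped so that a zero probability at a non-sampled action does not produce NaN; that guard and the names `PgSketch` and `loss` are assumptions made for the illustration.

```scala
object PgSketch {
  /** loss = -1/n * sum_n (R_n . log(P_n)), where each row of `probs` is a
    * multinomial distribution and each row of `rewards` is zero except at
    * the sampled action, which holds the reward. */
  def loss(probs: Array[Array[Double]], rewards: Array[Array[Double]]): Double = {
    require(probs.nonEmpty && probs.length == rewards.length, "batch sizes must match")
    val total = probs.zip(rewards).map { case (p, r) =>
      require(p.length == r.length, "each target row must match its input row")
      // Only the non-zero reward entries contribute to the dot product.
      p.zip(r).collect { case (pi, ri) if ri != 0.0 => ri * math.log(pi) }.sum
    }.sum
    -total / probs.length
  }
}
```

For a single sample with P = (0.5, 0.5) and a reward of 2 on the first action, the loss is -2 * log(0.5) = 2 * log(2).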
-
class
PReLU[T] extends TensorModule[T] with Initializable
Applies parametric ReLU, whose parameter varies the slope of the negative part.
PReLU: f(x) = max(0, x) + a * min(0, x)
nOutputPlane's default value is 0, which means using PReLU in the shared version with only one parameter.
Notice: Please don't use weight decay on this.
- Annotations
- @SerialVersionUID()
-
class
Pack[T] extends AbstractModule[Activity, Tensor[_], T]
Stacks a list of n-dimensional tensors into one (n+1)-dimensional tensor.
- T
Numeric type. Only Float and Double are supported now.
- Annotations
- @SerialVersionUID()
-
class
Padding[T] extends TensorModule[T]
This module adds pad units of padding to dimension dim of the input. If pad is negative, padding is added to the left; otherwise, it is added to the right of the dimension.
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.
- Annotations
- @SerialVersionUID()
-
class
PairwiseDistance[T] extends AbstractModule[Table, Tensor[T], T]
It is a module that takes a table of two vectors as input and outputs the distance between them using the p-norm. The input given in `forward(input)` is a Table that contains two tensors, each of which must be either a vector (1D tensor) or a matrix (2D tensor). If the input is a vector, it must have the size of `inputSize`. If it is a matrix, then each row is assumed to be an input sample of the given batch (the number of rows is the batch size and the number of columns should be equal to `inputSize`).
- Annotations
- @SerialVersionUID()
-
class
ParallelCriterion[T] extends AbstractCriterion[Table, Table, T]
ParallelCriterion is a weighted sum of other criterions, each applied to a different input and target. Set repeatTarget = true to share the target across the criterions.
Use the add(criterion[, weight]) method to add a criterion, where weight is a scalar (default 1).
- Annotations
- @SerialVersionUID()
-
class
ParallelTable[T] extends DynamicContainer[Table, Table, T]
It is a container module that applies the i-th member module to the i-th input, and outputs an output in the form of Table
- Annotations
- @SerialVersionUID()
-
class
PoissonCriterion[T] extends TensorCriterion[T]
This class is the same as the `Poisson` loss in Keras. The loss is calculated as: K.mean(y_pred - y_true * K.log(y_pred + K.epsilon()), axis=-1)
- T
The numeric type in the criterion, usually Float or Double
-
class
Pooler[T] extends AbstractModule[Table, Tensor[T], T]
Pooler selects the feature map which matches the size of RoI for RoIAlign
-
class
Power[T] extends TensorModule[T]
Applies an element-wise power operation with scale and shift.
f(x) = (shift + scale * x)^power
- Annotations
- @SerialVersionUID()
-
class
PriorBox[T] extends AbstractModule[Activity, Tensor[T], T]
Generates the prior boxes of designated sizes and aspect ratios across all dimensions (H * W). Intended for use with the MultiBox detection method to generate prior
- T
Numeric type. Only Float and Double are supported now.
- Annotations
- @SerialVersionUID()
-
class
Proposal extends AbstractModule[Table, Tensor[Float], Float]
Outputs object detection proposals by applying estimated bounding-box transformations to a set of regular boxes (called "anchors").
rois: holds R regions of interest, each a 5-tuple (n, x1, y1, x2, y2) specifying an image batch index n and a rectangle (x1, y1, x2, y2).
scores: holds the scores for the R regions of interest.
- Annotations
- @SerialVersionUID()
-
class
RReLU[T] extends TensorModule[T]
Applies the randomized leaky rectified linear unit (RReLU) element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Informally the RReLU is also known as 'insanity' layer. RReLU is defined as: f(x) = max(0,x) + a * min(0, x) where a ~ U(l, u). In training mode negative inputs are multiplied by a factor drawn from a uniform random distribution U(l, u). In evaluation mode a RReLU behaves like a LeakyReLU with a constant mean factor a = (l + u) / 2. By default, l = 1/8 and u = 1/3. If l == u a RReLU effectively becomes a LeakyReLU. Regardless of operating in in-place mode a RReLU will internally allocate an input-sized noise tensor to store random factors for negative inputs. The backward() operation assumes that forward() has been called before. For reference see [Empirical Evaluation of Rectified Activations in Convolutional Network](http://arxiv.org/abs/1505.00853).
- T
data type
- Annotations
- @SerialVersionUID()
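The training and evaluation behaviour described for RReLU can be sketched on a plain array as follows. The names `RReluSketch` and `forward`, the `training` flag and the seeded random generator are assumptions made for the illustration; the noise tensor and the backward pass mentioned in the description are not modelled.

```scala
import scala.util.Random

object RReluSketch {
  /** f(x) = max(0, x) + a * min(0, x). In training mode a ~ U(lower, upper)
    * is drawn per negative element; in evaluation mode a = (lower + upper) / 2. */
  def forward(x: Array[Double],
              lower: Double = 1.0 / 8,
              upper: Double = 1.0 / 3,
              training: Boolean = false,
              rng: Random = new Random(0)): Array[Double] = {
    require(lower <= upper, "lower must not exceed upper")
    x.map { v =>
      if (v >= 0) v
      else {
        val a =
          if (training) lower + (upper - lower) * rng.nextDouble()
          else (lower + upper) / 2
        a * v
      }
    }
  }
}
```

With the default bounds the evaluation-mode slope is (1/8 + 1/3) / 2 = 11/48, and setting lower equal to upper makes both modes behave like a LeakyReLU with that slope.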
-
case class
RandomNormal(mean: Double, stdv: Double) extends InitializationMethod with Product with Serializable
Initializer that generates tensors with a normal distribution.
-
case class
RandomUniform(lower: Double, upper: Double) extends InitializationMethod with Product with Serializable
Initializer that generates tensors with a uniform distribution.
It draws samples from a uniform distribution within [lower, upper]
-
class
ReLU[T] extends Threshold[T] with MklInt8Convertible
Applies the rectified linear unit (ReLU) function element-wise to the input Tensor, so the output is a Tensor of the same dimension. The ReLU function is defined as: f(x) = max(0, x)
- Annotations
- @SerialVersionUID()
-
class
ReLU6[T] extends HardTanh[T]
Same as ReLU except that the rectifying function f(x) saturates at x = 6. ReLU6 is defined as: `f(x) = min(max(0, x), 6)`
- Annotations
- @SerialVersionUID()
-
class
Recurrent[T] extends DynamicContainer[Tensor[T], Tensor[T], T]
The Recurrent module is a container of RNN cells. Different types of RNN cells can be added using the add() function.
The Recurrent module includes a masking mechanism: if the `maskZero` variable is set to true, the `Recurrent` module will not consider zero vector inputs. For each time step input, if a certain row is a zero vector (all the elements of the vector equal zero), then the output of that row at this time step is a zero vector, and the hidden state of that row at this time step is the same as the corresponding row of the hidden state of the previous step.
-
class
RecurrentDecoder[T] extends Recurrent[T]
The RecurrentDecoder module is a container of RNN cells that is used to make a prediction of the next timestep based on the prediction made at the previous timestep. The input for RecurrentDecoder is dynamically composed during training: the input at t(i) is the output at t(i-1), the input at t(0) is the user input, and the user input has to be of shape batch x stepShape (the shape of the input at a single time step).
Different types of rnn cells can be added using add() function.
-
class
RegionProposal extends AbstractModule[Table, Table, Float]
Layer for RPN computation. Takes feature maps from the backbone and outputs RPN proposals and losses.
-
class
Replicate[T] extends TensorModule[T]
Replicate repeats the input `nFeatures` times along its `dim` dimension.
Notice: No memory copy is made; it sets the stride along the `dim`-th dimension to zero.
- Annotations
- @SerialVersionUID()
-
class
Reshape[T] extends TensorModule[T]
The `forward(input)` reshapes the input tensor into a `size(0) * size(1) * ...` tensor, taking the elements row-wise.
- Annotations
- @SerialVersionUID()
-
class
ResizeBilinear[T] extends AbstractModule[Tensor[Float], Tensor[Float], T]
Resizes the input image with bilinear interpolation. The input image must be a float tensor with NHWC or NCHW layout.
- T
Numeric type of the parameters (e.g. weight, bias). Only Float and Double are supported now.
-
class
Reverse[T] extends TensorModule[T]
Reverses the input with respect to the given dimension. The input can be a Tensor or a Table.
- T
Numeric type. Only Float and Double are supported now.
-
class
RnnCell[T] extends Cell[T]
Implementation of a vanilla recurrent neural network cell. i2h: weight matrix of input to hidden units. h2h: weight matrix of hidden units to themselves through time. The updating is defined as: h_t = f(i2h * x_t + h2h * h_{t-1})
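The update rule h_t = f(i2h * x_t + h2h * h_{t-1}) can be unrolled over a sequence as in the plain-Scala sketch below. Biases are left out because the formula above has none, tanh is only an assumed default for f, and the names `RnnCellSketch`, `step` and `run` are hypothetical.

```scala
object RnnCellSketch {
  private def matVec(m: Array[Array[Double]], v: Array[Double]): Array[Double] =
    m.map(row => row.zip(v).map { case (a, b) => a * b }.sum)

  /** One step: h_t = f(i2h * x_t + h2h * h_{t-1}), with f applied element-wise. */
  def step(i2h: Array[Array[Double]], h2h: Array[Array[Double]],
           x: Array[Double], hPrev: Array[Double],
           f: Double => Double = v => math.tanh(v)): Array[Double] =
    matVec(i2h, x).zip(matVec(h2h, hPrev)).map { case (a, b) => f(a + b) }

  /** Applies the step over a whole sequence and returns every hidden state. */
  def run(i2h: Array[Array[Double]], h2h: Array[Array[Double]],
          xs: Seq[Array[Double]], h0: Array[Double]): Seq[Array[Double]] =
    xs.scanLeft(h0)((h, x) => step(i2h, h2h, x, h)).tail
}
```

`run` returns one hidden state per input step; the initial state `h0` is dropped from the result.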
-
class
RoiAlign[T] extends AbstractModule[Activity, Tensor[T], T]
Region of interest aligning (RoIAlign) for Mask-RCNN.
RoIAlign uses average pooling on bilinear-interpolated sub-windows to convert the features inside any valid region of interest into a small feature map with a fixed spatial extent of pooledH * pooledW (e.g., 7 * 7). An RoI is a rectangular window into a conv feature map. Each RoI is defined by a four-tuple (x1, y1, x2, y2) that specifies its top-left corner (x1, y1) and its bottom-right corner (x2, y2). RoIAlign works by dividing the h * w RoI window into a pooledH * pooledW grid of sub-windows of approximate size h/pooledH * w/pooledW. In each sub-window, it computes exact values of the input features at four regularly sampled locations, and then applies average pooling to the values in each sub-window. Pooling is applied independently to each feature map channel.
-
class
RoiPooling[T] extends AbstractModule[Table, Tensor[T], T]
Region of interest pooling. RoIPooling uses max pooling to convert the features inside any valid region of interest into a small feature map with a fixed spatial extent of pooledH × pooledW (e.g., 7 × 7). An RoI is a rectangular window into a conv feature map. Each RoI is defined by a four-tuple (x1, y1, x2, y2) that specifies its top-left corner (x1, y1) and its bottom-right corner (x2, y2). RoI max pooling works by dividing the h × w RoI window into a pooledH × pooledW grid of sub-windows of approximate size h/pooledH × w/pooledW and then max-pooling the values in each sub-window into the corresponding output grid cell. Pooling is applied independently to each feature map channel.
- T
Numeric type. Only Float and Double are supported now.
-
class
SReLU[T] extends TensorModule[T] with Initializable
S-shaped Rectified Linear Unit. It follows:
f(x) = tr + ar(x - tr) for x >= tr,
f(x) = x for tr > x > tl,
f(x) = tl + al(x - tl) for x <= tl.
[Deep Learning with S-shaped Rectified Linear Activation Units](http://arxiv.org/abs/1512.07030)
- Annotations
- @SerialVersionUID()
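The three SReLU branches translate directly into a scalar function. In the plain-Scala sketch below, tl, al, tr and ar are passed as scalar arguments for simplicity; how the actual module stores or learns them is not shown here, and `SReluSketch` and `srelu` are hypothetical names.

```scala
object SReluSketch {
  /** Piecewise S-shaped ReLU:
    *   x >= tr      -> tr + ar * (x - tr)
    *   tl < x < tr  -> x
    *   x <= tl      -> tl + al * (x - tl) */
  def srelu(x: Double, tl: Double, al: Double, tr: Double, ar: Double): Double = {
    require(tl <= tr, "the left threshold must not exceed the right threshold")
    if (x >= tr) tr + ar * (x - tr)
    else if (x <= tl) tl + al * (x - tl)
    else x
  }
}
```

The function is continuous at both thresholds, since each outer branch evaluates to the threshold itself when x equals it.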
-
class
Scale[T] extends AbstractModule[Tensor[T], Tensor[T], T]
Scale is the combination of cmul and cadd. It computes the elementwise product of the input and the weight, with the shape of the weight expanded to match the shape of the input. Similarly, the bias is expanded to match the shape of the input and added elementwise.
- T
Numeric type. Only Float and Double are supported now.
-
class
Select[T] extends TensorModule[T]
A simple layer that selects an index of the input tensor in the given dimension.
- Annotations
- @SerialVersionUID()
-
class
SelectTable[T] extends AbstractModule[Table, Activity, T]
Creates a module that takes a table as input and outputs the element at the position given by index (positive or negative). This can be either a table or a Tensor. The gradients of the non-index elements are zeroed Tensors of the same size. This is true regardless of the depth of the encapsulated Tensor, as the function used internally to do so is recursive.
- Annotations
- @SerialVersionUID()
-
class
SequenceBeamSearch[T] extends AbstractModule[Table, Activity, T]
Beam search to find the translated sequence with the highest probability.
-
class
Sequential[T] extends DynamicContainer[Activity, Activity, T] with MklInt8Convertible
Sequential provides a means to plug layers together in a feed-forward fully connected manner.
- Annotations
- @SerialVersionUID()
-
class
Sigmoid[T] extends TensorModule[T]
Applies the Sigmoid function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
Applies the Sigmoid function element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Sigmoid is defined as: f(x) = 1 / (1 + exp(-x))
- Annotations
- @SerialVersionUID()
-
class
SmoothL1Criterion[T] extends TensorCriterion[T]
Creates a criterion that can be thought of as a smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see "Fast R-CNN" paper by Ross Girshick).
loss(x, y) = 1/n * sum_i z_i
z_i = 0.5 * (x_i - y_i)^2, if |x_i - y_i| < 1
z_i = |x_i - y_i| - 0.5, otherwise
If x and y are d-dimensional Tensors with a total of n elements, the sum still runs over all the elements and is divided by n. The division by n can be avoided by setting the internal variable sizeAverage to false.
- Annotations
- @SerialVersionUID()
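The smooth L1 loss described for SmoothL1Criterion can be sketched over flat arrays as follows. This is an illustration only: it works on plain arrays rather than Tensors, and the names `SmoothL1Sketch` and `smoothL1` are hypothetical.

```scala
object SmoothL1Sketch {
  def smoothL1(x: Array[Double], y: Array[Double], sizeAverage: Boolean = true): Double = {
    require(x.length == y.length && x.nonEmpty)
    val total = x.zip(y).map { case (xi, yi) =>
      val d = math.abs(xi - yi)
      if (d < 1.0) 0.5 * d * d // squared term for small errors
      else d - 0.5             // linear term for large errors
    }.sum
    // Divide by the number of elements unless sizeAverage is false.
    if (sizeAverage) total / x.length else total
  }
}
```

The quadratic branch keeps gradients small near zero, while the linear branch bounds the gradient magnitude for outliers.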
-
class
SmoothL1CriterionWithWeights[T] extends AbstractCriterion[Tensor[T], Table, T]
A smooth version of the AbsCriterion. It uses a squared term if the absolute element-wise error falls below 1. It is less sensitive to outliers than the MSECriterion and in some cases prevents exploding gradients (e.g. see "Fast R-CNN" paper by Ross Girshick).
d = (x - y) * w_in
loss(x, y, w_in, w_out) = 1/n * sum_i z_i
z_i = 0.5 * (sigma * d_i)^2 * w_out, if |d_i| < 1 / sigma / sigma
z_i = (|d_i| - 0.5 / sigma / sigma) * w_out, otherwise
-
class
SoftMarginCriterion[T] extends TensorCriterion[T]
Creates a criterion that optimizes a two-class classification logistic loss between input x (a Tensor of dimension 1) and output y (which is a tensor containing either 1s or -1s).
loss(x, y) = sum_i (log(1 + exp(-y[i]*x[i]))) / x:nElement()
- Annotations
- @SerialVersionUID()
-
class
SoftMax[T] extends TensorModule[T]
Applies the SoftMax function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0, 1) and sum to 1.
Applies the SoftMax function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0, 1) and sum to 1. Softmax is defined as: f_i(x) = exp(x_i - shift) / sum_j exp(x_j - shift) where shift = max_i(x_i).
- Annotations
- @SerialVersionUID()
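The shifted SoftMax definition above can be sketched for a 1D input as follows. Subtracting the maximum leaves the result unchanged mathematically but keeps exp from overflowing. The names `SoftMaxSketch` and `softmax` are hypothetical, and plain arrays stand in for Tensors.

```scala
object SoftMaxSketch {
  def softmax(x: Array[Double]): Array[Double] = {
    require(x.nonEmpty)
    val shift = x.max                             // shift = max_i(x_i)
    val exps = x.map(v => math.exp(v - shift))    // every exponent is <= 0
    val sum = exps.sum
    exps.map(_ / sum)
  }
}
```

Under the same assumptions, SoftMin is the same computation applied to the negated input.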
-
class
SoftMin[T] extends TensorModule[T]
Applies the SoftMin function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1.
Applies the SoftMin function to an n-dimensional input Tensor, rescaling them so that the elements of the n-dimensional output Tensor lie in the range (0,1) and sum to 1. Softmin is defined as: f_i(x) = exp(-x_i - shift) / sum_j exp(-x_j - shift) where shift = max_i(-x_i).
- Annotations
- @SerialVersionUID()
-
class
SoftPlus[T] extends TensorModule[T]
Apply the SoftPlus function to an n-dimensional input tensor.
SoftPlus function: f_i(x) = 1/beta * log(1 + exp(beta * x_i))
- Annotations
- @SerialVersionUID()
-
class
SoftShrink[T] extends TensorModule[T]
Apply the soft shrinkage function element-wise to the input Tensor.
SoftShrinkage operator:
f(x) = x - lambda, if x > lambda
f(x) = x + lambda, if x < -lambda
f(x) = 0, otherwise
- Annotations
- @SerialVersionUID()
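A minimal scalar sketch of the soft shrinkage operator described for SoftShrink, with lambda passed explicitly. The names `SoftShrinkSketch` and `softShrink` are hypothetical.

```scala
object SoftShrinkSketch {
  def softShrink(x: Double, lambda: Double): Double =
    if (x > lambda) x - lambda        // shrink positive values toward zero
    else if (x < -lambda) x + lambda  // shrink negative values toward zero
    else 0.0                          // values within [-lambda, lambda] become zero
}
```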
-
class
SoftSign[T] extends TensorModule[T]
Apply SoftSign function to an n-dimensional input Tensor.
SoftSign function: f_i(x) = x_i / (1+|x_i|)
- Annotations
- @SerialVersionUID()
-
class
SoftmaxWithCriterion[T] extends TensorCriterion[T]
Computes the multinomial logistic loss for a one-of-many classification task, passing real-valued predictions through a softmax to get a probability distribution over classes.
Computes the multinomial logistic loss for a one-of-many classification task, passing real-valued predictions through a softmax to get a probability distribution over classes. It should be preferred over separate SoftmaxLayer + MultinomialLogisticLossLayer as its gradient computation is more numerically stable.
-
class
SparseJoinTable[T] extends AbstractModule[Table, Tensor[T], T]
:: Experimental ::
Sparse version of JoinTable. The backward pass simply hands the original gradOutput back to the next layers without splitting it, so this layer may only work in Wide&Deep-like models.
- T
Numeric type of parameter(e.g. weight, bias). Only support float/double now
-
class
SparseLinear[T] extends Linear[T]
SparseLinear is the sparse version of module Linear. SparseLinear differs from Linear in two ways. First, SparseLinear's input Tensor is a SparseTensor. Second, SparseLinear by default does not propagate the gradient to the next layer during backpropagation, as the gradInput of SparseLinear is useless and very large in most cases.
However, for models like Wide&Deep, backwardStart and backwardLength are provided to propagate part of the gradient to the next layer.
-
class
SpatialAveragePooling[T] extends TensorModule[T]
Applies 2D average-pooling operation in kWxkH regions by step size dWxdH steps.
Applies 2D average-pooling operation in kWxkH regions by step size dWxdH steps. The number of output features is equal to the number of input planes.
When padW and padH are both -1, we use a padding algorithm similar to the "SAME" padding of tensorflow. That is
outHeight = Math.ceil(inHeight.toFloat/strideH.toFloat)
outWidth = Math.ceil(inWidth.toFloat/strideW.toFloat)
padAlongHeight = Math.max(0, (outHeight - 1) * strideH + kernelH - inHeight)
padAlongWidth = Math.max(0, (outWidth - 1) * strideW + kernelW - inWidth)
padTop = padAlongHeight / 2
padLeft = padAlongWidth / 2
- Annotations
- @SerialVersionUID()
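The "SAME"-style padding rule quoted for SpatialAveragePooling (used when padW and padH are both -1) can be sketched along a single axis as follows. The names `SamePaddingSketch` and `samePadding` are hypothetical. The trailing padding (`padAlong - padBefore`) is an assumption borrowed from the TensorFlow convention; the text above only states the leading padding.

```scala
object SamePaddingSketch {
  // Returns (outSize, padBefore, padAfter) for one spatial axis.
  def samePadding(inSize: Int, kernel: Int, stride: Int): (Int, Int, Int) = {
    val outSize = math.ceil(inSize.toFloat / stride.toFloat).toInt
    val padAlong = math.max(0, (outSize - 1) * stride + kernel - inSize)
    val padBefore = padAlong / 2          // padTop or padLeft
    val padAfter = padAlong - padBefore   // assumption: the remainder goes at the end
    (outSize, padBefore, padAfter)
  }
}
```

For example, an input of size 5 with a kernel of 3 and a stride of 2 gives an output size of 3 with one element of padding on each side.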
-
class
SpatialBatchNormalization[T] extends BatchNormalization[T]
This file implements Batch Normalization as described in the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift" by Sergey Ioffe and Christian Szegedy. This implementation is useful for inputs coming from convolution layers. For non-convolutional layers, see BatchNormalization. The operation implemented is:
y = (x - mean(x)) / standard-deviation(x) * gamma + beta
where gamma and beta are learnable parameters. The learning of gamma and beta is optional.
- Annotations
- @SerialVersionUID()
-
class
SpatialContrastiveNormalization[T] extends TensorModule[T]
Subtractive + divisive contrast normalization.
- Annotations
- @SerialVersionUID()
-
class
SpatialConvolution[T] extends TensorModule[T] with Initializable with MklInt8Convertible
Applies a 2D convolution over an input image composed of several input planes.
Applies a 2D convolution over an input image composed of several input planes. The input tensor in forward(input) is expected to be a 3D tensor (nInputPlane x height x width).
When padW and padH are both -1, we use a padding algorithm similar to the "SAME" padding of tensorflow. That is
outHeight = Math.ceil(inHeight.toFloat/strideH.toFloat)
outWidth = Math.ceil(inWidth.toFloat/strideW.toFloat)
padAlongHeight = Math.max(0, (outHeight - 1) * strideH + kernelH - inHeight)
padAlongWidth = Math.max(0, (outWidth - 1) * strideW + kernelW - inWidth)
padTop = padAlongHeight / 2
padLeft = padAlongWidth / 2
- Annotations
- @SerialVersionUID()
-
class
SpatialConvolutionMap[T] extends TensorModule[T]
This class is a generalization of SpatialConvolution.
This class is a generalization of SpatialConvolution. It uses a generic connection table between input and output features. The SpatialConvolution is equivalent to using a full connection table.
- Annotations
- @SerialVersionUID()
-
class
SpatialCrossMapLRN[T] extends TensorModule[T]
Applies Spatial Local Response Normalization between different feature maps. The operation implemented is:
y_f = x_f / (k + (alpha/size) * sum_{l=l1 to l2} (x_l^2))^beta
where x_f is the input at spatial locations h,w (not shown for simplicity) and feature map f, l1 corresponds to max(0,f-ceil(size/2)) and l2 to min(F, f-ceil(size/2) + size). Here, F is the number of feature maps.
- Annotations
- @SerialVersionUID()
-
class
SpatialDilatedConvolution[T] extends TensorModule[T] with Initializable
Apply a 2D dilated convolution over an input image.
The input tensor is expected to be a 3D or 4D (with batch) tensor.
If input is a 3D tensor nInputPlane x height x width,
owidth = floor((width + 2 * padW - dilationW * (kW-1) - 1) / dW) + 1
oheight = floor((height + 2 * padH - dilationH * (kH-1) - 1) / dH) + 1
Reference Paper: Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions[J]. arXiv preprint arXiv:1511.07122, 2015.
- Annotations
- @SerialVersionUID()
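The output-size rule for SpatialDilatedConvolution can be checked with a small helper along one axis. This is a sketch using the standard dilated-convolution formula, with the floor applied to the division by the stride; the names `DilatedConvSizeSketch` and `outSize` are hypothetical.

```scala
object DilatedConvSizeSketch {
  // in: input width or height; pad, dilation, kernel, stride: the matching per-axis settings.
  def outSize(in: Int, pad: Int, dilation: Int, kernel: Int, stride: Int): Int = {
    val numerator = in + 2 * pad - dilation * (kernel - 1) - 1
    Math.floorDiv(numerator, stride) + 1 // floor division, then add one
  }
}
```

With dilation 2 and a kernel of 3, the kernel covers 5 input positions, so an unpadded input of width 7 yields 3 outputs at stride 1.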
-
class
SpatialDivisiveNormalization[T] extends TensorModule[T]
Applies a spatial division operation on a series of 2D inputs, using kernel to compute the weighted average in a neighborhood. The neighborhood is defined for a local spatial region that is the same size as the kernel and spans all features. For an input image with only one feature, the region is only spatial. For an RGB image, the weighted average is taken over RGB channels and a spatial region.
If the kernel is 1D, it is used to construct a separable 2D kernel. The operations will be much more efficient in this case.
The kernel is generally chosen as a Gaussian when it is believed that the correlation of two pixel locations decreases with increasing distance. On the feature dimension, a uniform average is used since the weighting across features is not known.
- Annotations
- @SerialVersionUID()
-
class
SpatialDropout1D[T] extends TensorModule[T]
This version performs the same function as Dropout, however it drops entire 1D feature maps instead of individual elements.
This version performs the same function as Dropout, however it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout1D will help promote independence between feature maps and should be used instead.
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
SpatialDropout2D[T] extends TensorModule[T]
This version performs the same function as Dropout, however it drops entire 2D feature maps instead of individual elements.
This version performs the same function as Dropout, however it drops entire 2D feature maps instead of individual elements. If adjacent pixels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout2D will help promote independence between feature maps and should be used instead.
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
SpatialDropout3D[T] extends TensorModule[T]
This version performs the same function as Dropout, however it drops entire 3D feature maps instead of individual elements.
This version performs the same function as Dropout, however it drops entire 3D feature maps instead of individual elements. If adjacent voxels within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout3D will help promote independence between feature maps and should be used instead.
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
SpatialFullConvolution[T] extends AbstractModule[Activity, Tensor[T], T] with Initializable
Apply a 2D full convolution over an input image.
The input tensor is expected to be a 3D or 4D(with batch) tensor. Note that instead of setting adjW and adjH, SpatialFullConvolution[Table, T] also accepts a table input with two tensors: T(convInput, sizeTensor) where convInput is the standard input tensor, and the size of sizeTensor is used to set the size of the output (will ignore the adjW and adjH values used to construct the module). This module can be used without a bias by setting parameter noBias = true while constructing the module.
If input is a 3D tensor nInputPlane x height x width,
owidth = (width - 1) * dW - 2*padW + kW + adjW
oheight = (height - 1) * dH - 2*padH + kH + adjH
Other frameworks call this operation "In-network Upsampling", "Fractionally-strided convolution", "Backwards Convolution," "Deconvolution", or "Upconvolution."
Reference Paper: Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
- Annotations
- @SerialVersionUID()
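The output-size formula given for SpatialFullConvolution can be evaluated with a small helper along one axis. This is a sketch only; the names `FullConvSizeSketch` and `outSize` are hypothetical.

```scala
object FullConvSizeSketch {
  // in: input width or height; stride, pad, kernel, adj: the matching per-axis settings.
  def outSize(in: Int, stride: Int, pad: Int, kernel: Int, adj: Int): Int =
    (in - 1) * stride - 2 * pad + kernel + adj
}
```

For instance, a width of 4 with stride 2, padding 1, kernel 4 and no adjustment gives a width of 8, which is why this operation is often used for 2x upsampling.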
-
class
SpatialMaxPooling[T] extends TensorModule[T]
Applies 2D max-pooling operation in kWxkH regions by step size dWxdH steps. The number of output features is equal to the number of input planes. If the input image is a 3D tensor nInputPlane x height x width, the output image size will be nOutputPlane x oheight x owidth where
owidth = op((width + 2*padW - kW) / dW + 1)
oheight = op((height + 2*padH - kH) / dH + 1)
op is a rounding operator. By default, it is floor. It can be changed by calling the ceil() or floor() methods.
When padW and padH are both -1, we use a padding algorithm similar to the "SAME" padding of tensorflow. That is
outHeight = Math.ceil(inHeight.toFloat/strideH.toFloat)
outWidth = Math.ceil(inWidth.toFloat/strideW.toFloat)
padAlongHeight = Math.max(0, (outHeight - 1) * strideH + kernelH - inHeight)
padAlongWidth = Math.max(0, (outWidth - 1) * strideW + kernelW - inWidth)
padTop = padAlongHeight / 2
padLeft = padAlongWidth / 2
- Annotations
- @SerialVersionUID()
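The effect of the rounding operator op in the SpatialMaxPooling output-size formula can be seen with a small helper along one axis. This is a sketch of the formula as written; any further adjustments the library may apply in ceil mode are not covered. The names `PoolSizeSketch` and `pooledSize` are hypothetical.

```scala
object PoolSizeSketch {
  def pooledSize(in: Int, pad: Int, kernel: Int, stride: Int, ceilMode: Boolean): Int = {
    val raw = (in + 2 * pad - kernel).toDouble / stride + 1
    // op is floor by default and ceil when ceilMode is set.
    if (ceilMode) math.ceil(raw).toInt else math.floor(raw).toInt
  }
}
```

With an input of 6, a kernel of 3 and a stride of 2, floor gives 2 outputs while ceil gives 3, since ceil mode keeps the final partial window.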
-
class
SpatialSeparableConvolution[T] extends AbstractModule[Tensor[T], Tensor[T], T]
Separable convolutions consist of first performing a depthwise spatial convolution (which acts on each input channel separately), followed by a pointwise convolution which mixes together the resulting output channels. The depthMultiplier argument controls how many output channels are generated per input channel in the depthwise step.
- T
module parameter numeric type
-
class
SpatialShareConvolution[T] extends SpatialConvolution[T]
- Annotations
- @SerialVersionUID()
-
class
SpatialSubtractiveNormalization[T] extends TensorModule[T]
Applies a spatial subtraction operation on a series of 2D inputs, using kernel to compute the weighted average in a neighborhood. The neighborhood is defined for a local spatial region that is the same size as the kernel and spans all features. For an input image with only one feature, the region is only spatial. For an RGB image, the weighted average is taken over RGB channels and a spatial region.
If the kernel is 1D, it is used to construct a separable 2D kernel. The operations will be much more efficient in this case.
The kernel is generally chosen as a Gaussian when it is believed that the correlation of two pixel locations decreases with increasing distance. On the feature dimension, a uniform average is used since the weighting across features is not known.
- Annotations
- @SerialVersionUID()
-
class
SpatialWithinChannelLRN[T] extends TensorModule[T]
The local response normalization layer performs a kind of “lateral inhibition” by normalizing over local input regions. The local regions extend spatially, in separate channels (i.e., they have shape 1 x size x size).
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
SpatialZeroPadding[T] extends TensorModule[T]
Each feature map of a given input is padded with specified number of zeros.
Each feature map of a given input is padded with specified number of zeros. If padding values are negative, then input is cropped.
- Annotations
- @SerialVersionUID()
-
class
SplitTable[T] extends AbstractModule[Tensor[T], Table, T]
Creates a module that takes a Tensor as input and outputs a table of tensors, splitting the Tensor along the specified dimension dimension. Please note the dimension starts from 1.
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in a batch using nInputDims.
- T
Numeric type. Only support float/double now
- Annotations
- @SerialVersionUID()
-
class
Sqrt[T] extends Power[T]
Apply an element-wise sqrt operation.
- Annotations
- @SerialVersionUID()
-
class
Square[T] extends Power[T]
Apply an element-wise square operation.
- Annotations
- @SerialVersionUID()
-
class
Squeeze[T] extends AbstractModule[Tensor[_], Tensor[_], T]
Delete all singleton dimensions or a specific singleton dimension.
- Annotations
- @SerialVersionUID()
-
class
StaticGraph[T] extends Graph[T]
A graph container. The modules in the container are connected as a directed acyclic graph (DAG).
- T
Numeric type. Only support float/double now
-
class
Sum[T] extends TensorModule[T]
It is a simple layer which applies a sum operation over the given dimension. When nInputDims is provided, the input is treated as a batch, and the sum operation is applied over dimension (dimension + 1).
The input to this layer is expected to be a tensor, or a batch of tensors; when using mini-batch, a batch of sample tensors will be passed to the layer and the user needs to specify the number of dimensions of each sample tensor in the batch using nInputDims.
- Annotations
- @SerialVersionUID()
-
class
TableOperation[T] extends AbstractModule[Table, Tensor[T], T]
When the two tensors have different sizes, the smaller tensor is first expanded to the size of the larger one, and then the table operation is applied.
-
class
Tanh[T] extends TensorModule[T]
Applies the Tanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension.
Applies the Tanh function element-wise to the input Tensor, thus outputting a Tensor of the same dimension. Tanh is defined as f(x) = (exp(x)-exp(-x))/(exp(x)+exp(-x)).
- Annotations
- @SerialVersionUID()
-
class
TanhShrink[T] extends TensorModule[T]
A simple layer that applies the following operation to each element of the input tensor during the forward process: f(x) = x - tanh(x)
- Annotations
- @SerialVersionUID()
-
class
TemporalConvolution[T] extends TensorModule[T] with Initializable
Applies a 1D convolution over an input sequence composed of nInputFrame frames. The input tensor in forward(input) is expected to be a 2D tensor (nInputFrame x inputFrameSize) or a 3D tensor (nBatchFrame x nInputFrame x inputFrameSize).
- T
The numeric type of the module, usually Float or Double
-
class
TemporalMaxPooling[T] extends TensorModule[T]
Applies 1D max-pooling operation in kW regions by step size dW steps.
Applies 1D max-pooling operation in kW regions by step size dW steps. Input sequence composed of nInputFrame frames. The input tensor in forward(input) is expected to be a 2D tensor (nInputFrame x inputFrameSize) or a 3D tensor (nBatchFrame x nInputFrame x inputFrameSize).
If the input sequence is a 2D tensor of dimension nInputFrame x inputFrameSize, the output sequence will be nOutputFrame x inputFrameSize where
nOutputFrame = (nInputFrame - kW) / dW + 1
- T
The numeric type of the module, usually Float or Double
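The 1D max-pooling rule for TemporalMaxPooling, including the nOutputFrame formula, can be sketched on a 2D input as follows. Plain nested arrays stand in for a nInputFrame x inputFrameSize Tensor, and the names `TemporalMaxPoolingSketch` and `maxPool1D` are hypothetical.

```scala
object TemporalMaxPoolingSketch {
  // frames: nInputFrame rows, each with inputFrameSize features.
  def maxPool1D(frames: Array[Array[Double]], kW: Int, dW: Int): Array[Array[Double]] = {
    val nInputFrame = frames.length
    require(kW > 0 && dW > 0 && nInputFrame >= kW)
    val nOutputFrame = (nInputFrame - kW) / dW + 1 // integer division
    Array.tabulate(nOutputFrame) { o =>
      val window = frames.slice(o * dW, o * dW + kW) // kW consecutive frames
      window.transpose.map(_.max)                    // max per feature across the window
    }
  }
}
```

Each output frame keeps the inputFrameSize features, so only the frame count changes.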
-
class
TensorTree[T] extends Serializable
TensorTree class is used to decode a tensor to a tree structure. The given input content is a tensor which encodes a constituency parse tree. The tensor should have the following structure:
- Each row of the tensor represents a tree node, and the row number is the node number.
- In each row, every column except the last holds the node number of a child of this node. A non-zero value p means this node has a child whose node number is p (that is, the child lies in the p-th row).
- Each leaf has a leaf number, which is stored in the last column. A leaf has no children, so all columns of a leaf row except the last should be zero.
- If a node is the root, its last column should equal -1.
Note: if any rows are used for padding, they should be placed last, with all elements equal to -1.
E.g., a tensor that represents a binary tree:
[11, 10, -1; 0, 0, 1; 0, 0, 2; 0, 0, 3; 0, 0, 4; 0, 0, 5; 0, 0, 6; 4, 5, 0; 6, 7, 0; 8, 9, 0; 2, 3, 0; -1, -1, -1; -1, -1, -1]
- T
Numeric type Float or Double
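To make the TensorTree encoding concrete, the sketch below walks the example tensor using plain arrays. It is not the TensorTree API: the names `TensorTreeSketch`, `children` and `leaves` are hypothetical, and row numbers are treated as 1-based as described above.

```scala
object TensorTreeSketch {
  // The example tensor, one inner array per row (node); the last two rows are padding.
  val content: Array[Array[Int]] = Array(
    Array(11, 10, -1), Array(0, 0, 1), Array(0, 0, 2), Array(0, 0, 3),
    Array(0, 0, 4), Array(0, 0, 5), Array(0, 0, 6), Array(4, 5, 0),
    Array(6, 7, 0), Array(8, 9, 0), Array(2, 3, 0),
    Array(-1, -1, -1), Array(-1, -1, -1))

  // Child node numbers of a node: the non-zero entries in all but the last column.
  def children(node: Int): List[Int] = content(node - 1).init.filter(_ > 0).toList

  // Leaf numbers under a node, left to right.
  def leaves(node: Int): List[Int] = {
    val kids = children(node)
    if (kids.isEmpty) List(content(node - 1).last) else kids.flatMap(leaves)
  }
}
```

Here node 1 is the root (its last column is -1), its children are nodes 11 and 10, and `leaves(1)` visits the leaf numbers 1 to 6 in order.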
-
class
Threshold[T] extends TensorModule[T]
Threshold input Tensor. If a value in the Tensor is smaller than th, it is replaced with v.
- Annotations
- @SerialVersionUID()
-
class
Tile[T] extends TensorModule[T]
Tile repeats input nFeatures times along its dim dimension.
- Annotations
- @SerialVersionUID()
-
class
TimeDistributed[T] extends TensorModule[T]
This layer is intended to apply the contained layer to each temporal time slice of the input tensor.
For instance, the TimeDistributed layer can feed each time slice of the input tensor to a Linear layer.
The input data format is [Batch, Time, Other dims]. The contained layer must not change the length of the Other dims.
- T
data type, which can be Double or Float
-
class
TimeDistributedCriterion[T] extends TensorCriterion[T]
This class is intended to support inputs with 3 or more dimensions. It applies any provided criterion to every temporal slice of an input.
-
class
TimeDistributedMaskCriterion[T] extends TensorCriterion[T]
This class is intended to support inputs with 3 or more dimensions. It applies any provided criterion to every temporal slice of an input. In addition, it supports a padding mask.
E.g., if the target is [ [-1, 1, 2, 3, -1], [5, 4, 3, -1, -1] ] and the paddingValue property is set to -1, then the loss for the -1 entries is not accumulated and the total loss is divided by 6 only (the -1 entries are not counted; in this case, we are only interested in 1, 2, 3, 5, 4, 3).
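The padding-mask normalisation described for TimeDistributedMaskCriterion can be sketched as follows. It assumes the per-position losses have already been computed by some criterion and only illustrates which positions are accumulated and what the divisor is. The names `MaskedLossSketch` and `maskedMean` are hypothetical.

```scala
object MaskedLossSketch {
  // losses and target have the same [batch][time] layout.
  def maskedMean(losses: Array[Array[Double]],
                 target: Array[Array[Int]],
                 paddingValue: Int): Double = {
    // Keep only the losses whose target is not the padding value.
    val kept = losses.zip(target).flatMap { case (lossRow, targetRow) =>
      lossRow.zip(targetRow).collect { case (l, t) if t != paddingValue => l }
    }
    if (kept.isEmpty) 0.0 else kept.sum / kept.length
  }
}
```

With the target from the example and paddingValue = -1, six positions are kept, so the accumulated loss is divided by 6.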
-
class
Transformer[T] extends AbstractModule[Activity, Activity, T]
Transformer model from "Attention Is All You Need".
Transformer model from "Attention Is All You Need". The Transformer model consists of an encoder and a decoder, both are stacks of self-attention layers followed by feed-forward layers. This model yields good results on a number of problems, especially in NLP and machine translation. See "Attention Is All You Need" (https://arxiv.org/abs/1706.03762) for the full description of the model and the results obtained with its early version.
- T
The numeric type in this module parameters.
-
class
TransformerCriterion[T] extends AbstractCriterion[Activity, Activity, T]
The criterion that takes two modules to transform input and target, and takes one criterion to compute the loss with the transformed input and target.
This criterion can be used to construct complex criteria. For example, the inputTransformer and targetTransformer can be pre-trained CNN networks, and we can use the networks' output to calculate the high-level feature reconstruction loss, which is commonly used in areas like neural style transfer (https://arxiv.org/abs/1508.06576), texture synthesis (https://arxiv.org/abs/1505.07376), etc.
- T
The numeric type in the criterion, usually Float or Double
- sealed trait TransformerType extends AnyRef
-
class
Transpose[T] extends AbstractModule[Tensor[_], Tensor[_], T]
Transpose input along specified dimensions
- Annotations
- @SerialVersionUID()
- abstract class TreeLSTM[T] extends AbstractModule[Table, Tensor[T], T]
-
class
Unsqueeze[T] extends AbstractModule[Tensor[_], Tensor[_], T]
Insert a singleton dimension (i.e., a dimension of size 1) at the position(s) given by pos. For an input with dim = input.dim(), there are dim + 1 possible positions to insert the singleton dimension. Dimension indices are 1-based. A pos that is 0 or negative corresponds to unsqueeze() applied at pos = pos + input.dim() + 1.
- Annotations
- @SerialVersionUID()
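The position rule for Unsqueeze can be sketched on a shape sequence as follows, handling a single pos. This is an illustration with hypothetical names (`UnsqueezeSketch`, `resolvePos`, `unsqueezedShape`), not the module's implementation.

```scala
object UnsqueezeSketch {
  // Map a 0 or negative pos onto the 1-based range 1..inputDim + 1.
  def resolvePos(pos: Int, inputDim: Int): Int =
    if (pos <= 0) pos + inputDim + 1 else pos

  // Insert a dimension of size 1 at the resolved 1-based position.
  def unsqueezedShape(shape: Seq[Int], pos: Int): Seq[Int] = {
    val p = resolvePos(pos, shape.length)
    require(p >= 1 && p <= shape.length + 1)
    shape.patch(p - 1, Seq(1), 0)
  }
}
```

So for a 2 x 3 input, pos = 1 gives 1 x 2 x 3, while pos = 0 resolves to position 3 and gives 2 x 3 x 1.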
-
class
UpSampling1D[T] extends TensorModule[T]
Upsampling layer for 1D inputs.
Upsampling layer for 1D inputs. Repeats each temporal step length times along the time axis.
If input's size is (batch, steps, features), then the output's size is (batch, steps * length, features)
- T
The numeric type of the module, usually Float or Double
-
class
UpSampling2D[T] extends TensorModule[T]
Upsampling layer for 2D inputs.
Upsampling layer for 2D inputs. Repeats the heights and widths of the data by size(0) and size(1) respectively.
If input's dataformat is NCHW, then the size of output is (N, C, H * size(0), W * size(1))
- T
The numeric type of the module, usually Float or Double
-
class
UpSampling3D[T] extends TensorModule[T]
Upsampling layer for 3D inputs. Repeats the 1st, 2nd and 3rd dimensions of the data by size[0], size[1] and size[2] respectively. The input data is assumed to be of the form minibatch x channels x depth x height x width.
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
-
trait
VariableFormat extends AnyRef
VariableFormat describes the meaning of each dimension of a variable (a trainable parameter of a model, such as weight or bias) and can be used to return the fan-in and fan-out size of the variable given its shape.
-
class
View[T] extends TensorModule[T]
This module creates a new view of the input tensor using the sizes passed to the constructor. The method setNumInputDims() lets you specify the expected number of dimensions of the module's inputs. This makes it possible to use minibatch inputs when using a size -1 for one of the dimensions.
- Annotations
- @SerialVersionUID()
-
class
VolumetricAveragePooling[T] extends TensorModule[T]
Applies 3D average-pooling operation in kTxkWxkH regions by step size dTxdWxdH.
Applies 3D average-pooling operation in kTxkWxkH regions by step size dTxdWxdH. The number of output features is equal to the number of input planes / dT. The input can optionally be padded with zeros. Padding should be smaller than half of kernel size. That is, padT < kT/2, padW < kW/2 and padH < kH/2
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
-
class
VolumetricConvolution[T] extends TensorModule[T] with Initializable
Applies a 3D convolution over an input image composed of several input planes.
Applies a 3D convolution over an input image composed of several input planes. The input tensor in forward(input) is expected to be a 4D tensor (nInputPlane x time x height x width).
- T
The numeric type of the module, usually Float or Double
-
class
VolumetricFullConvolution[T] extends AbstractModule[Activity, Tensor[T], T] with Initializable
Apply a 3D full convolution over an 3D input image, a sequence of images, or a video etc.
Apply a 3D full convolution over an 3D input image, a sequence of images, or a video etc. The input tensor is expected to be a 4D or 5D(with batch) tensor. Note that instead of setting adjT, adjW and adjH, VolumetricConvolution also accepts a table input with two tensors: T(convInput, sizeTensor) where convInput is the standard input tensor, and the size of sizeTensor is used to set the size of the output (will ignore the adjT, adjW and adjH values used to construct the module). This module can be used without a bias by setting parameter noBias = true while constructing the module.
If input is a 4D tensor nInputPlane x depth x height x width, the output size is:
odepth = (depth - 1) * dT - 2*padT + kT + adjT
owidth = (width - 1) * dW - 2*padW + kW + adjW
oheight = (height - 1) * dH - 2*padH + kH + adjH
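The output-size formulas for the full convolution share one shape per axis, so they can be checked with a single helper. This is a minimal sketch that only evaluates the arithmetic; `FullConvOutputSize` and `outSize` are hypothetical names and not part of the library.

```scala
object FullConvOutputSize {
  // One axis of the full-convolution output size:
  // out = (in - 1) * stride - 2 * pad + kernel + adj
  def outSize(in: Int, stride: Int, pad: Int, kernel: Int, adj: Int): Int =
    (in - 1) * stride - 2 * pad + kernel + adj

  // Apply the same formula to depth, width and height.
  def outSize3D(depth: Int, width: Int, height: Int,
                dT: Int, dW: Int, dH: Int,
                padT: Int, padW: Int, padH: Int,
                kT: Int, kW: Int, kH: Int,
                adjT: Int, adjW: Int, adjH: Int): (Int, Int, Int) =
    (outSize(depth, dT, padT, kT, adjT),
     outSize(width, dW, padW, kW, adjW),
     outSize(height, dH, padH, kH, adjH))
}
```

With depth = 4, dT = 2, padT = 1, kT = 4 and adjT = 0, the formula gives odepth = 3 * 2 - 2 + 4 + 0 = 8, that is, the axis is upsampled by a factor of two.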
Other frameworks call this operation "In-network Upsampling", "Fractionally-strided convolution", "Backwards Convolution", "Deconvolution", or "Upconvolution".
Reference Paper: Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
- Annotations
- @SerialVersionUID()
-
class
VolumetricMaxPooling[T] extends TensorModule[T]
Applies 3D max-pooling operation in kTxkWxkH regions by step size dTxdWxdH. The number of output features is equal to the number of input planes / dT. The input can optionally be padded with zeros. Padding should be smaller than half of the kernel size, that is, padT < kT/2, padW < kW/2 and padH < kH/2.
- T
The numeric type of the module, usually Float or Double
- Annotations
- @SerialVersionUID()
Value Members
- object Abs extends Serializable
- object AbsCriterion extends Serializable
- object ActivityRegularization extends Serializable
- object Add extends Serializable
- object AddConstant extends Serializable
- object Anchor extends Serializable
- object Attention extends Serializable
- object BCECriterion extends Serializable
- object BatchNormalization extends ModuleSerializable with Serializable
- object BiRecurrent extends ContainerSerializable with Serializable
- object BifurcateSplitTable extends Serializable
- object Bilinear extends Serializable
-
object
BilinearFiller extends InitializationMethod with Product with Serializable
Initialize the weight with coefficients for bilinear interpolation.
A common use case is with the DeconvolutionLayer acting as upsampling. The variable tensor passed to the init function should have 5 dimensions in the format [nGroup, nInput, nOutput, kH, kW], and kH should be equal to kW.
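To make "coefficients for bilinear interpolation" concrete, the sketch below computes one square kH x kW kernel using the formula from Caffe's bilinear filler. Whether this object uses exactly the same formula is an assumption, and `BilinearKernelSketch`, `kernel1D` and `kernel2D` are hypothetical names.

```scala
object BilinearKernelSketch {
  // 1D bilinear weights for a kernel of size k (Caffe-style formula,
  // assumed here for illustration):
  //   f = ceil(k / 2), c = (2f - 1 - f % 2) / (2f), w(x) = 1 - |x / f - c|
  def kernel1D(k: Int): Array[Double] = {
    val f = math.ceil(k / 2.0).toInt
    val c = (2 * f - 1 - f % 2) / (2.0 * f)
    Array.tabulate(k)(x => 1.0 - math.abs(x.toDouble / f - c))
  }

  // The 2D kernel is the outer product of the 1D weights,
  // which is why kH is expected to equal kW.
  def kernel2D(k: Int): Array[Array[Double]] = {
    val w = kernel1D(k)
    Array.tabulate(k, k)((y, x) => w(y) * w(x))
  }
}
```

For k = 4 the 1D weights are 0.25, 0.75, 0.75, 0.25, the usual kernel for 2x upsampling with a stride-2 deconvolution.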
- object BinaryThreshold extends Serializable
- object BinaryTreeLSTM extends ModuleSerializable with Serializable
- object Bottle extends Serializable
- object BoxHead extends Serializable
- object CAdd extends Serializable
- object CAddTable extends ModuleSerializable with Serializable
- object CAveTable extends Serializable
- object CDivTable extends Serializable
- object CMaxTable extends Serializable
- object CMinTable extends Serializable
- object CMul extends Serializable
- object CMulTable extends Serializable
- object CMulTableExpand
- object CSubTable extends Serializable
- object CSubTableExpand
- object CategoricalCrossEntropy extends Serializable
- object CellSerializer extends ModuleSerializable
- object Clamp extends Serializable
- object ClassNLLCriterion extends Serializable
- object ClassSimplexCriterion extends Serializable
- object Concat extends Serializable
- object ConcatTable extends Serializable
- object Contiguous extends Serializable
- object ConvLSTMPeephole extends Serializable
- object ConvLSTMPeephole3D extends Serializable
- object Cosine extends Serializable
- object CosineDistance extends Serializable
- object CosineDistanceCriterion extends Serializable
- object CosineEmbeddingCriterion extends Serializable
- object CosineProximityCriterion extends Serializable
- object Cropping2D extends Serializable
- object Cropping3D extends Serializable
- object CrossEntropyCriterion extends Serializable
- object CrossProduct extends Serializable
- object DenseToSparse extends Serializable
- object DetectionOutputFrcnn extends Serializable
- object DetectionOutputSSD extends Serializable
- object DiceCoefficientCriterion extends Serializable
- object DistKLDivCriterion extends Serializable
- object DotProduct extends Serializable
- object DotProductCriterion extends Serializable
- object Dropout extends Serializable
- object ELU extends Serializable
- object Echo extends ModuleSerializable with Serializable
- object ErrorInfo
- object Euclidean extends Serializable
- object Exp extends Serializable
- object ExpandSize extends Serializable
- object FPN extends Serializable
- object FeedForwardNetwork extends Serializable
- object FlattenTable extends Serializable
- object FrameManager extends Serializable
- object GRU extends Serializable
- object GaussianCriterion extends Serializable
- object GaussianDropout extends Serializable
- object GaussianNoise extends Serializable
- object GaussianSampler extends Serializable
- object GradientReversal extends Serializable
- object Graph extends GraphSerializable with Serializable
- object HardShrink extends Serializable
- object HardSigmoid extends Serializable
- object HardTanh extends Serializable
- object Highway
- object HingeEmbeddingCriterion extends Serializable
- object Identity extends Serializable
- object Index extends Serializable
- object InferReshape extends Serializable
- object Input extends Serializable
- object JoinTable extends Serializable
- object KLDCriterion extends Serializable
- object KullbackLeiblerDivergenceCriterion extends Serializable
- object L1Cost extends Serializable
- object L1HingeEmbeddingCriterion extends Serializable
- object L1Penalty extends Serializable
- object LSTM extends Serializable
- object LSTMPeephole extends Serializable
- object LanguageModel extends TransformerType with Product with Serializable
- object LeakyReLU extends Serializable
- object Linear extends Quantizable with Serializable
- object LocallyConnected1D extends Serializable
- object LocallyConnected2D extends Serializable
- object Log extends Serializable
- object LogSigmoid extends Serializable
- object LogSoftMax extends Serializable
- object LookupTable extends Serializable
- object LookupTableSparse extends Serializable
- object MM extends Serializable
- object MSECriterion extends Serializable
- object MV extends Serializable
- object MapTable extends ContainerSerializable with Serializable
- object MarginCriterion extends Serializable
- object MarginRankingCriterion extends Serializable
- object MaskHead extends Serializable
- object MaskedSelect extends ModuleSerializable with Serializable
- object Masking extends Serializable
- object Max extends Serializable
- object Maxout extends ModuleSerializable with Serializable
- object Mean extends Serializable
- object MeanAbsolutePercentageCriterion extends Serializable
- object MeanSquaredLogarithmicCriterion extends Serializable
- object Min extends Serializable
- object MixtureTable extends Serializable
- object Module
- object Mul extends Serializable
- object MulConstant extends Serializable
- object MultiCriterion extends Serializable
- object MultiLabelMarginCriterion extends Serializable
- object MultiLabelSoftMarginCriterion extends Serializable
- object MultiMarginCriterion extends Serializable
- object MultiRNNCell extends ModuleSerializable with Serializable
- object Narrow extends Serializable
- object NarrowTable extends Serializable
- object Negative extends Serializable
- object NegativeEntropyPenalty extends Serializable
- object NormMode extends Enumeration
- object Normalize extends Serializable
- object NormalizeScale extends Serializable
-
object
Ones extends InitializationMethod with Product with Serializable
Initializer that generates tensors with ones.
- object PGCriterion extends Serializable
- object PReLU extends Serializable
- object Pack extends Serializable
- object Padding extends Serializable
- object PairwiseDistance extends Serializable
- object ParallelCriterion extends Serializable
- object ParallelTable extends Serializable
- object PoissonCriterion extends Serializable
- object Pooler extends Serializable
- object Power extends Serializable
- object PriorBox extends Serializable
- object Proposal extends Serializable
- object RReLU extends Serializable
-
object
RandomUniform extends InitializationMethod with Product with Serializable
Initializer that generates tensors with a uniform distribution.
It draws samples from a uniform distribution within [-limit, limit], where "limit" is "1/sqrt(fan_in)".
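The sampling rule can be sketched in plain Scala as follows. This does not use the library's tensor types; `RandomUniformSketch`, `limit` and `sample` are hypothetical names, and `scala.util.Random` stands in for whatever random generator the library uses.

```scala
import scala.util.Random

object RandomUniformSketch {
  // limit = 1 / sqrt(fan_in)
  def limit(fanIn: Int): Double = 1.0 / math.sqrt(fanIn.toDouble)

  // Draw n values uniformly from [-limit, limit).
  def sample(fanIn: Int, n: Int, rng: Random): Array[Double] = {
    val l = limit(fanIn)
    // nextDouble() lies in [0, 1); rescale it to [-l, l).
    Array.fill(n)((rng.nextDouble() * 2.0 - 1.0) * l)
  }
}
```

With fan_in = 4 the limit is 0.5, so every sampled weight has magnitude at most 0.5.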
- object ReLU extends Serializable
- object ReLU6 extends Serializable
- object Recurrent extends ContainerSerializable with Serializable
- object RecurrentDecoder extends ContainerSerializable with Serializable
- object RegionProposal extends Serializable
- object Replicate extends Serializable
- object Reshape extends ModuleSerializable with Serializable
- object ResizeBilinear extends Serializable
- object Reverse extends Serializable
- object RnnCell extends Serializable
- object RoiAlign extends Serializable
- object RoiPooling extends Serializable
- object SReLU extends ModuleSerializable with Serializable
- object Scale extends ModuleSerializable with Serializable
- object Select extends Serializable
- object SelectTable extends Serializable
- object SequenceBeamSearch extends Serializable
- object Sequential extends Serializable
- object Sigmoid extends Serializable
- object SmoothL1Criterion extends Serializable
- object SmoothL1CriterionWithWeights extends Serializable
- object SoftMarginCriterion extends Serializable
- object SoftMax extends Serializable
- object SoftMin extends Serializable
- object SoftPlus extends Serializable
- object SoftShrink extends Serializable
- object SoftSign extends Serializable
- object SoftmaxWithCriterion extends Serializable
- object SparseJoinTable extends Serializable
- object SparseLinear extends Serializable
- object SpatialAveragePooling extends Serializable
- object SpatialBatchNormalization extends Serializable
- object SpatialContrastiveNormalization extends ModuleSerializable with Serializable
- object SpatialConvolution extends Quantizable with Serializable
- object SpatialConvolutionMap extends Serializable
- object SpatialCrossMapLRN extends Serializable
- object SpatialDilatedConvolution extends Quantizable with Serializable
- object SpatialDivisiveNormalization extends ModuleSerializable with Serializable
- object SpatialDropout1D extends Serializable
- object SpatialDropout2D extends Serializable
- object SpatialDropout3D extends Serializable
- object SpatialFullConvolution extends ModuleSerializable with Serializable
- object SpatialMaxPooling extends ModuleSerializable with Serializable
- object SpatialSeparableConvolution extends ModuleSerializable with Serializable
- object SpatialShareConvolution extends Serializable
- object SpatialSubtractiveNormalization extends ModuleSerializable with Serializable
- object SpatialWithinChannelLRN extends Serializable
- object SpatialZeroPadding extends Serializable
- object SplitTable extends Serializable
- object Sqrt extends Serializable
- object Square extends Serializable
- object Squeeze extends Serializable
- object Sum extends Serializable
- object Tanh extends Serializable
- object TanhShrink extends Serializable
- object TemporalConvolution extends Serializable
- object TemporalMaxPooling extends Serializable
- object Threshold extends Serializable
- object Tile extends Serializable
- object TimeDistributed extends ModuleSerializable with Serializable
- object TimeDistributedCriterion extends Serializable
- object TimeDistributedMaskCriterion extends Serializable
- object Transformer extends ModuleSerializable with Serializable
- object TransformerCriterion extends Serializable
- object Translation extends TransformerType with Product with Serializable
- object Transpose extends ModuleSerializable with Serializable
- object Unsqueeze extends Serializable
- object UpSampling1D extends Serializable
- object UpSampling2D extends Serializable
- object UpSampling3D extends Serializable
- object Utils
- object VariableFormat
- object View extends Serializable
- object VolumetricAveragePooling extends ModuleSerializable with Serializable
- object VolumetricConvolution extends Serializable
- object VolumetricFullConvolution extends Serializable
- object VolumetricMaxPooling extends ModuleSerializable with Serializable
-
object
Xavier extends InitializationMethod with Product with Serializable
In short, it helps signals reach deep into the network.
During the training of a deep neural network:
1. If the weights in a network start too small, then the signal shrinks as it passes through each layer until it’s too tiny to be useful.
2. If the weights in a network start too large, then the signal grows as it passes through each layer until it’s too massive to be useful.
Xavier initialization makes sure the weights are ‘just right’, keeping the signal in a reasonable range of values through many layers.
For more details, see the paper "Understanding the difficulty of training deep feedforward neural networks" (http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf).
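The description above does not state the scaling rule itself. As a sketch, the "normalized initialization" from the cited paper draws weights uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). Whether this object uses exactly that rule is an assumption, and `XavierSketch`, `limit` and `init` are hypothetical names.

```scala
import scala.util.Random

object XavierSketch {
  // Normalized initialization from the cited paper (assumed here):
  // limit = sqrt(6 / (fanIn + fanOut))
  def limit(fanIn: Int, fanOut: Int): Double =
    math.sqrt(6.0 / (fanIn + fanOut))

  // Build a fanOut x fanIn weight matrix with entries in [-limit, limit).
  def init(fanIn: Int, fanOut: Int, rng: Random): Array[Array[Double]] = {
    val l = limit(fanIn, fanOut)
    Array.fill(fanOut, fanIn)((rng.nextDouble() * 2.0 - 1.0) * l)
  }
}
```

The limit shrinks as layers get wider, which keeps the variance of the weights near 2 / (fan_in + fan_out) and the signal in a similar range from layer to layer.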
-
object
Zeros extends InitializationMethod with Product with Serializable
Initializer that generates tensors with zeros.