Class CuDNNFunctionOptimizations.CudnnConv2dNCHWtoNHWCConversion

  • All Implemented Interfaces:
    Optimizer
    Enclosing class:
    CuDNNFunctionOptimizations

    public static class CuDNNFunctionOptimizations.CudnnConv2dNCHWtoNHWCConversion
    extends Object
    implements Optimizer
    Optimization that converts cuDNN conv2d operations from NCHW to NHWC layout, the layout preferred by Tensor Cores.

    Rationale, from Section 7.3.1 of the NVIDIA Deep Learning Performance Guide (https://docs.nvidia.com/deeplearning/sdk/dl-performance-guide/index.html#tensor-layout): "Layout choice has an effect on performance, as convolutions implemented for Tensor Cores require NHWC layout and are fastest when input tensors are laid out in NHWC." "To maximize performance, we recommend using NHWC tensor layout."

    Weights format: the cuDNN documentation is vague on this point, but TensorFlow uses either NCHW activations with OIHW weights or NHWC activations with OHWI weights.
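
    The layouts named in the description differ only in axis order, so the conversion amounts to a permutation of a 4D array: NCHW to NHWC and OIHW to OHWI both use axis order (0, 2, 3, 1). The following is a minimal, dependency-free sketch of that permutation on flat row-major float arrays. The class and method names (LayoutConversionSketch, permute4d, nchwToNhwc, oihwToOhwi) are hypothetical and are not part of ND4J or SameDiff; the real optimizer presumably rewrites graph ops and arrays rather than copying raw buffers like this.

```java
import java.util.Arrays;

// Illustrative only: hypothetical helper, not the SameDiff implementation.
class LayoutConversionSketch {

    /**
     * Permutes a flat, row-major 4D array.
     * perm[i] is the source axis that becomes destination axis i.
     */
    static float[] permute4d(float[] src, int[] srcShape, int[] perm) {
        int total = srcShape[0] * srcShape[1] * srcShape[2] * srcShape[3];
        if (src.length != total) {
            throw new IllegalArgumentException("Array length does not match shape");
        }

        // Destination shape is the source shape with axes reordered.
        int[] dstShape = new int[4];
        for (int i = 0; i < 4; i++) {
            dstShape[i] = srcShape[perm[i]];
        }

        // Row-major strides of the source array.
        int[] srcStride = new int[4];
        srcStride[3] = 1;
        for (int i = 2; i >= 0; i--) {
            srcStride[i] = srcStride[i + 1] * srcShape[i + 1];
        }

        // Walk the destination in row-major order and gather from the source.
        float[] dst = new float[total];
        int out = 0;
        for (int a = 0; a < dstShape[0]; a++) {
            for (int b = 0; b < dstShape[1]; b++) {
                for (int c = 0; c < dstShape[2]; c++) {
                    for (int d = 0; d < dstShape[3]; d++) {
                        int srcIndex = a * srcStride[perm[0]]
                                     + b * srcStride[perm[1]]
                                     + c * srcStride[perm[2]]
                                     + d * srcStride[perm[3]];
                        dst[out++] = src[srcIndex];
                    }
                }
            }
        }
        return dst;
    }

    /** Activations: [N, C, H, W] to [N, H, W, C]. */
    static float[] nchwToNhwc(float[] src, int n, int c, int h, int w) {
        return permute4d(src, new int[]{n, c, h, w}, new int[]{0, 2, 3, 1});
    }

    /** Weights: [O, I, H, W] to [O, H, W, I]. */
    static float[] oihwToOhwi(float[] src, int o, int i, int h, int w) {
        return permute4d(src, new int[]{o, i, h, w}, new int[]{0, 2, 3, 1});
    }

    public static void main(String[] args) {
        // One image, 2 channels, 1x2 spatial: channel 0 = [1, 2], channel 1 = [3, 4].
        float[] nchw = {1, 2, 3, 4};
        float[] nhwc = nchwToNhwc(nchw, 1, 2, 1, 2);
        System.out.println(Arrays.toString(nhwc)); // prints [1.0, 3.0, 2.0, 4.0]
    }
}
```

    In NHWC the channel values of each pixel sit next to each other in memory, which is why the two channels above end up interleaved.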
    • Constructor Detail

      • CudnnConv2dNCHWtoNHWCConversion

        public CudnnConv2dNCHWtoNHWCConversion()
    • Method Detail

      • checkAndApply

        public boolean checkAndApply(SameDiff sd,
                                     OptimizationHelper helper,
                                     SameDiffOp op,
                                     ArrayHolder constantArrays,
                                     ArrayHolder variablesArrays)
        Specified by:
        checkAndApply in interface Optimizer
        Parameters:
        sd - Current SameDiff instance to optimize
        helper - Helper class for optimization
        op - Operation to check for optimization
        constantArrays - Array holder for constant arrays
        variablesArrays - Array holder for variable arrays
        Returns:
        true if the optimization was applied, false otherwise
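
        The boolean return value lets a caller tell whether an op was rewritten. The following sketch shows one way a driver might use such a contract: offer every op to every optimizer and count the rewrites that were applied. It is an assumption for illustration, not taken from the SameDiff source. OptimizerPassSketch, OpRewrite and runPass are hypothetical names, and the graph is modeled as a plain list of op names instead of SameDiff, SameDiffOp and ArrayHolder.

```java
import java.util.List;

// Illustrative only: hypothetical stand-ins, not the SameDiff Optimizer API.
class OptimizerPassSketch {

    /** Simplified analogue of Optimizer: returns true if it rewrote the op at opIndex. */
    interface OpRewrite<T> {
        boolean checkAndApply(List<T> graph, int opIndex);
    }

    /** Offers every op to every rewrite once; returns how many rewrites were applied. */
    static <T> int runPass(List<T> graph, List<OpRewrite<T>> rewrites) {
        int applied = 0;
        for (int i = 0; i < graph.size(); i++) {
            for (OpRewrite<T> rewrite : rewrites) {
                if (rewrite.checkAndApply(graph, i)) {
                    applied++;
                }
            }
        }
        return applied;
    }
}
```

        A caller could repeat runPass until it returns 0, so that rewrites enabled by earlier rewrites are also picked up.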