| Class | Description |
|---|---|
| BaseOptimizerSet | |
| ConstantFunctionOptimizations | This set of optimizations looks for functions that are applied only to constants and "pre-executes" them once, so they don't have to be recalculated (always returning the same value) on each run. |
| ConstantFunctionOptimizations.FoldConstantFunctions | |
| CuDNNFunctionOptimizations | |
| CuDNNFunctionOptimizations.CudnnConv2dNCHWtoNHWCConversion | Converts NCHW conv2d ops to NHWC, the layout wanted for Tensor Cores. Per the NVIDIA Deep Learning Performance Guide, section 7.3.1 (https://docs.nvidia.com/deeplearning/sdk/dl-performance-guide/index.html#tensor-layout): "Layout choice has an effect on performance, as convolutions implemented for Tensor Cores require NHWC layout and are fastest when input tensors are laid out in NHWC." and "To maximize performance, we recommend using NHWC tensor layout." As for the weights format, the cuDNN docs are vague, but TF uses NCHW+OIHW or NHWC+OHWI. |
| IdentityFunctionOptimizations | |
| IdentityFunctionOptimizations.RemoveIdentityOps | Removes identity(x) ops. |
| IdentityFunctionOptimizations.RemoveIdentityPermute | Removes permute(0,1,2,...,rank-1), as this is a no-op. |
| OptimizationUtils | |
| ShapeFunctionOptimizations | |
| ShapeFunctionOptimizations.FuseChainedConcatOps | Fuses concat(concat(concat(x, y, dim=D), z, dim=D), a, dim=D) into a single concat op, concat(x, y, z, a, dim=D), as long as the intermediate outputs aren't needed elsewhere. |
| ShapeFunctionOptimizations.FuseChainedPermutes | Fuses a chain of permute ops (permute1 -> permute2 -> ...) into a single equivalent permute op. |
| ShapeFunctionOptimizations.FuseChainedReshapes | Fuses a chain of reshape ops (reshape1 -> reshape2 -> ...) into a single reshape op. |
| UnusedFunctionOptimizations | |
| UnusedFunctionOptimizations.RemoveUnusedConstants | Removes constant ops whose outputs are not used by any other op. |
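The constant "pre-execution" idea behind ConstantFunctionOptimizations can be sketched as a standalone pass over a toy op graph. This is a conceptual Python illustration, not the project's actual API; the graph representation and the `fold_constants` helper are hypothetical:

```python
import operator

# Hypothetical tiny op graph: each node is either ("const", value) or
# (op_name, [input names]). A FoldConstantFunctions-style pass: if every
# input to a function node is a constant, evaluate it once at optimization
# time and replace the node with a constant holding the result.

OPS = {"add": operator.add, "mul": operator.mul}

def fold_constants(graph):
    """Repeatedly fold nodes whose inputs are all constants."""
    changed = True
    while changed:
        changed = False
        for name, node in graph.items():
            if node[0] == "const":
                continue
            op, inputs = node
            if all(graph[i][0] == "const" for i in inputs):
                value = OPS[op](*(graph[i][1] for i in inputs))
                graph[name] = ("const", value)  # pre-executed once, reused every run
                changed = True
    return graph

graph = {
    "a": ("const", 2),
    "b": ("const", 3),
    "c": ("add", ["a", "b"]),   # foldable: both inputs are constants
    "d": ("mul", ["c", "b"]),   # foldable once "c" has been folded
}
folded = fold_constants(graph)
```

The fixed-point loop matters: folding `c` is what makes `d` foldable on the next sweep.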
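The NCHW-to-NHWC conversion is a fixed axis permutation. A minimal sketch of the layout math follows; the constant and helper names are hypothetical, not cuDNN's or the library's API:

```python
# Layouts as axis orders: NCHW = (batch, channels, height, width),
# NHWC = (batch, height, width, channels).
# NCHW -> NHWC is a permute with axes (0, 2, 3, 1); its inverse is (0, 3, 1, 2).

NCHW_TO_NHWC = (0, 2, 3, 1)   # output axis i takes input axis perm[i]
NHWC_TO_NCHW = (0, 3, 1, 2)   # inverse permutation

def permute_shape(shape, perm):
    """Shape produced by a permute/transpose with the given axis order."""
    return tuple(shape[p] for p in perm)

shape_nchw = (8, 64, 32, 32)                        # N, C, H, W
shape_nhwc = permute_shape(shape_nchw, NCHW_TO_NHWC)
back = permute_shape(shape_nhwc, NHWC_TO_NCHW)      # round trip restores NCHW
```

A conversion pass inserts such permutes around the conv op; the surrounding fusion passes then clean up any permute pairs that cancel.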
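FuseChainedConcatOps-style flattening can be illustrated on a toy expression tree (a hypothetical representation, not the library's graph types). The sketch assumes each intermediate concat output is consumed only by the outer concat, matching the "not needed elsewhere" condition:

```python
# A concat op is represented as ("concat", dim, [inputs]); anything else is a leaf.

def fuse_chained_concats(node):
    """Flatten nested concats on the same dim into one concat.
    Assumes intermediate concat outputs are not used elsewhere."""
    if not (isinstance(node, tuple) and node[0] == "concat"):
        return node  # leaf: nothing to fuse
    _, dim, inputs = node
    fused_inputs = []
    for inp in (fuse_chained_concats(i) for i in inputs):
        if isinstance(inp, tuple) and inp[0] == "concat" and inp[1] == dim:
            fused_inputs.extend(inp[2])  # inline the inner concat's operands
        else:
            fused_inputs.append(inp)     # different dim (or a leaf): keep as-is
    return ("concat", dim, fused_inputs)

# concat(concat(concat(x, y, dim=0), z, dim=0), a, dim=0)
expr = ("concat", 0, [("concat", 0, [("concat", 0, ["x", "y"]), "z"]), "a"])
flattened = fuse_chained_concats(expr)
```

Note the same-dim check: concats along different dimensions must not be merged.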
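Chained permutes compose into a single permutation, and RemoveIdentityPermute's no-op case falls out naturally when the composition is the identity. A minimal sketch with hypothetical helper names:

```python
def compose_permutes(p_first, p_second):
    """One permutation equivalent to applying p_first, then p_second.
    Output axis i reads intermediate axis p_second[i], which reads
    original axis p_first[p_second[i]]."""
    return tuple(p_first[i] for i in p_second)

def fuse_permute_chain(perms, rank):
    """Fuse permute1 -> permute2 -> ... into one permute.
    Returns None when the fused result is the identity permutation,
    i.e. the no-op that RemoveIdentityPermute deletes."""
    identity = tuple(range(rank))
    fused_perm = identity
    for p in perms:
        fused_perm = compose_permutes(fused_perm, p)
    return None if fused_perm == identity else fused_perm

no_op = fuse_permute_chain([(1, 2, 0), (2, 0, 1)], rank=3)  # chain cancels out
single = fuse_permute_chain([(1, 2, 0), (0, 2, 1)], rank=3)
```

This is also why layout-conversion passes are cheap to be liberal with: inserted permute pairs that cancel collapse to the identity and disappear.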
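FuseChainedReshapes is the simplest of the three fusions: since each reshape preserves the element count, only the final target shape matters. A sketch with a hypothetical helper:

```python
import math

def fuse_reshape_chain(input_shape, shapes):
    """Collapse reshape1 -> reshape2 -> ... into the last reshape,
    verifying every step preserves the element count."""
    count = math.prod(input_shape)
    for s in shapes:
        assert math.prod(s) == count, "reshape must preserve element count"
    return shapes[-1]  # intermediate shapes are irrelevant to the result

target = fuse_reshape_chain((2, 3, 4), [(6, 4), (4, 6), (24,)])
```

As with concat fusion, this is only valid when the intermediate reshape outputs aren't consumed by other ops.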