A framework to accelerate INT8 convolution in Synet Framework.
Functions
| SIMD_API void * | SimdSynetConvolution8iInit (size_t batch, const SimdConvolutionParameters *conv, SimdSynetCompatibilityType compatibility) |
| Initializes an INT8 convolution context. | |
| SIMD_API size_t | SimdSynetConvolution8iExternalBufferSize (const void *context) |
| Gets the size in bytes of caller-provided temporary buffer for INT8 convolution. | |
| SIMD_API size_t | SimdSynetConvolution8iInternalBufferSize (const void *context) |
| Gets the size in bytes of internal storage used by an INT8 convolution context. | |
| SIMD_API const char * | SimdSynetConvolution8iInfo (const void *context) |
| Gets a short description of the selected INT8 convolution implementation. | |
| SIMD_API void | SimdSynetConvolution8iSetParams (void *context, const float *weight, const float *bias, const float *params, const float *const *stats) |
| Sets weights, bias, activation parameters and tensor statistics for INT8 convolution. | |
| SIMD_API void | SimdSynetConvolution8iForward (void *context, const uint8_t *src, uint8_t *buf, uint8_t *dst) |
| Performs forward propagation of INT8 convolution. | |
Function Documentation
◆ SimdSynetConvolution8iInit()
void * SimdSynetConvolution8iInit (size_t batch, const SimdConvolutionParameters * conv, SimdSynetCompatibilityType compatibility)
Initializes an INT8 convolution context.
The function validates convolution parameters and chooses a suitable implementation (GEMM, NHWC direct, NHWC depthwise or architecture-specific VNNI/AMX/NEON variant when available). It supports FP32 or UINT8 source and destination tensors with matching NCHW or NHWC format. The destination spatial size must match convolution parameters:
dstH = (srcH + padY + padH - (dilationY*(kernelY - 1) + 1)) / strideY + 1
dstW = (srcW + padX + padW - (dilationX*(kernelX - 1) + 1)) / strideX + 1
A created context stores tensor shape, data types, format, convolution geometry, group count, activation type and compatibility flags. FP32 weights, bias, activation parameters and tensor statistics are attached later by SimdSynetConvolution8iSetParams.
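As a quick check of the size formula above, the following sketch computes one output dimension in plain C. The helper name conv_out_dim is hypothetical and not part of the library; it only restates the formula with integer (truncating) division.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: output extent along one axis.
   padBegin/padEnd stand for padY/padH (height) or padX/padW (width). */
static size_t conv_out_dim(size_t src, size_t padBegin, size_t padEnd,
                           size_t kernel, size_t dilation, size_t stride)
{
    assert(kernel > 0 && stride > 0);
    size_t extent = dilation * (kernel - 1) + 1; /* dilated kernel footprint */
    assert(src + padBegin + padEnd >= extent);   /* kernel must fit the padded input */
    return (src + padBegin + padEnd - extent) / stride + 1;
}
```

For example, a 224-element axis with padding 3 on each side, a 7-tap kernel, dilation 1 and stride 2 gives (224 + 6 - 7) / 2 + 1 = 112.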
- Parameters
  - [in] batch - a batch size.
  - [in] conv - a pointer to convolution parameters. Source and destination tensor types must be FP32 or UINT8.
  - [in] compatibility - calculation compatibility flags. They select precise, overflow or narrowed INT8 calculation mode. Narrowed mode uses unsigned range [0, 180] and signed range [-90, 90]; otherwise ranges are [0, 255] and [-128, 127].
- Returns
- a pointer to the INT8 convolution context, or NULL on error. The context must be released with the function SimdRelease. This pointer is passed to the functions SimdSynetConvolution8iExternalBufferSize, SimdSynetConvolution8iInternalBufferSize, SimdSynetConvolution8iInfo, SimdSynetConvolution8iSetParams and SimdSynetConvolution8iForward.
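The value ranges quoted for the compatibility parameter can be written down as a small table. The sketch below is illustrative only: the Int8Ranges type and both function names are hypothetical, and only the numeric limits come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical container for the unsigned (activations) and signed (weights) limits. */
typedef struct { int uLower, uUpper, iLower, iUpper; } Int8Ranges;

/* Returns the limits documented for narrowed and non-narrowed modes. */
static Int8Ranges int8_ranges(bool narrowed)
{
    Int8Ranges r;
    r.uLower = 0;
    r.uUpper = narrowed ? 180 : 255;
    r.iLower = narrowed ? -90 : -128;
    r.iUpper = narrowed ? 90 : 127;
    return r;
}

/* Clamps value to [lower, upper]. */
static int restrict_range(int value, int lower, int upper)
{
    assert(lower <= upper);
    return value < lower ? lower : (value > upper ? upper : value);
}
```

With these helpers, restrict_range(200, 0, int8_ranges(true).uUpper) yields 180, while the same value passes through unchanged in non-narrowed mode.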
◆ SimdSynetConvolution8iExternalBufferSize()
size_t SimdSynetConvolution8iExternalBufferSize (const void * context)
Gets the size in bytes of caller-provided temporary buffer for INT8 convolution.
The returned value is a number of bytes. It depends on the implementation selected during initialization and can be used to allocate the buf argument of SimdSynetConvolution8iForward. The buffer can contain temporary UINT8 source conversion data, im2col/padded input data, INT32 sums and temporary FP32 output data.
- Parameters
  - [in] context - a pointer to INT8 convolution context. It must be created by function SimdSynetConvolution8iInit and released by function SimdRelease.
- Returns
- a number of bytes required for external temporary buffer.
◆ SimdSynetConvolution8iInternalBufferSize()
size_t SimdSynetConvolution8iInternalBufferSize (const void * context)
Gets the size in bytes of internal storage used by an INT8 convolution context.
The returned value reports internal storage tracked by the selected implementation, including internal temporary buffers, quantized/reordered INT8 weights, source and destination conversion parameters, normalization, bias and activation parameters.
- Parameters
  - [in] context - a pointer to INT8 convolution context. It must be created by function SimdSynetConvolution8iInit and released by function SimdRelease.
- Returns
- a number of bytes used by internal buffers.
◆ SimdSynetConvolution8iInfo()
const char * SimdSynetConvolution8iInfo (const void * context)
Gets a short description of the selected INT8 convolution implementation.
The returned string contains the implementation extension and algorithm name, for example a GEMM, NHWC direct or NHWC depthwise variant, with a suffix for precise, overflow or narrowed mode when applicable. The returned pointer is owned by the context and remains valid until the next call of this function for the same context or until the context is released.
- Parameters
  - [in] context - a pointer to INT8 convolution context. It must be created by function SimdSynetConvolution8iInit and released by function SimdRelease.
- Returns
- a string with description of internal implementation of INT8 convolution algorithm.
◆ SimdSynetConvolution8iSetParams()
void SimdSynetConvolution8iSetParams (void * context, const float * weight, const float * bias, const float * params, const float *const * stats)
Sets weights, bias, activation parameters and tensor statistics for INT8 convolution.
This function must be called before SimdSynetConvolution8iForward. The weight array contains FP32 convolution weights with kernelY*kernelX*srcC*dstC/group elements. Source statistics (stats[0], stats[1], each with srcC elements) define per-channel source quantization parameters; destination statistics (stats[2], stats[3], each with dstC elements) define per-channel output quantization parameters. The selected implementation converts weights to INT8, may reorder them, and computes per-output-channel normalization and bias terms used to convert INT32 sums back to FP32. Activation parameters are copied or expanded internally according to SimdConvolutionActivationType.
- Parameters
  - [in,out] context - a pointer to INT8 convolution context. It must be created by function SimdSynetConvolution8iInit and released by function SimdRelease.
  - [in] weight - a pointer to FP32 convolution weights.
  - [in] bias - a pointer to FP32 bias array with dstC elements. Can be NULL.
  - [in] params - a pointer to FP32 parameters of activation function (see SimdConvolutionActivationType). Can be NULL when activation does not require parameters.
  - [in] stats - a pointer to pointers with per-channel tensor statistics: source minimum stats[0], source maximum stats[1], destination minimum stats[2], destination maximum stats[3].
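To make the array sizes and the role of the statistics more concrete, here is a minimal sketch in plain C. It assumes a simple affine mapping of each channel's [min, max] interval onto the integer range [lower, upper]; the formula the library actually uses is not stated above and may differ. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of FP32 weights expected, as stated above:
   kernelY*kernelX*srcC*dstC/group. */
static size_t weight_count(size_t kernelY, size_t kernelX,
                           size_t srcC, size_t dstC, size_t group)
{
    assert(group > 0);
    return kernelY * kernelX * srcC * dstC / group;
}

/* ASSUMPTION: plain affine mapping of [minStat[c], maxStat[c]] onto
   [lower, upper]; not taken from the library source. */
static void stats_to_scale_shift(const float *minStat, const float *maxStat,
                                 size_t channels, float lower, float upper,
                                 float *scale, float *shift)
{
    assert(minStat && maxStat && scale && shift);
    for (size_t c = 0; c < channels; ++c)
    {
        float range = maxStat[c] - minStat[c];
        /* Guard against a degenerate (constant) channel. */
        scale[c] = range > 0.0f ? (upper - lower) / range : 1.0f;
        shift[c] = lower - minStat[c] * scale[c];
    }
}
```

Under this assumption a channel with statistics [-1, 1] mapped onto [0, 255] gets scale 127.5 and shift 127.5, so that -1 maps to 0 and 1 maps to 255.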
◆ SimdSynetConvolution8iForward()
void SimdSynetConvolution8iForward (void * context, const uint8_t * src, uint8_t * buf, uint8_t * dst)
Performs forward propagation of INT8 convolution.
The function converts FP32 input to UINT8 when the context source type is FP32, uses UINT8 input directly when the source type is UINT8, accumulates convolution sums in INT32 with INT8 weights, converts sums to FP32 using internal normalization and bias, applies activation, and writes FP32 or UINT8 output according to the context destination type:
if(srcT == SimdTensorData32f)
    src8u = restrict(round(src32f*srcScale[c] + srcShift[c]), srcLower, srcUpper);
sum = convolution_int32(src8u, weight8i, zero);
value = Activate(sum*norm[dc] + bias[dc], activation, params);
dst[outputOffset] = dstT == SimdTensorData8u ?
    restrict(round(value*dstScale[dc] + dstShift[dc]), dstLower, dstUpper) : value;
The exact offsets depend on tensor format, padding, dilation, stride and group. The input and output tensors use the shape, data types and format from the context created by SimdSynetConvolution8iInit.
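The pseudocode above can be traced for a single output element with a few lines of plain C. This is a sketch under stated assumptions, not the library's implementation: padding and the zero value are left out, ReLU stands in for the configured activation, rounding is half-up on non-negative values, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FP32 -> UINT8: scale, shift, clamp to [lower, upper], then round.
   Clamping in float first keeps the integer cast in range. */
static uint8_t quantize_u8(float value, float scale, float shift,
                           int lower, int upper)
{
    assert(0 <= lower && lower <= upper && upper <= 255);
    float q = value * scale + shift;
    if (q < (float)lower) q = (float)lower;
    if (q > (float)upper) q = (float)upper;
    return (uint8_t)(int)(q + 0.5f); /* q >= 0, so this rounds half up */
}

/* INT32 accumulation of UINT8 inputs with INT8 weights for one output
   element, followed by conversion to FP32 and an example activation. */
static float int8_dot_to_float(const uint8_t *src8u, const int8_t *weight8i,
                               size_t count, float norm, float bias)
{
    int32_t sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += (int32_t)src8u[i] * (int32_t)weight8i[i];
    float value = (float)sum * norm + bias;
    return value > 0.0f ? value : 0.0f; /* ReLU, chosen only for illustration */
}
```

When the destination type is UINT8, the result of int8_dot_to_float would be passed through quantize_u8 again with the destination scale, shift and limits; otherwise it is stored as FP32.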
- Parameters
  - [in] context - a pointer to INT8 convolution context. It must be created by function SimdSynetConvolution8iInit and released by function SimdRelease.
  - [in] src - a pointer to input tensor. Actual element type is defined by srcT in convolution parameters.
  - [out] buf - a pointer to external temporary byte buffer. The required size is determined by function SimdSynetConvolution8iExternalBufferSize. Can be NULL (it causes usage of internal buffer).
  - [out] dst - a pointer to output tensor. Actual element type is defined by dstT in convolution parameters.