Simd Library Documentation.

Quantized convolution framework

A framework to accelerate quantized convolution in the Synet Framework.

Functions

SIMD_API void * SimdSynetQuantizedConvolutionInit (size_t batch, const SimdConvolutionParameters *conv)
 Initializes UINT8-to-UINT8 quantized convolution algorithm.

SIMD_API size_t SimdSynetQuantizedConvolutionExternalBufferSize (const void *context)
 Gets size in bytes of external temporary buffer required for quantized convolution.

SIMD_API size_t SimdSynetQuantizedConvolutionInternalBufferSize (const void *context)
 Gets size in bytes of internal buffers allocated by quantized convolution context.

SIMD_API const char * SimdSynetQuantizedConvolutionInfo (const void *context)
 Gets description of selected quantized convolution implementation.

SIMD_API void SimdSynetQuantizedConvolutionSetParams (void *context, const float *ioScale, const uint8_t *ioZero, const int8_t *weight, const float *weightScale, const int32_t *bias, const float *params)
 Sets quantization parameters, weights, bias and activation parameters for quantized convolution.

SIMD_API void SimdSynetQuantizedConvolutionForward (void *context, const uint8_t *src, uint8_t *buf, uint8_t *dst)
 Performs forward propagation of quantized convolution.
 

Detailed Description

A framework to accelerate quantized convolution in the Synet Framework.

Function Documentation

◆ SimdSynetQuantizedConvolutionInit()

void * SimdSynetQuantizedConvolutionInit ( size_t  batch,
const SimdConvolutionParameters *  conv 
)

Initializes UINT8-to-UINT8 quantized convolution algorithm.

The convolution parameters have to describe a valid 2D convolution with equal source and destination tensor formats (SimdTensorFormatNchw or SimdTensorFormatNhwc) and UINT8 source and destination tensors. The implementation uses signed 8-bit weights, per-output-channel weight scales, optional bias and optional activation from SimdConvolutionActivationType.

Parameters
[in]batch- a batch size.
[in]conv- a pointer to convolution parameters (shape, kernel, stride, dilation, padding, group, tensor format, activation and data types).
Returns
a pointer to the quantized convolution context, or NULL on error. The context must be released with function SimdRelease. This pointer is used in functions SimdSynetQuantizedConvolutionExternalBufferSize, SimdSynetQuantizedConvolutionInternalBufferSize, SimdSynetQuantizedConvolutionInfo, SimdSynetQuantizedConvolutionSetParams and SimdSynetQuantizedConvolutionForward.

◆ SimdSynetQuantizedConvolutionExternalBufferSize()

size_t SimdSynetQuantizedConvolutionExternalBufferSize ( const void *  context)

Gets size in bytes of external temporary buffer required for quantized convolution.

Parameters
[in]context- a pointer to Quantized convolution context. It must be created by function SimdSynetQuantizedConvolutionInit and released by function SimdRelease.
Returns
size in bytes of external temporary buffer required for quantized convolution. The value depends on the selected implementation and can be 0.

◆ SimdSynetQuantizedConvolutionInternalBufferSize()

size_t SimdSynetQuantizedConvolutionInternalBufferSize ( const void *  context)

Gets size in bytes of internal buffers allocated by quantized convolution context.

Parameters
[in]context- a pointer to Quantized convolution context. It must be created by function SimdSynetQuantizedConvolutionInit and released by function SimdRelease.
Returns
size in bytes of internal buffers used to store reordered weights, biases, quantization parameters and an optional fallback temporary buffer.

◆ SimdSynetQuantizedConvolutionInfo()

const char * SimdSynetQuantizedConvolutionInfo ( const void *  context)

Gets description of selected quantized convolution implementation.

Parameters
[in]context- a pointer to Quantized convolution context. It must be created by function SimdSynetQuantizedConvolutionInit and released by function SimdRelease.
Returns
string with description of selected implementation (extension and algorithm name).

◆ SimdSynetQuantizedConvolutionSetParams()

void SimdSynetQuantizedConvolutionSetParams ( void *  context,
const float *  ioScale,
const uint8_t *  ioZero,
const int8_t *  weight,
const float *  weightScale,
const int32_t *  bias,
const float *  params 
)

Sets quantization parameters, weights, bias and activation parameters for quantized convolution.

Parameter ioScale contains source, intermediate and destination scales in this order. Parameter ioZero contains source, intermediate and destination zero points in the same order. The implementation folds source zero into bias and computes per-output-channel normalization as srcScale*weightScale[c]/dstScale for identity activation or srcScale*weightScale[c]/intScale for other activations.
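The normalization described above can be sketched in plain C. This is only an illustration of the stated formula, not the library's internal code: the helper names compute_norm and requantize are hypothetical, and the rounding and clamping used to bring an INT32 accumulator back to UINT8 are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fills norm[c] for every output channel.
   ioScale holds source, intermediate and destination scales in this order.
   useIntermediate != 0 selects the intermediate scale (non-identity activation). */
static void compute_norm(const float ioScale[3], const float *weightScale,
                         size_t dstC, int useIntermediate, float *norm)
{
    float srcScale = ioScale[0];
    float outScale = useIntermediate ? ioScale[1] : ioScale[2];
    for (size_t c = 0; c < dstC; ++c)
        norm[c] = srcScale * weightScale[c] / outScale;
}

/* Hypothetical helper: scales an INT32 accumulator, adds the zero point
   and saturates to the UINT8 range (assumed round-half-away-from-zero). */
static uint8_t requantize(int32_t acc, float norm, uint8_t zero)
{
    float x = (float)acc * norm;
    int32_t v = (int32_t)(x + (x >= 0.0f ? 0.5f : -0.5f)) + (int32_t)zero;
    if (v < 0) v = 0;
    if (v > 255) v = 255;
    return (uint8_t)v;
}
```

With ioScale = {0.5, 0.25, 0.125} and weightScale[c] = 1, the identity case gives norm[c] = 0.5*1/0.125 = 4, while any other activation gives 0.5*1/0.25 = 2.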

Parameters
[in,out]context- a pointer to Quantized convolution context. It must be created by function SimdSynetQuantizedConvolutionInit and released by function SimdRelease.
[in]ioScale- a pointer to 3 FP32 scales: source, intermediate and destination, in this order.
[in]ioZero- a pointer to 3 UINT8 zero points: source, intermediate and destination, in this order.
[in]weight- a pointer to INT8 convolution weights. Its layout is defined by convolution tensor format.
[in]weightScale- a pointer to per-output-channel FP32 weight scales. The size of the array must be equal to conv->dstC.
[in]bias- a pointer to per-output-channel INT32 bias. Can be NULL.
[in]params- a pointer to FP32 activation parameters (see SimdConvolutionActivationType). Can be NULL.
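Folding the source zero point into the bias can be illustrated as follows. The sketch relies on the identity sum((src[k] - srcZero)*w[k]) + bias = sum(src[k]*w[k]) + (bias - srcZero*sum(w[k])). The function name fold_src_zero is hypothetical, and the weights are assumed to be stored as dstC rows of K values purely for illustration; the layout actually expected by the library is defined by the convolution tensor format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: folded[c] = bias[c] - srcZero * sum of weights of channel c.
   weight is assumed to be laid out as [dstC][K]; bias may be NULL (treated as 0). */
static void fold_src_zero(const int8_t *weight, size_t dstC, size_t K,
                          const int32_t *bias, uint8_t srcZero, int32_t *folded)
{
    for (size_t c = 0; c < dstC; ++c)
    {
        int32_t sum = 0;
        for (size_t k = 0; k < K; ++k)
            sum += weight[c * K + k];
        folded[c] = (bias ? bias[c] : 0) - (int32_t)srcZero * sum;
    }
}
```

After such folding the inner loop can multiply raw UINT8 source values by INT8 weights without subtracting the zero point per element.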

◆ SimdSynetQuantizedConvolutionForward()

void SimdSynetQuantizedConvolutionForward ( void *  context,
const uint8_t *  src,
uint8_t *  buf,
uint8_t *  dst 
)

Performs forward propagation of quantized convolution.

Parameters
[in]context- a pointer to Quantized convolution context. It must be created by function SimdSynetQuantizedConvolutionInit and released by function SimdRelease.
[in]src- a pointer to UINT8 input tensor with size batch*srcC*srcH*srcW.
[out]buf- a pointer to external temporary buffer. Its size is determined by function SimdSynetQuantizedConvolutionExternalBufferSize. Can be NULL, in which case the context uses an internal buffer.
[out]dst- a pointer to UINT8 output tensor with size batch*dstC*dstH*dstW.
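The tensor sizes listed above can be computed before allocating src and dst. The sketch below is a minimal illustration with hypothetical helper names. The output-dimension formula is the usual one for 2D convolution and is an assumption here, not taken from this page; the values of dstH and dstW passed in the convolution parameters remain authoritative.

```c
#include <stddef.h>

/* Hypothetical helper: usual output extent of a convolution along one axis
   (assumes the padded input is at least as large as the dilated kernel). */
static size_t conv_out_dim(size_t src, size_t kernel, size_t dilation,
                           size_t stride, size_t padBegin, size_t padEnd)
{
    size_t extent = dilation * (kernel - 1) + 1; /* dilated kernel size */
    return (src + padBegin + padEnd - extent) / stride + 1;
}

/* Hypothetical helper: number of UINT8 elements (and bytes) in a 4D tensor. */
static size_t tensor_size(size_t batch, size_t c, size_t h, size_t w)
{
    return batch * c * h * w;
}
```

For example, a 224x224 source with a 3x3 kernel, stride 2 and padding 1 on each side yields a 112x112 destination, and a batch of 2 such sources with 3 channels occupies 2*3*224*224 = 301056 bytes.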