Simd Library Documentation.

QuantizedInnerProductLayer functions

A framework to accelerate QuantizedInnerProductLayer in Synet Framework.

Functions

SIMD_API void * SimdSynetQuantizedInnerProductInit (size_t M, size_t N, size_t K, SimdTensorDataType typeA, SimdTensorDataType typeB, SimdTensorDataType typeC, SimdBool transB, SimdBool constB, SimdBool bias)
 Initializes quantized inner product (matrix multiplication) algorithm for UINT8 input, INT8 weight and UINT8 output.
 
SIMD_API size_t SimdSynetQuantizedInnerProductInternalBufferSize (const void *context)
 Gets size in bytes of internal buffers allocated by quantized inner product context.
 
SIMD_API size_t SimdSynetQuantizedInnerProductExternalBufferSize (const void *context)
 Gets size in bytes of external temporary buffer required for quantized inner product.
 
SIMD_API const char * SimdSynetQuantizedInnerProductInfo (const void *context)
 Gets description of selected quantized inner product implementation.
 
SIMD_API void SimdSynetQuantizedInnerProductSetParams (void *context, const float *aScale, const uint8_t *aZero, const int8_t *b, const float *bScale, const int32_t *bias, const float *cScale, const uint8_t *cZero)
 Sets constant matrix B, bias and quantization parameters for quantized inner product.
 
SIMD_API void SimdSynetQuantizedInnerProductForward (void *context, const uint8_t *A, const uint8_t *B, uint8_t *buf, uint8_t *C)
 Performs forward propagation of quantized inner product.
 

Detailed Description

A framework to accelerate QuantizedInnerProductLayer in Synet Framework.

Function Documentation

◆ SimdSynetQuantizedInnerProductInit()

void * SimdSynetQuantizedInnerProductInit ( size_t  M,
size_t  N,
size_t  K,
SimdTensorDataType  typeA,
SimdTensorDataType  typeB,
SimdTensorDataType  typeC,
SimdBool  transB,
SimdBool  constB,
SimdBool  bias 
)

Initializes quantized inner product (matrix multiplication) algorithm for UINT8 input, INT8 weight and UINT8 output.

The current implementation requires constB to be SimdTrue. Matrix B is supplied to SimdSynetQuantizedInnerProductSetParams and may be stored transposed according to transB.

Algorithm details, including the final requantization of the output (transB = false, bias = true):

for(i = 0; i < M; ++i)
    for(j = 0; j < N; ++j)
    {
        sum = bias[j] - aZero*Sum(B[:,j]);
        for(k = 0; k < K; ++k)
            sum += A[i,k] * B[k,j];
        C[i,j] = RestrictRange(Round(sum*aScale*bScale[j]/cScale) + cZero, 0, 255);
    }
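
The pseudocode above can be sketched as a plain scalar C routine. This is only an illustrative reference under stated assumptions, not the library's implementation: the names quantized_inner_product_ref, round_half_away and restrict_range_u8 are hypothetical, matrices are assumed to be stored row-major, and Round is assumed to round half away from zero (the rounding mode used by the library is not stated here).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds to nearest, halves away from zero (assumed mode). */
static long round_half_away(float x)
{
    return (long)(x >= 0.0f ? x + 0.5f : x - 0.5f);
}

/* Hypothetical helper: clamps a value to the UINT8 range [0, 255]. */
static uint8_t restrict_range_u8(long v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar reference, not a Simd Library function.
   A is M*K (UINT8), B is K*N (INT8, transB = false), C is M*N (UINT8).
   bias may be NULL, which is treated as a zero bias. */
static void quantized_inner_product_ref(size_t M, size_t N, size_t K,
    const uint8_t *A, float aScale, uint8_t aZero,
    const int8_t *B, const float *bScale, const int32_t *bias,
    float cScale, uint8_t cZero, uint8_t *C)
{
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            /* Sum(B[:,j]); it does not depend on i, so it could be computed once per column. */
            int32_t colSum = 0;
            for (size_t k = 0; k < K; ++k)
                colSum += B[k * N + j];

            /* Starting from bias[j] - aZero*Sum(B[:,j]) makes the loop below
               equivalent to accumulating (A[i,k] - aZero) * B[k,j]. */
            int32_t sum = (bias ? bias[j] : 0) - (int32_t)aZero * colSum;
            for (size_t k = 0; k < K; ++k)
                sum += (int32_t)A[i * K + k] * (int32_t)B[k * N + j];

            /* Requantization to the UINT8 output scale and zero point. */
            float scaled = (float)sum * aScale * bScale[j] / cScale;
            C[i * N + j] = restrict_range_u8(round_half_away(scaled) + cZero);
        }
    }
}
```

A routine of this kind is mainly useful as a slow reference when checking the output of an optimized implementation on small matrices.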
Parameters
[in] M - the height of matrices A and C.
[in] N - the width of matrices B and C.
[in] K - the width of matrix A and the height of matrix B.
[in] typeA - the type of matrix A. Currently it must be SimdTensorData8u.
[in] typeB - the type of matrix B. Currently it must be SimdTensorData8i.
[in] typeC - the type of matrix C. Currently it must be SimdTensorData8u.
[in] transB - a flag indicating that matrix B is stored transposed (N*K instead of K*N).
[in] constB - a flag indicating that matrix B is constant. Currently it must be SimdTrue.
[in] bias - a flag indicating that bias is added to output matrix C.
Returns
a pointer to quantized inner product context, or NULL on error. It must be released with the function SimdRelease. This pointer is used in functions SimdSynetQuantizedInnerProductInternalBufferSize, SimdSynetQuantizedInnerProductExternalBufferSize, SimdSynetQuantizedInnerProductInfo, SimdSynetQuantizedInnerProductSetParams and SimdSynetQuantizedInnerProductForward.
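
The transB flag described in the parameters above changes where the logical element B[k,j] is located in memory. The following sketch illustrates the two layouts, assuming row-major storage (K rows of N elements when transB is false, N rows of K elements when it is true). The helpers b_at and transpose_b are hypothetical and are not part of the Simd Library API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the logical element B[k, j] for either layout. */
static int8_t b_at(const int8_t *B, size_t K, size_t N, size_t k, size_t j, int transB)
{
    /* transB != 0: storage is N*K, so column j of B is a contiguous run of K values. */
    return transB ? B[j * K + k] : B[k * N + j];
}

/* Hypothetical helper: converts K*N storage (transB = false)
   into N*K storage (transB = true). dst must hold K*N elements. */
static void transpose_b(const int8_t *src, size_t K, size_t N, int8_t *dst)
{
    for (size_t k = 0; k < K; ++k)
        for (size_t j = 0; j < N; ++j)
            dst[j * K + k] = src[k * N + j];
}
```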

◆ SimdSynetQuantizedInnerProductInternalBufferSize()

size_t SimdSynetQuantizedInnerProductInternalBufferSize ( const void *  context)

Gets size in bytes of internal buffers allocated by quantized inner product context.

Parameters
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
Returns
size in bytes of internal buffers used to store constant B, bias, zero points, scales and an optional fallback temporary buffer.

◆ SimdSynetQuantizedInnerProductExternalBufferSize()

size_t SimdSynetQuantizedInnerProductExternalBufferSize ( const void *  context)

Gets size in bytes of external temporary buffer required for quantized inner product.

Parameters
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
Returns
size in bytes of external temporary buffer required by SimdSynetQuantizedInnerProductForward.

◆ SimdSynetQuantizedInnerProductInfo()

const char * SimdSynetQuantizedInnerProductInfo ( const void *  context)

Gets description of selected quantized inner product implementation.

Parameters
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
Returns
string with description of selected implementation (extension and algorithm name).

◆ SimdSynetQuantizedInnerProductSetParams()

void SimdSynetQuantizedInnerProductSetParams ( void *  context,
const float *  aScale,
const uint8_t *  aZero,
const int8_t *  b,
const float *  bScale,
const int32_t *  bias,
const float *  cScale,
const uint8_t *  cZero 
)

Sets constant matrix B, bias and quantization parameters for quantized inner product.

Parameters
[in,out] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
[in] aScale - a pointer to the FP32 quantization scale of matrix A.
[in] aZero - a pointer to the UINT8 quantization zero of matrix A.
[in] b - a pointer to the constant INT8 matrix B. It must be valid when constB is SimdTrue.
[in] bScale - a pointer to the per-output-channel FP32 scales of matrix B. The size of the array must be equal to N.
[in] bias - a pointer to the INT32 bias values. The size of the array must be equal to N. Can be NULL.
[in] cScale - a pointer to the FP32 quantization scale of matrix C.
[in] cZero - a pointer to the UINT8 quantization zero of matrix C.
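
How the values passed to this function are obtained is outside the scope of this page. The sketch below shows one common way to derive them from FP32 weights and bias, and rests on several assumptions: symmetric per-output-channel quantization of B (suggested by B having scales but no zero point), row-major K*N weights, rounding half away from zero, and a bias expressed in units of aScale*bScale[j] (which follows from the accumulator being multiplied by aScale*bScale[j] in the algorithm description). The functions round_nearest, quantize_weights and quantize_bias are hypothetical and are not part of the Simd Library API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rounds to nearest, halves away from zero. */
static int32_t round_nearest(float x)
{
    return (int32_t)(x >= 0.0f ? x + 0.5f : x - 0.5f);
}

/* Hypothetical helper: symmetric per-column INT8 quantization of
   FP32 weights W (K*N, row-major). Fills b (K*N) and bScale (N). */
static void quantize_weights(const float *W, size_t K, size_t N, int8_t *b, float *bScale)
{
    for (size_t j = 0; j < N; ++j)
    {
        /* Largest magnitude in column j defines the scale. */
        float maxAbs = 0.0f;
        for (size_t k = 0; k < K; ++k)
        {
            float v = W[k * N + j];
            float a = v < 0.0f ? -v : v;
            if (a > maxAbs)
                maxAbs = a;
        }
        /* An all-zero column gets scale 1 to avoid division by zero. */
        bScale[j] = maxAbs > 0.0f ? maxAbs / 127.0f : 1.0f;
        for (size_t k = 0; k < K; ++k)
            b[k * N + j] = (int8_t)round_nearest(W[k * N + j] / bScale[j]);
    }
}

/* Hypothetical helper: converts FP32 bias to INT32 accumulator units. */
static void quantize_bias(const float *biasF, size_t N, float aScale, const float *bScale, int32_t *bias)
{
    for (size_t j = 0; j < N; ++j)
        bias[j] = round_nearest(biasF[j] / (aScale * bScale[j]));
}
```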

◆ SimdSynetQuantizedInnerProductForward()

void SimdSynetQuantizedInnerProductForward ( void *  context,
const uint8_t *  A,
const uint8_t *  B,
uint8_t *  buf,
uint8_t *  C 
)

Performs forward propagation of quantized inner product.

Parameters
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
[in] A - a pointer to the UINT8 matrix A with size M*K.
[in] B - a pointer to the INT8 matrix B. Can be NULL when B was set by SimdSynetQuantizedInnerProductSetParams.
[out] buf - a pointer to the external temporary buffer. Its size is determined by function SimdSynetQuantizedInnerProductExternalBufferSize. Can be NULL, in which case an internal buffer is used.
[out] C - a pointer to the UINT8 matrix C with size M*N.