QuantizedInnerProductLayer functions
A framework to accelerate QuantizedInnerProductLayer in Synet Framework.
Functions
| SIMD_API void * | SimdSynetQuantizedInnerProductInit (size_t M, size_t N, size_t K, SimdTensorDataType typeA, SimdTensorDataType typeB, SimdTensorDataType typeC, SimdBool transB, SimdBool constB, SimdBool bias) |
| | Initializes quantized inner product (matrix multiplication) algorithm. |
| SIMD_API size_t | SimdSynetQuantizedInnerProductInternalBufferSize (const void *context) |
| | Gets size in bytes of internal buffer used inside quantized inner product algorithm. |
| SIMD_API size_t | SimdSynetQuantizedInnerProductExternalBufferSize (const void *context) |
| | Gets size in bytes of external buffer used in quantized inner product algorithm. |
| SIMD_API const char * | SimdSynetQuantizedInnerProductInfo (const void *context) |
| | Gets string with description of internal implementation of quantized inner product algorithm. |
| SIMD_API void | SimdSynetQuantizedInnerProductSetParams (void *context, const float *aScale, const uint8_t *aZero, const int8_t *b, const float *bScale, const int32_t *bias, const float *cScale, const uint8_t *cZero) |
| | Sets weights, biases, input/output parameters required for quantized inner product algorithm. |
| SIMD_API void | SimdSynetQuantizedInnerProductForward (void *context, const uint8_t *A, const uint8_t *B, uint8_t *buf, uint8_t *C) |
| | Performs forward propagation of quantized inner product algorithm. |
Detailed Description
A framework to accelerate QuantizedInnerProductLayer in Synet Framework.
Function Documentation
◆ SimdSynetQuantizedInnerProductInit()
void * SimdSynetQuantizedInnerProductInit(size_t M, size_t N, size_t K, SimdTensorDataType typeA, SimdTensorDataType typeB, SimdTensorDataType typeC, SimdBool transB, SimdBool constB, SimdBool bias)
Initializes quantized inner product (matrix multiplication) algorithm.
Algorithm details (transB = false, bias = true):
for(i = 0; i < M; ++i)
    for(j = 0; j < N; ++j)
    {
        C[i,j] = bias[j];
        for(k = 0; k < K; ++k)
            C[i,j] += A[i,k] * B[k,j];
    }
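The loop above can be written out as plain C for checking results. This is a minimal reference sketch, not the library's implementation: the name `ReferenceInnerProduct`, the use of `float` for all matrices, the row-major storage, and the N x K layout assumed for B when `transB` is set are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Unoptimized reference: C[M x N] = A[M x K] * B + bias (hypothetical helper).
   All matrices are assumed to be row-major.
   transB == 0: B is stored as K x N, element (k, j) is B[k * N + j].
   transB != 0: B is assumed to be stored as N x K, element (k, j) is B[j * K + k].
   bias may be NULL, which corresponds to bias = false. */
static void ReferenceInnerProduct(size_t M, size_t N, size_t K,
    const float* A, const float* B, const float* bias, int transB, float* C)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            float sum = bias ? bias[j] : 0.0f;
            for (size_t k = 0; k < K; ++k)
            {
                float b = transB ? B[j * K + k] : B[k * N + j];
                sum += A[i * K + k] * b;
            }
            C[i * N + j] = sum;
        }
    }
}
```

A reference like this is mainly useful for comparing against the output of an accelerated path on small matrices.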
- Parameters
-
[in] M - the height of matrices A and C.
[in] N - the width of matrices B and C.
[in] K - the width of matrix A and the height of matrix B.
[in] typeA - the type of matrix A. It can be FP32 or UINT8.
[in] typeB - the type of matrix B. It can be FP32 or INT8.
[in] typeC - the type of matrix C. It can be FP32 or UINT8.
[in] transB - a flag to transpose matrix B before multiplication.
[in] constB - a flag indicating that matrix B is constant.
[in] bias - a flag to add bias to output matrix C.
- Returns
- a pointer to the quantized inner product context, or NULL on error. The context must be released with function SimdRelease. This pointer is passed to functions SimdSynetQuantizedInnerProductInternalBufferSize, SimdSynetQuantizedInnerProductExternalBufferSize, SimdSynetQuantizedInnerProductInfo, SimdSynetQuantizedInnerProductSetParams and SimdSynetQuantizedInnerProductForward.
◆ SimdSynetQuantizedInnerProductInternalBufferSize()
size_t SimdSynetQuantizedInnerProductInternalBufferSize(const void * context)
Gets size in bytes of internal buffer used inside quantized inner product algorithm.
- Parameters
-
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
- Returns
- size in bytes of internal buffer used inside quantized inner product algorithm.
◆ SimdSynetQuantizedInnerProductExternalBufferSize()
size_t SimdSynetQuantizedInnerProductExternalBufferSize(const void * context)
Gets size in bytes of external buffer used in quantized inner product algorithm.
- Parameters
-
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
- Returns
- size in bytes of external buffer used in quantized inner product algorithm.
◆ SimdSynetQuantizedInnerProductInfo()
const char * SimdSynetQuantizedInnerProductInfo(const void * context)
Gets string with description of internal implementation of quantized inner product algorithm.
- Parameters
-
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
- Returns
- string with description of internal implementation of quantized inner product algorithm.
◆ SimdSynetQuantizedInnerProductSetParams()
void SimdSynetQuantizedInnerProductSetParams(void * context, const float * aScale, const uint8_t * aZero, const int8_t * b, const float * bScale, const int32_t * bias, const float * cScale, const uint8_t * cZero)
Sets weights, biases, input/output parameters required for quantized inner product algorithm.
- Parameters
-
[in,out] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
[in] aScale - a pointer to the 32-bit floating-point scale of input tensor A.
[in] aZero - a pointer to the 8-bit unsigned integer zero point of input tensor A.
[in] b - a pointer to the 8-bit integer input tensor B. Can be NULL.
[in] bScale - a pointer to the 32-bit floating-point weight scale.
[in] bias - a pointer to the 32-bit integer bias. Can be NULL.
[in] cScale - a pointer to the 32-bit floating-point scale of output tensor C.
[in] cZero - a pointer to the 8-bit unsigned integer zero point of output tensor C.
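The scale and zero parameters above are the usual ingredients of affine quantization. As a minimal sketch, assuming the common mapping real = scale * (q - zero) for 8-bit unsigned tensors, the conversion in both directions could look as follows. The helper names `DequantizeU8` and `QuantizeU8` are hypothetical, and this reference does not state the exact mapping or rounding mode the library uses.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed affine mapping (not confirmed by this reference):
   real = scale * (q - zero). */
static float DequantizeU8(uint8_t q, float scale, uint8_t zero)
{
    return scale * (float)((int)q - (int)zero);
}

/* Inverse mapping with saturation to [0, 255].
   Rounding is to nearest, which is an assumption of this sketch. */
static uint8_t QuantizeU8(float value, float scale, uint8_t zero)
{
    assert(scale > 0.0f);
    float q = value / scale + (float)zero;
    if (q < 0.0f)
        q = 0.0f;
    if (q > 255.0f)
        q = 255.0f;
    /* q is non-negative here, so adding 0.5 and truncating rounds to nearest. */
    return (uint8_t)(q + 0.5f);
}
```

Under this assumption, values outside the representable range saturate rather than wrap, which is the behaviour most quantized inference code relies on.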
◆ SimdSynetQuantizedInnerProductForward()
void SimdSynetQuantizedInnerProductForward(void * context, const uint8_t * A, const uint8_t * B, uint8_t * buf, uint8_t * C)
Performs forward propagation of quantized inner product algorithm.
- Parameters
-
[in] context - a pointer to quantized inner product context. It must be created by function SimdSynetQuantizedInnerProductInit and released by function SimdRelease.
[in] A - a pointer to matrix A.
[in] B - a pointer to matrix B. Can be NULL if B is a constant matrix. In that case B must be set with function SimdSynetQuantizedInnerProductSetParams.
[out] buf - a pointer to the external temporary buffer. Its size is determined by function SimdSynetQuantizedInnerProductExternalBufferSize. Can be NULL, in which case the internal buffer is used.
[out] C - a pointer to the output matrix.
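To make the data flow of the forward pass concrete, here is a plain C sketch of one plausible quantized computation for UINT8 A, INT8 B and UINT8 C. It is not the library's implementation. The function name `ReferenceQuantizedInnerProduct` is hypothetical, and the sketch assumes per-tensor `aScale`/`aZero` and `cScale`/`cZero`, symmetric weights with one `bScale` value per output column, a bias already expressed in units of aScale * bScale[j], a non-transposed row-major B, and round-to-nearest with saturation on output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference for a UINT8 x INT8 -> UINT8 inner product.
   Assumptions (not stated in this reference):
   - real A = aScale * (A - aZero), real B = bScale[j] * B (zero point 0);
   - bias[j] is in units of aScale * bScale[j] and may be NULL;
   - real C = cScale * (C - cZero), rounded to nearest and saturated. */
static void ReferenceQuantizedInnerProduct(size_t M, size_t N, size_t K,
    const uint8_t* A, float aScale, uint8_t aZero,
    const int8_t* B, const float* bScale, const int32_t* bias,
    float cScale, uint8_t cZero, uint8_t* C)
{
    assert(A != NULL && B != NULL && bScale != NULL && C != NULL);
    assert(cScale > 0.0f);
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            /* Accumulate in 32-bit integers, as the int32_t bias suggests. */
            int32_t acc = bias ? bias[j] : 0;
            for (size_t k = 0; k < K; ++k)
                acc += ((int32_t)A[i * K + k] - (int32_t)aZero) * (int32_t)B[k * N + j];
            /* Convert the accumulator to a real value, then requantize. */
            float real = (float)acc * aScale * bScale[j];
            float q = real / cScale + (float)cZero;
            if (q < 0.0f)
                q = 0.0f;
            if (q > 255.0f)
                q = 255.0f;
            C[i * N + j] = (uint8_t)(q + 0.5f);
        }
    }
}
```

For example, with aScale = 0.5, aZero = 128 and A = {130, 132}, the real inputs are {1.0, 2.0}, and the integer accumulator only needs to be rescaled once per output element.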