Matrix operations.
Functions
| SIMD_API void | SimdGemm32fNN (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc) |
| | Performs general matrix multiplication for row-major 32-bit floating-point matrices. |
| SIMD_API void | SimdGemm32fNT (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc) |
| | Performs general matrix multiplication with transposed B for row-major 32-bit floating-point matrices. |
Detailed Description
Matrix operations.
Function Documentation
◆ SimdGemm32fNN()
void SimdGemm32fNN (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc)
Performs general matrix multiplication for row-major 32-bit floating-point matrices.
A and B are used without transposition. Sum denotes summation over k from 0 to K - 1:
for(i = 0; i < M; ++i)
    for(j = 0; j < N; ++j)
        C[i*ldc + j] = alpha[0]*Sum(A[i*lda + k]*B[k*ldb + j]) + beta[0]*C[i*ldc + j];
- Note
- This function supports multithreading (see functions SimdGetThreadNumber and SimdSetThreadNumber).
- Parameters
[in] M - the height of matrices A and C.
[in] N - the width of matrices B and C.
[in] K - the width of matrix A and the height of matrix B.
[in] alpha - a pointer to the scalar multiplier of A*B.
[in] A - a pointer to the input matrix A.
[in] lda - the row stride of matrix A (in 32-bit floats).
[in] B - a pointer to the input matrix B.
[in] ldb - the row stride of matrix B (in 32-bit floats).
[in] beta - a pointer to the scalar multiplier of the original matrix C.
[in,out] C - a pointer to the input/output matrix C.
[in] ldc - the row stride of matrix C (in 32-bit floats).
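The formula and parameters above can be sketched as a plain, unoptimized C routine. This is only an illustration of the documented semantics, not the library's implementation: the name RefGemm32fNN is hypothetical, and the stride checks in the assert are an assumption about valid input rather than something the documentation states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of the library): a direct
   transcription of the documented formula for the non-transposed case. */
static void RefGemm32fNN(size_t M, size_t N, size_t K, const float *alpha,
    const float *A, size_t lda, const float *B, size_t ldb,
    const float *beta, float *C, size_t ldc)
{
    /* Assumption: each stride (counted in floats) covers a full logical row. */
    assert(lda >= K && ldb >= N && ldc >= N);
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            /* Sum over k of A[i][k] * B[k][j]. */
            float sum = 0.0f;
            for (size_t k = 0; k < K; ++k)
                sum += A[i * lda + k] * B[k * ldb + j];
            /* The original value of C is read before it is overwritten. */
            C[i * ldc + j] = alpha[0] * sum + beta[0] * C[i * ldc + j];
        }
    }
}
```

With the library linked, the same argument order would apply to the real call: SimdGemm32fNN(M, N, K, &alpha, A, lda, B, ldb, &beta, C, ldc). Because rows are addressed as i*lda, a stride larger than the logical width (for example lda = 4 with K = 3) lets a matrix occupy part of a padded or larger buffer.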
◆ SimdGemm32fNT()
void SimdGemm32fNT (size_t M, size_t N, size_t K, const float *alpha, const float *A, size_t lda, const float *B, size_t ldb, const float *beta, float *C, size_t ldc)
Performs general matrix multiplication with transposed B for row-major 32-bit floating-point matrices.
A is an M by K row-major matrix. B is stored as an N by K row-major matrix and is used as Trans(B) in the multiplication. Sum denotes summation over k from 0 to K - 1:
for(i = 0; i < M; ++i)
    for(j = 0; j < N; ++j)
        C[i*ldc + j] = alpha[0]*Sum(A[i*lda + k]*B[j*ldb + k]) + beta[0]*C[i*ldc + j];
- Note
- This function supports multithreading (see functions SimdGetThreadNumber and SimdSetThreadNumber).
- Parameters
[in] M - the height of matrices A and C.
[in] N - the height of matrix B and the width of matrix C.
[in] K - the width of matrices A and B.
[in] alpha - a pointer to the scalar multiplier of A*Trans(B).
[in] A - a pointer to the input matrix A.
[in] lda - the row stride of matrix A (in 32-bit floats).
[in] B - a pointer to the input matrix B.
[in] ldb - the row stride of matrix B (in 32-bit floats).
[in] beta - a pointer to the scalar multiplier of the original matrix C.
[in,out] C - a pointer to the input/output matrix C.
[in] ldc - the row stride of matrix C (in 32-bit floats).
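The transposed-B variant differs only in how B is indexed: row j of the stored N by K matrix supplies column j of the product. A minimal sketch of that indexing follows. The name RefGemm32fNT is hypothetical, the routine is an unoptimized illustration rather than the library's code, and the stride checks are an assumption about valid input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not part of the library): a direct
   transcription of the documented formula for the transposed-B case. */
static void RefGemm32fNT(size_t M, size_t N, size_t K, const float *alpha,
    const float *A, size_t lda, const float *B, size_t ldb,
    const float *beta, float *C, size_t ldc)
{
    /* Assumption: B is stored N by K, so both lda and ldb must cover K floats. */
    assert(lda >= K && ldb >= K && ldc >= N);
    for (size_t i = 0; i < M; ++i)
    {
        for (size_t j = 0; j < N; ++j)
        {
            /* Row i of A and row j of B are both read contiguously. */
            float sum = 0.0f;
            for (size_t k = 0; k < K; ++k)
                sum += A[i * lda + k] * B[j * ldb + k];
            C[i * ldc + j] = alpha[0] * sum + beta[0] * C[i * ldc + j];
        }
    }
}
```

In this layout the inner loop walks both operands with unit stride, which is one common reason to keep B in transposed form. The corresponding library call would take the same arguments: SimdGemm32fNT(M, N, K, &alpha, A, lda, B, ldb, &beta, C, ldc).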