2024 | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013
December 1, 2020 (version 4.6.96)
Algorithms
New features
- Base implementation of function AveragingBinarizationV2.
- SSE4.1, AVX2, AVX-512BW optimizations of function AlphaUnpremultiply.
Improving
- SSE2, AVX2, AVX-512BW and NEON optimizations of function MedianFilterSquare5x5.
- SSE2, AVX2, AVX-512F optimizations of function SynetSoftmaxLayerForward.
- Fewer calls to function CpuSocketNumber during initialization of Simd.
- Fewer calls to function CpuCoreNumber during initialization of Simd.
- Fewer calls to function CheckBit during initialization of Simd.
Bug fixing
- Compilation error in file SimdNeonSynetConvolution8i.cpp.
- Infinite loop in SynetConvolution32fNhwcDirect::OldReorderWeight (on Celeron CPU).
- Crash in SimdRuntime.h (on Celeron CPU).
- Crash in SimdGemm.h (on Celeron CPU).
- Function SimdSynetSpecifyTensorFormat returns incorrect value.
Test framework
New features
- Tests for verifying functionality of function AveragingBinarizationV2.
- Parameter '-lc' to litter the CPU cache between test runs.
Infrastructure
New features
- MSVS projects can be used from external solution.
Removing
November 4, 2020 (version 4.6.95)
Algorithms
New features
- AVX2, AVX-512BW and AVX-512VNNI optimizations of SynetMergedConvolution8iCdc class.
- AVX2, AVX-512BW and AVX-512VNNI optimizations of SynetMergedConvolution8iCd class.
- AVX2, AVX-512BW and AVX-512VNNI optimizations of SynetMergedConvolution8iDc class.
- SSE4.1, AVX2, AVX-512BW optimizations of function SynetConvert8uTo32f.
- Base implementation, SSE2, SSSE3, AVX2, AVX-512BW optimizations of function AlphaPremultiply.
- Base implementation of function AlphaUnpremultiply.
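
The AlphaPremultiply and AlphaUnpremultiply entries above refer to the usual pair of alpha operations on 4-channel 8-bit images. The following is a minimal scalar sketch of that idea, not the library's code: the helper names (Premultiply, Unpremultiply, PremultiplyRow), the rounding rule, the handling of alpha 0 and the assumption that alpha is the fourth byte of each pixel are all illustrative choices that these notes do not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scales one color channel by alpha / 255,
// rounding to the nearest integer.
inline uint8_t Premultiply(uint8_t color, uint8_t alpha)
{
    return static_cast<uint8_t>((color * alpha + 127) / 255);
}

// Hypothetical helper: divides a premultiplied channel by alpha / 255.
// Alpha 0 carries no color information, so this sketch returns 0 for it.
inline uint8_t Unpremultiply(uint8_t color, uint8_t alpha)
{
    if (alpha == 0)
        return 0;
    int value = (color * 255 + alpha / 2) / alpha;
    return static_cast<uint8_t>(std::min(value, 255));
}

// Premultiplies a packed 4-channel row in place.
// Assumption: alpha is stored as the fourth byte of every pixel.
inline void PremultiplyRow(uint8_t* pixels, size_t pixelCount)
{
    assert(pixels != nullptr || pixelCount == 0);
    for (size_t i = 0; i < pixelCount; ++i)
    {
        uint8_t* p = pixels + 4 * i;
        for (int c = 0; c < 3; ++c)
            p[c] = Premultiply(p[c], p[3]);
    }
}
```

Because premultiplication discards low-order bits, a round trip through Unpremultiply generally recovers the original color only approximately, and less precisely as alpha gets smaller.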
Bug fixing
- GCC v10 compilation error in file SimdGemm.h.
- Error in IECompatible method of SynetMergedConvolution8i.
Test framework
New features
- Tests for verifying functionality of function AlphaPremultiply.
- Tests for verifying functionality of function AlphaUnpremultiply.
Documentation
Bug fixing
- Missing references to C++ wrappers in the descriptions of API functions.
October 1, 2020 (version 4.6.94)
Algorithms
New features
- Base implementation of SynetMergedConvolution8i class.
- Base implementation of function SynetConvert8uTo32f.
- Base implementation and SSE4.1 optimizations of SynetMergedConvolution8iCdc class.
- Base implementation and SSE4.1 optimizations of SynetMergedConvolution8iCd class.
- Base implementation and SSE4.1 optimizations of SynetMergedConvolution8iDc class.
Bug fixing
- Performance degradation in class Convolution32fNhwcDirect (weights size >> L3 cache).
- Performance degradation in class Convolution32fGemmNN (weights size >> L3 cache).
Test framework
New features
- Tests for verifying functionality of SynetMergedConvolution8i class.
- Tests for verifying functionality of function SynetConvert8uTo32f.
Documentation
Improving
- Improve structuring of Synet documentation.
September 1, 2020 (version 4.6.93)
Algorithms
New features
- Full support of SimdConvolutionActivationType in SynetConvolution8i class.
- Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of SynetConvolution8iNhwcDepthwise class.
- Extend class MergedConvolution32f (2 merged convolutions).
- Base implementation, SSE2, AVX, AVX2, AVX-512F optimizations of MergedConvolution32fCd class.
- Base implementation, SSE2, AVX, AVX2, AVX-512F optimizations of MergedConvolution32fDc class.
Improving
- Reduction of compilation time and compiled size of Simd Library.
Renaming
- Class MergedConvolution32f to MergedConvolution32fCdc.
Bug fixing
- Performance degradation in class Convolution32fNhwcDirect (dilation != 1).
Test framework
New features
- Tests for verifying functionality of class MergedConvolution32f (2 merged convolutions).
August 3, 2020 (version 4.6.92)
Algorithms
New features
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetAdd8i.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function SynetInnerProduct8i.
Improving
- Reduction of compilation time and compiled size of Simd Library.
Bug fixing
- Error in SSE4.1, AVX2, AVX-512BW optimizations of SynetScale8i class (wrong alignment check).
- Error in performance annotation of SynetConvolution8i class.
- Compiler error in file SimdBaseSynetConvolution8i.cpp (for old compilers).
- Compiler errors in files SimdAvx2Synet.cpp, SimdAvx2SynetScale.cpp (WIN32, MSVS).
Test framework
New features
- Tests for verifying functionality of function SynetAdd8i.
- Tests for verifying functionality of function SynetInnerProduct8i.
July 1, 2020 (version 4.6.91)
Algorithms
New features
- Extend SimdSynetCompatibilityType enumeration.
- Add support of SimdSynetCompatibility8iNarrowed to Base implementation, SSE2, AVX2, AVX-512BW and NEON optimizations of function SynetConvert32fTo8u.
- Add support of SimdSynetCompatibility8iNarrowed to Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI and NEON optimizations of SynetConvolution8iNhwcDirect class.
- Add support of SimdConvolutionActivationPrelu to Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI and NEON optimizations of SynetConvolution8iNhwcDirect class.
- Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of SynetScale8i class.
Improving
- Reduction of the size of applications and shared libraries that use Simd as a static library.
Bug fixing
- Error in class SynetConvolution8i (batch > 1).
Test framework
New features
- Tests for verifying functionality of SynetScale8i framework.
June 3, 2020 (version 4.6.90)
Algorithms
New features
- Rgb24 format in Frame structure.
- Rgb24 format in Convert function.
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function RgbToGray.
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function RgbToBgra.
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function BgraToRgb.
- AVX2 optimization of function BgraToBgr.
- Function LitterCpuCache.
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function Yuv444pToRgb.
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function Yuv422pToRgb.
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function Yuv420pToRgb.
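
As an illustration of what a conversion such as RgbToGray in the list above computes, here is a scalar sketch using the common BT.601 luma weights in fixed-point arithmetic. The weights, the 14-bit precision and the names RgbPixelToGray and RgbRowToGray are assumptions made for this example; the notes do not state which coefficients or rounding the library uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed BT.601 luma weights in 14-bit fixed point.
// They sum to exactly 1 << 14, so white maps to 255.
constexpr int kShift = 14;
constexpr int kRed = 4899;    // ~0.299
constexpr int kGreen = 9617;  // ~0.587
constexpr int kBlue = 1868;   // ~0.114

// Hypothetical helper: weighted sum of one pixel, rounded to nearest.
inline uint8_t RgbPixelToGray(uint8_t r, uint8_t g, uint8_t b)
{
    const int rounding = 1 << (kShift - 1);
    return static_cast<uint8_t>((r * kRed + g * kGreen + b * kBlue + rounding) >> kShift);
}

// Converts one row of packed 24-bit RGB pixels to 8-bit gray.
inline void RgbRowToGray(const uint8_t* rgb, size_t width, uint8_t* gray)
{
    assert(rgb != nullptr && gray != nullptr);
    for (size_t x = 0; x < width; ++x)
        gray[x] = RgbPixelToGray(rgb[3 * x + 0], rgb[3 * x + 1], rgb[3 * x + 2]);
}
```

A loop of this shape is the kind of scalar baseline that SSSE3, AVX2, AVX-512BW or NEON variants would vectorize; the fixed-point form avoids per-pixel floating-point conversions.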
Improving
- NEON optimization of function BgrToGray.
Bug fixing
- Error in class SynetConvolution8i (group != 1).
- Wrong assert condition in SSE2, AVX, AVX2, AVX-512F and NEON optimization of class Convolution32fNhwcDirect.
- Compiler error when SIMD_AVX2_DISABLE macro is uncommented.
- Int32 overflow in function SynetConvolution8i::SetParams.
Test framework
New features
- Tests for verifying functionality of function RgbToGray.
- Tests for verifying functionality of function RgbToBgra.
- Tests for verifying functionality of function BgraToRgb.
- Tests for verifying functionality of function Yuv444pToRgb.
- Tests for verifying functionality of function Yuv422pToRgb.
- Tests for verifying functionality of function Yuv420pToRgb.
May 4, 2020 (version 4.6.89)
Algorithms
Bug fixing
- Microsoft Visual Studio 2013 compiler errors in files: SimdSynetConvolution8i.h, SimdSse2SynetConvolution32f.cpp, SimdAvx2Reduce.cpp.
- Buffer overrun in SSE4.1, AVX2, NEON optimizations of SynetConvolution8iNhwcDirect class.
- Visual Studio 2017 internal compiler error in function Avx512f::ConvolutionBiasAndActivation (Win32/Release).
- Compiler error in NEON optimization of class SynetConvolution8iNhwcDirect (ARM, 32-bit).
- Error in AVX2 optimization of function SynetScaleLayerForward.
- Error in base implementation of SquaredDifferenceKahanSum32f (Visual Studio 2017).
- Error in AVX-512BW optimization of class SynetConvolution8iNhwcDirect (Visual Studio 2017/2019, Release).
- Error in class SynetConvolution32fNhwcDirect (large parameters srcC and dstC).
Test framework
Bug fixing
- Microsoft Visual Studio 2013 compiler errors in files: TestTensor.h, TestSynetActivation.cpp.
- Test report is not generated if the output directory does not exist.
- Error in test SynetConvert32fTo8uAutoTest.
Infrastructure
New features
- Script to test Simd compiled with different versions of Microsoft Visual Studio.
- New structure of Microsoft Visual Studio 2019 project files.
Removing
- Remove project files of Microsoft Visual Studio 2012.
April 1, 2020 (version 4.6.88)
Algorithms
New features
- AVX-512VNNI extension support.
- AVX2, AVX-512BW, AVX-512VNNI and NEON optimizations of SynetConvolution8iNhwcDirect class.
- Base implementation and SSE4.1, AVX2, AVX-512BW and NEON optimizations of function SynetPoolingForwardMax8u.
Renaming
- SynetPoolingForwardMax to SynetPoolingForwardMax32f.
Improving
- SSE4.1 optimization of SynetConvolution8iNhwcDirect class.
- SSE2, AVX, AVX2, AVX-512F and NEON optimizations of SynetConvolution32fNhwcDirect class.
Bug fixing
- Microsoft Visual Studio 2015 compiler error in function SynetConvert32fTo8u.
- Degradation of performance of AVX2 code.
- Microsoft Visual Studio compiler error in function Extract64i (32-bit mode).
Test framework
New features
- Tests for verifying functionality of function SynetPoolingForwardMax8u.
March 2, 2020 (version 4.5.87)
Algorithms
New features
- Add a parameter for bitwise compatibility of function SynetScaleLayerForward with Inference Engine.
- Add parameter 'type' to function SynetShuffleLayerForward.
- Base implementation, SSE2, AVX2, AVX-512BW and NEON optimizations of function SynetConvert32fTo8u.
- SimdSynetCompatibilityType enumeration.
- Base implementation of SynetConvolution8iGemmNN class.
- Base implementation and SSE4.1 optimization of SynetConvolution8iNhwcDirect class.
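
The SynetConvert32fTo8u entry above concerns converting 32-bit float data to unsigned 8-bit values for 8-bit inference. A generic way to do this is affine quantization, sketched below. The single scale and shift pair, the rounding mode and the names QuantizeTo8u and QuantizeArrayTo8u are assumptions for illustration only; the actual function's parameters and compatibility modes are not described in these notes.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: maps a float to [0, 255] as round(value * scale + shift).
// The value is clamped before rounding so that std::lround never overflows.
inline uint8_t QuantizeTo8u(float value, float scale, float shift)
{
    float t = value * scale + shift;
    t = std::min(255.0f, std::max(0.0f, t));
    return static_cast<uint8_t>(std::lround(t));
}

// Applies the same scale and shift to every element of an array.
// Assumption: one scale/shift pair for the whole array; a per-channel
// variant would index scale and shift by channel instead.
inline void QuantizeArrayTo8u(const float* src, size_t size, float scale, float shift, uint8_t* dst)
{
    assert(src != nullptr && dst != nullptr);
    for (size_t i = 0; i < size; ++i)
        dst[i] = QuantizeTo8u(src[i], scale, shift);
}
```

Values that fall outside the representable range saturate at 0 or 255 rather than wrapping around, which is the usual choice for quantized inference.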
Renaming
- SimdSynetConvertImage to SimdSynetReorderImage.
- SimdSynetConvertFilter to SimdSynetReorderFilter.
Test framework
New features
- New command-line test parameter -c: the number of channels in the test image for performance testing.
- New command-line test parameter -mt: the minimal test execution time (in milliseconds).
- Tests for verifying functionality of SynetConvolution8i framework.
- Tests for verifying functionality of function SynetConvert32fTo8u.
Documentation
Bug fixing
- Error in description of method Detection::LoadStringXml.
February 3, 2020 (version 4.5.86)
Algorithms
New features
- SimdResizeMethodInferenceEngineInterp method in Resizer framework.
Improving
- Performance of Convolution32f framework (NHWC format, kernel=3x3, stride=1x1, large H and W).
- Performance of AVX-512F and NEON optimizations of function GemmPackA.
- Performance of Convolution32f framework (NHWC format, GemmNN method).
- Performance of SSE2, AVX, AVX2, AVX-512F and NEON optimizations of Convolution32f framework (NHWC format, NhwcDirect method, kernel=1x1).
- Performance of AVX-512F optimization of MergedConvolution32f framework (input convolution).
- Performance of AVX2 and AVX-512F optimizations of MergedConvolution32f framework (output convolution).
- Performance of Convolution32f framework (stride > 1).
- Performance of AVX-512F optimization of Gemm32fNN function (add 6x64 and 6x48 micro kernels).
Bug fixing
- Error in AVX-512F optimization of function WinogradKernel3x3Block2x2SetOutput (NCHW format).
- Error in SSE, AVX, AVX-512F and NEON optimizations of function SynetPoolingForwardAverage (NHWC format).
- Error in AVX-512F optimization of function SynetInnerProductLayerForward.
- Error in AVX, AVX2 and AVX-512F optimizations of function Gemm32fNT.
- Error in function WinogradKernel3x3Block4x4SetInput (padX != padY != padW != padH).
- Error in debug FLOPS annotation of Deconvolution32f framework.
- MergedConvolution32f framework doesn't work with stride == 3.
January 3, 2020 (version 4.5.85)
Algorithms
New features
- Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetUnaryOperation32fLayerForward.
- Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetSoftplus32f.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block2x2SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block2x2SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block2x2SetOutput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block4x4SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block4x4SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel2x2Block4x4SetOutput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x3Block1x4SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x3Block1x4SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x3Block1x4SetOutput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x5Block1x4SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x5Block1x4SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function WinogradKernel1x5Block1x4SetOutput.
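
For readers unfamiliar with the activation behind the SynetSoftplus32f entry above, softplus is a smooth approximation of ReLU, softplus(x) = log(1 + exp(x)). The scalar sketch below uses the common generalized form with a beta factor and a threshold above which the function is treated as linear. Both parameters and the name Softplus are assumptions borrowed from that common convention, not a description of the library function's signature.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: softplus(x) = log(1 + exp(beta * x)) / beta.
// For beta * x above the threshold, exp() would be huge and the result is
// numerically indistinguishable from x, so x is returned directly.
inline float Softplus(float x, float beta, float threshold)
{
    assert(beta > 0.0f);
    float scaled = beta * x;
    if (scaled > threshold)
        return x;
    return std::log1p(std::exp(scaled)) / beta;
}

// Applies Softplus element-wise to an array.
inline void SoftplusArray(const float* src, size_t size, float beta, float threshold, float* dst)
{
    assert(src != nullptr && dst != nullptr);
    for (size_t i = 0; i < size; ++i)
        dst[i] = Softplus(src[i], beta, threshold);
}
```

Using log1p rather than log(1 + ...) keeps precision for large negative inputs, where exp(beta * x) is tiny compared to 1.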
Improving
- Performance of Convolution32f framework (NHWC format, kernel=1x1x1).
- Performance of Convolution32f framework (NHWC format, kernel=2x2).
- Performance of Convolution32f framework (NHWC format, kernel=1x3).
- Performance of Convolution32f framework (NHWC format, kernel=1x5).
Renaming
- NeuralSigmoid to SynetSigmoid32f.
- NeuralTanh to SynetTanh32f.
- NeuralRelu to SynetRelu32f.
- Winograd2x3SetFilter to WinogradKernel3x3Block2x2SetFilter.
- Winograd2x3SetInput to WinogradKernel3x3Block2x2SetInput.
- Winograd2x3SetOutput to WinogradKernel3x3Block2x2SetOutput.
- Winograd3x3SetFilter to WinogradKernel3x3Block3x3SetFilter.
- Winograd3x3SetInput to WinogradKernel3x3Block3x3SetInput.
- Winograd3x3SetOutput to WinogradKernel3x3Block3x3SetOutput.
- Winograd4x4SetFilter to WinogradKernel3x3Block4x4SetFilter.
- Winograd4x4SetInput to WinogradKernel3x3Block4x4SetInput.
- Winograd4x4SetOutput to WinogradKernel3x3Block4x4SetOutput.
Bug fixing
- Error in Convolution32f framework (kernel greater than input size, NHWC format).
- Potential crash in ContourDetector.
Test framework
New features
- Tests for verifying functionality of function SynetUnaryOperation32fLayerForward.
- Tests for verifying functionality of function SynetSoftplus32f.
- Tests for verifying functionality of function WinogradKernel2x2Block2x2SetFilter.
- Tests for verifying functionality of function WinogradKernel2x2Block2x2SetInput.
- Tests for verifying functionality of function WinogradKernel2x2Block2x2SetOutput.
- Tests for verifying functionality of function WinogradKernel2x2Block4x4SetFilter.
- Tests for verifying functionality of function WinogradKernel2x2Block4x4SetInput.
- Tests for verifying functionality of function WinogradKernel2x2Block4x4SetOutput.
- Tests for verifying functionality of function WinogradKernel1x3Block1x4SetFilter.
- Tests for verifying functionality of function WinogradKernel1x3Block1x4SetInput.
- Tests for verifying functionality of function WinogradKernel1x3Block1x4SetOutput.
- Tests for verifying functionality of function WinogradKernel1x5Block1x4SetFilter.
- Tests for verifying functionality of function WinogradKernel1x5Block1x4SetInput.
- Tests for verifying functionality of function WinogradKernel1x5Block1x4SetOutput.