2025 |
 2024 |
 2023 |
 2022 |
 2021 |
 2020 |
 2019 |
 2018 |
 2017 |
 2016 |
 2015 |
 2014 |
 2013
 
December 2, 2019 (version 4.4.84)
Algorithms
New features
 - Method View::Clear.
 
 - Parameter makeCopy in method ShiftDetector::SetBackground.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetPoolingForwardAverage.
 
 
Improving
 - SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution32f framework.
 
 
Bug fixing
 - Crash when SIMD_PERFORMANCE_STATISTIC is defined.
 
 - Compiler warning in SSSE3 and AVX2 optimizations of Resizer.
 
 - Error in base implementation of function SquaredDifferenceKahanSum32f (Visual Studio 2019).
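
For readers unfamiliar with the technique behind SquaredDifferenceKahanSum32f, the following is a minimal sketch of compensated (Kahan) summation applied to a sum of squared differences. The function name and signature are hypothetical and are not the library's API; the sketch only illustrates the general algorithm. Compensated summation relies on the exact order of floating-point operations, so it is generally sensitive to compiler options that allow reassociation.

```cpp
#include <cstddef>

// Hypothetical illustration, not the library function:
// computes sum((a[i] - b[i])^2) with Kahan compensation.
inline float SquaredDifferenceKahanSumSketch(const float* a, const float* b, std::size_t size)
{
    float sum = 0.0f;          // running total
    float compensation = 0.0f; // low-order bits lost by previous additions
    for (std::size_t i = 0; i < size; ++i)
    {
        float d = a[i] - b[i];
        float term = d * d - compensation; // re-inject the previously lost bits
        float next = sum + term;           // low-order bits of term may be lost here
        compensation = (next - sum) - term; // recover what was lost
        sum = next;
    }
    return sum;
}
```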
 
 
Test framework
New features
 - Tests for verifying functionality of function SynetPoolingForwardAverage.
 
 
 
November 1, 2019 (version 4.4.83) 
Algorithms
New features
 - Base implementation, SSE4.1, AVX2, AVX-512BW and NEON optimizations of function SynetSetInput.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetHswish32f.
 
 - Support of Hswish activation function in Convolution32f framework.
 
 - Support of Hswish activation function in MergedConvolution32f framework.
 
 - Support of Hswish activation function in Deconvolution32f framework.
 
 - Support of 5x5 and 7x7 depthwise convolution in the middle layer of MergedConvolution32f framework.
 
 - Base implementation, SSE, AVX, AVX-512BW and NEON optimizations of function SynetShuffleLayerForward.
 
 - Base implementation, SSE2, AVX2, AVX-512BW and NEON optimizations of function GetObjectMoments.
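
As background for the Hswish entries above, here is a minimal scalar sketch of the commonly used hard-swish activation, x * clamp(x + 3, 0, 6) / 6. The constants 3 and 6 and the function names are assumptions chosen for illustration; the library function may take its shift and scale as parameters and is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical scalar reference: standard hard-swish with fixed constants.
inline float HardSwishSketch(float x)
{
    return x * std::min(std::max(x + 3.0f, 0.0f), 6.0f) / 6.0f;
}

// Applies the activation element-wise; src and dst may point to the same buffer.
inline void HardSwishForwardSketch(const float* src, std::size_t size, float* dst)
{
    for (std::size_t i = 0; i < size; ++i)
        dst[i] = HardSwishSketch(src[i]);
}
```

The function is zero for x <= -3 and equals x for x >= 3, which is why it can be fused into a convolution epilogue as a cheap replacement for swish.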
 
 
Improving
 - SSE2, AVX2, AVX-512BW and NEON optimizations of function GetObjectMoments.
 
 - NEON optimization of function Gemm32fNN.
 
 - NEON optimization of function Gemm32fNT.
 
 - NEON optimization of Convolution32f framework.
 
 - NEON optimization of MergedConvolution32f framework.
 
 - NEON optimization of Deconvolution32f framework.
 
 
Renaming
 - Function from SynetRestrictRange to SynetRestrictRange32f.
 
 
Bug fixing
 - GCC-4.9 compiler error in function Base::CpuCacheSize.
 
 - Error in SSE2 optimization of Resizer framework.
 
 
Test framework
New features
 - Tests for verifying functionality of function SynetSetInput.
 
 - Tests for verifying functionality of function SynetHswish32f.
 
 - Tests for verifying functionality of function SynetShuffleLayerForward.
 
 - Tests for verifying functionality of function GetObjectMoments.
 
 
Infrastructure
Bug fixing
 - Missing file Prop.props for Microsoft Visual Studio 2019.
 
 
 
October 1, 2019 (version 4.4.82) 
Algorithms
New features
 - View::Clone method (it creates a clone based on an external buffer).
 
 - Function Simd::PrintInfo.
 
 - SynetDeconvolution32f Framework.
 
 - Base implementation, SSE2, AVX, AVX2, AVX-512F and NEON optimizations of SynetDeconvolution32fGemmNN class.
 
 - Base implementation, SSE2, AVX, AVX2, AVX-512F and NEON optimizations of SynetDeconvolution32fNhwcDirect2x2 class.
 
 
Improving
 - Now CpuInfo gets L1D, L2, L3 cache sizes, numbers of sockets, cpus and threads.
 
 
Renaming
 - Function from ConvolutionInit to SynetConvolution32fInit.
 
 - Function from ConvolutionExternalBufferSize to SynetConvolution32fExternalBufferSize.
 
 - Function from ConvolutionInternalBufferSize to SynetConvolution32fInternalBufferSize.
 
 - Function from ConvolutionSetParams to SynetConvolution32fSetParams.
 
 - Function from ConvolutionForward to SynetConvolution32fForward.
 
 - Function from MergedConvolutionInit to SynetMergedConvolution32fInit.
 
 - Function from MergedConvolutionExternalBufferSize to SynetMergedConvolution32fExternalBufferSize.
 
 - Function from MergedConvolutionInternalBufferSize to SynetMergedConvolution32fInternalBufferSize.
 
 - Function from MergedConvolutionSetParams to SynetMergedConvolution32fSetParams.
 
 - Function from MergedConvolutionForward to SynetMergedConvolution32fForward.
 
 
Bug fixing
 - Error in Resizer framework (in file SimdBaseResizer.cpp).
 
 
Test framework
New features
 - Tests for verifying functionality of SynetDeconvolution32f Framework.
 
 
Infrastructure
New features
 - Project files for Microsoft Visual Studio 2019.
 
 
Bug fixing
 - Some Microsoft Visual Studio project properties could cause a program crash on old CPUs.
 
 - Use of AVX512 property instead of SIMD_AVX512 in CMakeLists.txt.
 
 
 
September 2, 2019 (version 4.3.81) 
Algorithms
New features
 - SimdTensorFormatNchwXc and SimdTensorFormatOyxiXo types in SimdTensorFormatType enumeration.
 
 - Function SynetSpecifyTensorFormat.
 
 - Function SynetTensorAlignment.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetAddBias.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetScaleLayerForward.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward0.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward1.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward2.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward3.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward4.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward8.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward9.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetLrnLayerCrossChannels.
 
 - Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetPreluLayerForward.
 
 - Support of P2(pgm) and P3(ppm) image formats in View::Load.
 
 - Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetElu32f.
 
 - Support of Elu activation function in Convolution framework.
 
 - Support of Elu activation function in MergedConvolution framework.
 
 - New meaning of add parameter in MergedConvolution framework.
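
To make the NCHW4c, NCHW8c and NCHW16c entries above more concrete, the sketch below computes the linear offset of an element in a channel-blocked NCHWxc tensor. It assumes the common convention in which the tensor is stored as [N][C/x][H][W][x], with the channel count rounded up to a whole number of blocks. The function name is hypothetical and the exact layout used by the library is an assumption here.

```cpp
#include <cstddef>

// Hypothetical helper: offset of element (n, c, h, w) in an NCHWxc layout,
// where 'block' is x (for example 4, 8 or 16).
inline std::size_t OffsetNchwXcSketch(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
    std::size_t channels, std::size_t height, std::size_t width, std::size_t block)
{
    std::size_t channelBlocks = (channels + block - 1) / block; // channels rounded up to blocks
    return (((n * channelBlocks + c / block) * height + h) * width + w) * block + c % block;
}
```

With this layout the x channels of one pixel are contiguous, so a single vector load (4, 8 or 16 floats) reads one block of channels.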
 
 
Improving
 - Performance measurement in Convolution and MergedConvolution frameworks.
 
 
Bug fixing
 - Error in function Convert (in file SimdFrame.hpp).
 
 - Error in function MergedConvolutionForward.
 
 
Test framework
New features
 - Tests for verifying functionality of function SynetAddBias for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetScaleLayerForward for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward0 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward1 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward2 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward3 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward4 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward8 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetFusedLayerForward9 for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetLrnLayerCrossChannels for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetPreluLayerForward for NCHW4c, NCHW8c, NCHW16c tensor formats.
 
 - Tests for verifying functionality of function SynetElu32f.
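
For reference, the ELU activation mentioned in this release can be sketched in scalar form as follows. This is the textbook definition with a hypothetical function name; it is not the library's signature, and the alpha parameter is passed by value here purely for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical scalar reference: ELU(x) = x for x >= 0, alpha * (exp(x) - 1) otherwise.
inline float EluSketch(float x, float alpha)
{
    return x >= 0.0f ? x : alpha * (std::exp(x) - 1.0f);
}

// Applies ELU element-wise; src and dst may point to the same buffer.
inline void EluForwardSketch(const float* src, std::size_t size, float alpha, float* dst)
{
    for (std::size_t i = 0; i < size; ++i)
        dst[i] = EluSketch(src[i], alpha);
}
```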
 
 
Infrastructure
Renaming
 - Parameter from AVX512 to SIMD_AVX512 in CMakeLists.txt.
 
 - Parameter from PRINT_INFO to SIMD_INFO in CMakeLists.txt.
 
 
 
August 1, 2019 (version 4.3.80) 
Algorithms
New features
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetFusedLayerForward8.
 
 - Partial batch merging in Convolution algorithm (Winograd and GemmNN methods).
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd3x3SetFilter.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd3x3SetInput.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd3x3SetOutput.
 
 - Winograd3x3 method in Convolution algorithm.
 
 - Runtime choice of best micro kernel in Convolution Framework (GemmNN and Winograd methods).
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetFusedLayerForward9.
 
 - SimdTensorFormatType enumeration.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetConvertImage.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetConvertFilter.
 
 
Improving
 - Performance profiling.
 
 - SSE, AVX, AVX2, AVX-512F and NEON optimizations of MergedConvolution framework.
 
 - SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution Framework (GemmNN and Winograd methods).
 
 
Bug fixing
 - Error in Convolution Framework (GemmNN method).
 
 - Low performance of NEON optimization in Convolution Framework (GemmNN and Winograd methods).
 
 - Crash in base implementation of functions FillPixel, FillBgra, FillUv (GCC, -O3).
 
 
Test framework
New features
 - Tests for verifying functionality of function SynetFusedLayerForward8.
 
 - Tests for verifying functionality of function Winograd3x3SetFilter.
 
 - Tests for verifying functionality of function Winograd3x3SetInput.
 
 - Tests for verifying functionality of function Winograd3x3SetOutput.
 
 - Special complex tests for verifying functionality of functions Winograd2x3SetFilter, Winograd2x3SetInput and Winograd2x3SetOutput.
 
 - Special complex tests for verifying functionality of functions Winograd3x3SetFilter, Winograd3x3SetInput and Winograd3x3SetOutput.
 
 - Special complex tests for verifying functionality of functions Winograd4x3SetFilter, Winograd4x3SetInput and Winograd4x3SetOutput.
 
 - Tests for verifying functionality of function SynetFusedLayerForward9.
 
 - Tests for verifying functionality of function SynetConvertImage.
 
 - Tests for verifying functionality of function SynetConvertFilter.
 
 
Infrastructure
New features
 - SIMD_PERF parameter in CMakeLists.txt.
 
 
Bug fixing
 - Visual Studio project build error (in file GetVersion.cmd).
 
 
 
July 2, 2019 (version 4.3.79)
Algorithms
New features
 - Additional macros for performance profiling.
 
 - Function SimdPerformanceStatistic.
 
 - Base implementation, SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution framework (NhwcDirect mode).
 
 
Improving
 - SSE, AVX, AVX2, AVX-512F and NEON optimizations of MergedConvolution framework.
 
 
Bug fixing
 - Error in function MergedConvolution::SetSize (Merged Convolution Framework).
 
 
 
June 3, 2019 (version 4.3.78) 
Algorithms
New features
 - SimdConvolutionParameters structure.
 
 - Base implementation, SSE, AVX, AVX2, AVX-512F and NEON optimizations of MergedConvolution framework (version 2).
 
 - Base implementation, AVX2 optimizations of function AbsDifference.
 
 - SSSE3 and NEON optimizations of function TransformImage (TransformTransposeRotate0 transformation).
 
 
Bug fixing
 - Error in Convolution framework (group != 1, NHWC mode).
 
 
Test framework
New features
 - Tests for verifying functionality of function AbsDifference.
 
 
Bug fixing
 - Compiler error in file TestResize.cpp (aarch64 toolchain).
 
 
 
May 2, 2019 (version 4.3.77) 
Algorithms
New features
 - Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetLrnLayerCrossChannels (NHWC mode).
 
 - Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function BgrToRgb.
 
 - Pixel::Rgb24 structure.
 
 - Base implementation of Resizer framework (area method, byte type).
 
 - SSE2, SSSE3, AVX2, AVX-512BW and NEON optimizations of Resizer framework (bilinear method, byte type).
 
 - SSE2, SSE4.1, AVX2, AVX-512BW and NEON optimizations of Resizer framework (area method, byte type).
 
 - Simd::Resize function.
 
 - Base implementation, SSE, AVX, AVX2 and AVX-512F optimizations of MergedConvolution framework.
 
 
Improving
 - AVX-512F optimization of Convolution framework.
 
 
Bug fixing
 - Error in SSE, AVX, AVX-512F and NEON optimizations of function Fill32f.
 
 - Out of range in SSE4.1, AVX2, AVX-512BW and NEON optimizations of functions DetectionHaarDetect32fp and DetectionHaarDetect32fi.
 
 - Out of range in SSE4.1, AVX2, AVX-512BW and NEON optimizations of functions DetectionLbpDetect32fp, DetectionLbpDetect32fi, DetectionLbpDetect16ip and DetectionLbpDetect16ii.
 
 - Error in AVX2, AVX-512BW and NEON optimizations of function CosineDistancesMxNa16f.
 
 - Error in AVX-512F optimization of Convolution framework.
 
 - Error in SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution framework (NHWC mode, depthwise convolution).
 
 - Error in AVX-512F optimization of Convolution framework (NHWC mode, winograd2x3 method).
 
 - Error in AVX-512F optimization of Convolution framework (function KernelHwcDefaultBody8).
 
 
Test framework
New features
 - Tests for verifying functionality of function SynetLrnLayerCrossChannels (NHWC mode).
 
 - Tests for verifying functionality of function BgrToRgb.
 
 - Tests for verifying functionality of MergedConvolution framework.
 
 
Infrastructure
Bug fixing
 - Compiler warning for GCC >= 7.0 (ARM target).
 
 
 
April 1, 2019 (version 4.3.76) 
Algorithms
New features
 - Base implementation, AVX2, AVX-512BW and NEON optimizations of function CosineDistancesMxNa16f.
 
 - Macro SIMD_FUTURE_DISABLE.
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd4x3SetInput (NHWC mode).
 
 - Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd4x3SetOutput (NHWC mode).
 
 - Support of Winograd4x3 for Convolution framework (NHWC mode).
 
 - Parameter 'batch' in Convolution framework.
 
 - Function ConvolutionInternalBufferSize.
 
 - Parameter trans instead of parameters srcT and dstT in function ConvolutionInit.
 
 
Improving
 - ConvolutionGemmNN method of Convolution framework (NHWC mode).
 
 - ConvolutionWinograd method of Convolution framework (NHWC mode).
 
 
Renaming
 - Function from GetFlushToZero to GetFastMode.
 
 - Function from SetFlushToZero to SetFastMode.
 
 - Function from ConvolutionBufferSize to ConvolutionExternalBufferSize.
 
 
Bug fixing
 - Compiler error (use of the name 'small', which can be a system macro) in file SimdSse2Statistic.cpp.
 
 - Compiler warning (unused variable) in function Neon::SetFlushToZero.
 
 - Compiler warning (unused variable) in function Base::ConvolutionBiasAndActivation.
 
 - Compiler error (Visual Studio for Android) in file SimdSsse3Transform.cpp.
 
 - Compiler error (Visual Studio for Android) in function SimdCosineDistance16f.
 
 - Low performance of function SimdSquaredDifferenceSum16f.
 
 - Compiler warning (unused variable) in function Neon::AlphaFilling.
 
 - Compiler warning (unused variable) in function Neon::Fill32f.
 
 - Compiler warning (wrong initialization order) in file SimdNeonGemm32f.cpp.
 
 - Compiler warning (unused variable) in function Neon::SynetInnerProductLayerForward.
 
 - Compiler warning (unused variable) in file TestConvolution.cpp.
 
 - Internal compiler error (G++ 6.3.0) in function Neon::BgrToBgra.
 
 - Compiler error (aarch64) in functions Neon::GetFlushToZero and Neon::SetFlushToZero.
 
 - Error in NEON optimization of function HogLiteFindMax7x7.
 
 - Denormals performance bug.
 
 - Error in NEON optimization of function ReduceGray2x2.
 
 
Test framework
New features
 - Tests for verifying functionality of function CosineDistancesMxNa16f.
 
 - Tests for verifying functionality of function Winograd4x3SetInput (NHWC mode).
 
 - Tests for verifying functionality of function Winograd4x3SetOutput (NHWC mode).
 
 
Bug fixing
 - Compiler error (Visual Studio for Android) in file TestFloat16.cpp.
 
 - Compiler warning (wrong initialization order) in file SimdNeonGemm32f.cpp.
 
 - Internal compiler error (G++ 4.9).
 
 
 
March 7, 2019 (version 4.3.75)
Algorithms
New features
 - Base implementation, SSE2, SSSE3, AVX2 and AVX-512BW optimizations of function BgraToYuva420p.
 
 - NEON optimization of function NeuralSigmoid.
 
 - NEON optimization of function NeuralTanh.
 
 - NEON optimization of function NeuralPow.
 
 - NEON version of functions GetFlushToZero and SetFlushToZero.
 
 - NEON optimization of function Fill32f.
 
 - NEON optimization of function AlphaFilling.
 
 - NEON optimization of function CosineDistance16f.
 
 - NEON optimization of function CosineDistance32f.
 
 - NEON optimization of function Gemm32fNN.
 
 - NEON optimization of function Gemm32fNT.
 
 - NEON optimization of function FillPixel.
 
 - NEON optimization of function ReduceColor2x2.
 
 - NEON optimization of function BayerToBgra.
 
 - NEON optimization of function BayerToBgr.
 
 - NEON optimization of function TransformImage.
 
 - NEON optimization of function BgraToYuva420p.
 
 - NEON optimization of function Yuva420pToBgra.
 
 - NEON optimization of function Resizer.
 
 - NEON optimization of function HogLiteFindMax7x7.
 
 - NEON optimization of function HogLiteCreateMask.
 
 - NEON optimization of function HogLiteFilterSeparable.
 
 - NEON optimization of function HogLiteCompressFeatures.
 
 - NEON optimization of function HogLiteResizeFeatures.
 
 - NEON optimization of function HogLiteFilterFeatures.
 
 - NEON optimization of function HogLiteExtractFeatures.
 
 - NEON optimization of function Winograd2x3SetFilter.
 
 - NEON optimization of function Winograd4x3SetFilter.
 
 - NEON optimization of function Winograd2x3SetInput.
 
 - NEON optimization of function Winograd2x3SetOutput.
 
 - NEON optimization of function SynetAddBias.
 
 - NEON optimization of function SynetEltwiseLayerForward.
 
 - NEON optimization of function SynetPoolingForwardMax.
 
 - NEON optimization of function SynetFusedLayerForward0.
 
 - NEON optimization of function SynetFusedLayerForward1.
 
 - NEON optimization of function SynetFusedLayerForward2.
 
 - NEON optimization of function SynetFusedLayerForward3.
 
 - NEON optimization of function SynetFusedLayerForward4.
 
 - NEON optimization of function SynetInnerProductLayerForward.
 
 - NEON optimization of function SynetLrnLayerCrossChannels.
 
 - NEON optimization of function SynetPreluLayerForward.
 
 - NEON optimization of function SynetRestrictRange.
 
 - NEON optimization of function SynetScaleLayerForward.
 
 - NEON optimization of function SynetSoftmaxLayerForward.
 
 - NEON optimization of function ConvolutionForward.
 
 
Improving
 - AVX, AVX2 and AVX-512F optimizations of function ConvolutionForward.
 
 - SSE, AVX, AVX2 and AVX-512F optimizations of function Resizer.
 
 
Bug fixing
 - Error in AVX-512BW optimization of function ChangeColors.
 
 - Error in AVX-512BW optimization of function NormalizeHistogram.
 
 - Error in AVX-512F optimization of function NeuralConvolutionForward.
 
 - Error in NEON optimization of function Uint8ToFloat32.
 
 - Error in NEON optimization of function SquaredDifferenceSum16f.
 
 - Error in SSE version of function GetFlushToZero.
 
 - Error in Base implementation of function SynetFusedLayerForward0.
 
 
Test framework
New features
 - Tests for verifying functionality of function BgraToYuva420p.
 
 - Tests for verifying NEON optimization of function NeuralSigmoid.

 - Tests for verifying NEON optimization of function NeuralTanh.

 - Tests for verifying NEON optimization of function NeuralPow.

 - Tests for verifying NEON optimization of function Fill32f.

 - Tests for verifying NEON optimization of function AlphaFilling.

 - Tests for verifying NEON optimization of function CosineDistance16f.

 - Tests for verifying NEON optimization of function CosineDistance32f.

 - Tests for verifying NEON optimization of function Gemm32fNN.

 - Tests for verifying NEON optimization of function Gemm32fNT.

 - Tests for verifying NEON optimization of function FillPixel.

 - Tests for verifying NEON optimization of function ReduceColor2x2.

 - Tests for verifying NEON optimization of function BayerToBgra.

 - Tests for verifying NEON optimization of function BayerToBgr.

 - Tests for verifying NEON optimization of function TransformImage.

 - Tests for verifying NEON optimization of function BgraToYuva420p.

 - Tests for verifying NEON optimization of function Yuva420pToBgra.

 - Tests for verifying NEON optimization of function Resizer.

 - Tests for verifying NEON optimization of function HogLiteFindMax7x7.

 - Tests for verifying NEON optimization of function HogLiteCreateMask.

 - Tests for verifying NEON optimization of function HogLiteFilterSeparable.

 - Tests for verifying NEON optimization of function HogLiteCompressFeatures.

 - Tests for verifying NEON optimization of function HogLiteResizeFeatures.

 - Tests for verifying NEON optimization of function HogLiteFilterFeatures.

 - Tests for verifying NEON optimization of function HogLiteExtractFeatures.

 - Tests for verifying NEON optimization of function Winograd2x3SetFilter.

 - Tests for verifying NEON optimization of function Winograd4x3SetFilter.

 - Tests for verifying NEON optimization of function Winograd2x3SetInput.

 - Tests for verifying NEON optimization of function Winograd2x3SetOutput.

 - Tests for verifying NEON optimization of function SynetAddBias.

 - Tests for verifying NEON optimization of function SynetEltwiseLayerForward.

 - Tests for verifying NEON optimization of function SynetPoolingForwardMax.

 - Tests for verifying NEON optimization of function SynetFusedLayerForward0.

 - Tests for verifying NEON optimization of function SynetFusedLayerForward1.

 - Tests for verifying NEON optimization of function SynetFusedLayerForward2.

 - Tests for verifying NEON optimization of function SynetFusedLayerForward3.

 - Tests for verifying NEON optimization of function SynetFusedLayerForward4.

 - Tests for verifying NEON optimization of function SynetInnerProductLayerForward.

 - Tests for verifying NEON optimization of function SynetLrnLayerCrossChannels.

 - Tests for verifying NEON optimization of function SynetPreluLayerForward.

 - Tests for verifying NEON optimization of function SynetRestrictRange.

 - Tests for verifying NEON optimization of function SynetScaleLayerForward.

 - Tests for verifying NEON optimization of function SynetSoftmaxLayerForward.

 - Tests for verifying NEON optimization of function ConvolutionForward.
 
 
Bug fixing
 - Error (on 32-bit OS) in test of function HogLiteFindMax7x7.
 
 
 
February 1, 2019 (version 4.2.74) 
Algorithms
New features
 - Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd2x3SetFilter (NHWC mode).

 - Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd4x3SetFilter (NHWC mode).

 - Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd2x3SetInput (NHWC mode).

 - Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd2x3SetOutput (NHWC mode).
 
 - Parameter gemm (a pointer to external function of matrix multiplication) in function ConvolutionInit.
 
 - Choice of the best gemm function at runtime.

 - SIMD_RUNTIME_GEMM_STATISTIC macro (annotation of runtime choice of gemm).

 - Base implementation, SSE, AVX, AVX2 and AVX-512F optimizations of function SynetPoolingForwardMax.

 - Base implementation, SSE, AVX and AVX-512F optimizations of function FusedLayerForward4.
 
 - Base implementation, SSE2, AVX2 and AVX-512F optimizations of function SynetSoftmaxForward.
 
 - Base implementation, SSE2, AVX2 and AVX-512BW optimizations of function Yuva420pToBgra.
 
 - Base implementation, SSSE3 optimization of function TransformImage.
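
As an illustration of what a softmax forward pass such as the one listed above computes, the following scalar sketch applies the usual numerically stable formulation (subtracting the maximum before exponentiation) to a single vector. The function name and the one-dimensional signature are hypothetical; the library function may operate over additional dimensions that are not shown here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical scalar reference: dst[i] = exp(src[i] - max) / sum_j exp(src[j] - max).
inline void SoftmaxSketch(const float* src, std::size_t size, float* dst)
{
    if (size == 0)
        return;
    // Subtracting the maximum keeps exp() from overflowing for large inputs.
    float maxValue = *std::max_element(src, src + size);
    float sum = 0.0f;
    for (std::size_t i = 0; i < size; ++i)
    {
        dst[i] = std::exp(src[i] - maxValue);
        sum += dst[i];
    }
    for (std::size_t i = 0; i < size; ++i)
        dst[i] /= sum;
}
```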
 
 
Improving
 - SSE, AVX, AVX2 and AVX-512F optimizations of function ConvolutionForward.
 
 
Removing
 - Function Winograd2x3iSetInput.
 
 - Function Winograd2x3iSetOutput.
 
 
Bug fixing
 - Error in AVX-512F optimization of function ConvolutionDirectHwcConvolutionBiasActivationDefault.
 
 
Test framework
New features
 - Tests for verifying functionality of function Winograd2x3SetFilter (NHWC mode).
 
 - Tests for verifying functionality of function Winograd4x3SetFilter (NHWC mode).
 
 - Tests for verifying functionality of function Winograd2x3SetInput (NHWC mode).
 
 - Tests for verifying functionality of function Winograd2x3SetOutput (NHWC mode).
 
 - Printing of internal performance statistic.
 
 - Tests for verifying functionality of function SynetPoolingForwardMax.
 
 - Tests for verifying functionality of function FusedLayerForward4.
 
 - Tests for verifying functionality of function SynetSoftmaxForward.
 
 - Tests for verifying functionality of function Yuva420pToBgra.
 
 - Tests for verifying functionality of function TransformImage.
 
 
Infrastructure
Bug fixing
 - The input variable CMAKE_CXX_FLAGS can contain invalid options (-mtune=native, -march=haswell, -mavx, etc.).
 
 
  
January 2, 2019 (version 4.2.73) 
Algorithms
New features
 - Base implementation, SSE, AVX and AVX-512F optimizations of function FusedLayerForward3.
 
 - Base implementation, SSE, AVX and AVX-512F optimizations of function ConvolutionBiasAndActivation (NHWC mode).
 
 
Improving
 - SSE, AVX, AVX2 and AVX-512F optimizations of function Gemm32fNN.
 
 - Add output parameter 'internal' to function ConvolutionSetWeight.
 
 
Bug fixing
 - Wrong assert condition in AVX-512F optimization of function NeuralRelu.
 
 - Visual Studio 2017 compiler error (intrinsic _mm512_maskz_loadu_epi8 in Release mode).
 
 - Crash: reading of unaligned memory in AVX-512BW optimization of function HogLiteFilterFeatures.
 
 - Performance bug in functions SynetAddBias, SynetFusedLayerForwardX, SynetPreluLayerForward and SynetScaleLayerForward when (count = 1, trans = 1).
 
 
Test framework
New features
 - Tests for verifying functionality of function FusedLayerForward3.
 
 
  