2024 | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013
December 2, 2019 (version 4.4.84)
Algorithms
New features
- Method View::Clear.
- Parameter makeCopy in method ShiftDetector::SetBackground.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetPoolingForwardAverage.
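For readers unfamiliar with average pooling, the following is a minimal scalar sketch of the operation for a single channel. The helper name AveragePool2D and its parameters are hypothetical and chosen for illustration only; the signature of SynetPoolingForwardAverage (padding, tensor format and so on) is not reproduced here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the library API: average pooling over one
// channel of size srcH x srcW with a square kernel and no padding.
// Assumes kernel <= srcH, kernel <= srcW and stride > 0.
std::vector<float> AveragePool2D(const std::vector<float>& src,
                                 std::size_t srcH, std::size_t srcW,
                                 std::size_t kernel, std::size_t stride)
{
    const std::size_t dstH = (srcH - kernel) / stride + 1;
    const std::size_t dstW = (srcW - kernel) / stride + 1;
    std::vector<float> dst(dstH * dstW);
    for (std::size_t dy = 0; dy < dstH; ++dy)
    {
        for (std::size_t dx = 0; dx < dstW; ++dx)
        {
            // Sum the kernel x kernel window that starts at (dy*stride, dx*stride).
            float sum = 0.0f;
            for (std::size_t ky = 0; ky < kernel; ++ky)
                for (std::size_t kx = 0; kx < kernel; ++kx)
                    sum += src[(dy * stride + ky) * srcW + (dx * stride + kx)];
            dst[dy * dstW + dx] = sum / float(kernel * kernel);
        }
    }
    return dst;
}
```

With a 4x4 input, kernel 2 and stride 2, each output value is the mean of one non-overlapping 2x2 block.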
Improving
- SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution32f framework.
Bug fixing
- Crash when SIMD_PERFORMANCE_STATISTIC is defined.
- Compiler warning in SSSE3 and AVX2 optimizations of Resizer.
- Error in base implementation of function SquaredDifferenceKahanSum32f (Visual Studio 2019).
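As background for the SquaredDifferenceKahanSum32f entry, here is a minimal sketch of Kahan (compensated) summation applied to squared differences. The function name and signature below are illustrative assumptions, not the library's actual code, and the sketch does not claim to reproduce the fixed error.

```cpp
#include <cstddef>

// Illustrative sketch (assumed name and signature): sum of (a[i] - b[i])^2
// accumulated with Kahan compensation to reduce rounding error.
float SquaredDifferenceKahanSum(const float* a, const float* b, std::size_t size)
{
    float sum = 0.0f;
    float correction = 0.0f; // running estimate of the lost low-order bits
    for (std::size_t i = 0; i < size; ++i)
    {
        const float d = a[i] - b[i];
        const float term = d * d - correction; // re-inject the previous error
        const float next = sum + term;         // low-order bits of term may be lost here
        correction = (next - sum) - term;      // recover what was lost
        sum = next;
    }
    return sum;
}
```

Note that the compensation step is algebraically zero, so fast-math style compiler options may remove it; code of this kind usually needs strict floating-point semantics.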
Test framework
New features
- Tests for verifying functionality of function SynetPoolingForwardAverage.
November 1, 2019 (version 4.4.83)
Algorithms
New features
- Base implementation, SSE4.1, AVX2, AVX-512BW and NEON optimizations of function SynetSetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetHswish32f.
- Support of Hswish activation function in Convolution32f framework.
- Support of Hswish activation function in MergedConvolution32f framework.
- Support of Hswish activation function in Deconvolution32f framework.
- Support of 5x5 and 7x7 depthwise convolution in the middle layer of MergedConvolution32f framework.
- Base implementation, SSE, AVX, AVX-512BW and NEON optimizations of function SynetShuffleLayerForward.
- Base implementation, SSE2, AVX2, AVX-512BW and NEON optimizations of function GetObjectMoments.
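The Hswish activation mentioned above is commonly defined as x * ReLU6(x + 3) / 6. The sketch below implements that common definition in scalar form; the name Hswish32fReference is hypothetical, and the parameterization of SynetHswish32f itself may differ.

```cpp
#include <algorithm>
#include <cstddef>

// Scalar reference sketch of the common h-swish definition:
//   hswish(x) = x * min(max(x + 3, 0), 6) / 6
// The name and signature are assumptions made for illustration.
void Hswish32fReference(const float* src, std::size_t size, float* dst)
{
    for (std::size_t i = 0; i < size; ++i)
    {
        const float relu6 = std::min(std::max(src[i] + 3.0f, 0.0f), 6.0f);
        dst[i] = src[i] * relu6 / 6.0f;
    }
}
```

For inputs at or below -3 the result is zero, and for inputs at or above 3 the function is the identity.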
Improving
- SSE2, AVX2, AVX-512BW and NEON optimizations of function GetObjectMoments.
- NEON optimization of function Gemm32fNN.
- NEON optimization of function Gemm32fNT.
- NEON optimization of Convolution32f framework.
- NEON optimization of MergedConvolution32f framework.
- NEON optimization of Deconvolution32f framework.
Renaming
- Function from SynetRestrictRange to SynetRestrictRange32f.
Bug fixing
- GCC-4.9 compiler error in function Base::CpuCacheSize.
- Error in SSE2 optimization of Resizer framework.
Test framework
New features
- Tests for verifying functionality of function SynetSetInput.
- Tests for verifying functionality of function SynetHswish32f.
- Tests for verifying functionality of function SynetShuffleLayerForward.
- Tests for verifying functionality of function GetObjectMoments.
Infrastructure
Bug fixing
- Missing file Prop.props for Microsoft Visual Studio 2019.
October 1, 2019 (version 4.4.82)
Algorithms
New features
- View::Clone method (it creates a clone based on an external buffer).
- Function Simd::PrintInfo.
- SynetDeconvolution32f Framework.
- Base implementation, SSE2, AVX, AVX2, AVX-512F and NEON optimizations of SynetDeconvolution32fGemmNN class.
- Base implementation, SSE2, AVX, AVX2, AVX-512F and NEON optimizations of SynetDeconvolution32fNhwcDirect2x2 class.
Improving
- Now CpuInfo gets L1D, L2, L3 cache sizes, numbers of sockets, cpus and threads.
Renaming
- Function from ConvolutionInit to SynetConvolution32fInit.
- Function from ConvolutionExternalBufferSize to SynetConvolution32fExternalBufferSize.
- Function from ConvolutionInternalBufferSize to SynetConvolution32fInternalBufferSize.
- Function from ConvolutionSetParams to SynetConvolution32fSetParams.
- Function from ConvolutionForward to SynetConvolution32fForward.
- Function from MergedConvolutionInit to SynetMergedConvolution32fInit.
- Function from MergedConvolutionExternalBufferSize to SynetMergedConvolution32fExternalBufferSize.
- Function from MergedConvolutionInternalBufferSize to SynetMergedConvolution32fInternalBufferSize.
- Function from MergedConvolutionSetParams to SynetMergedConvolution32fSetParams.
- Function from MergedConvolutionForward to SynetMergedConvolution32fForward.
Bug fixing
- Error in Resizer framework (in file SimdBaseResizer.cpp).
Test framework
New features
- Tests for verifying functionality of SynetDeconvolution32f Framework.
Infrastructure
New features
- Project files for Microsoft Visual Studio 2019.
Bug fixing
- Some Microsoft Visual Studio project properties could cause a program crash on old CPUs.
- Use of AVX512 property instead of SIMD_AVX512 in CMakeLists.txt.
September 2, 2019 (version 4.3.81)
Algorithms
New features
- SimdTensorFormatNchwXc and SimdTensorFormatOyxiXo types in SimdTensorFormatType enumeration.
- Function SynetSpecifyTensorFormat.
- Function SynetTensorAlignment.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetAddBias.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetScaleLayerForward.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward0.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward1.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward2.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward3.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward4.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward8.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetFusedLayerForward9.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetLrnLayerCrossChannels.
- Support of NCHW4c, NCHW8c, NCHW16c formats in function SynetPreluLayerForward.
- Support of P2 (pgm) and P3 (ppm) image formats in View::Load.
- Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetElu32f.
- Support of Elu activation function in Convolution framework.
- Support of Elu activation function in MergedConvolution framework.
- New meaning of add parameter in MergedConvolution framework.
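The NCHW4c, NCHW8c and NCHW16c names above are assumed here to follow the usual blocked-layout convention, in which channels are split into blocks of X values and the position inside a block becomes the innermost dimension. Under that assumption, a conversion from plain NCHW can be sketched as follows; NchwToNchwXc is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: repack an NCHW tensor into a blocked NCHWXc layout
// (X = 4, 8 or 16). Channels are padded with zeros up to a multiple of X.
std::vector<float> NchwToNchwXc(const std::vector<float>& src,
                                std::size_t N, std::size_t C,
                                std::size_t H, std::size_t W, std::size_t X)
{
    const std::size_t blocks = (C + X - 1) / X; // number of channel blocks
    std::vector<float> dst(N * blocks * H * W * X, 0.0f);
    for (std::size_t n = 0; n < N; ++n)
        for (std::size_t c = 0; c < C; ++c)
            for (std::size_t h = 0; h < H; ++h)
                for (std::size_t w = 0; w < W; ++w)
                {
                    const std::size_t srcIdx = ((n * C + c) * H + h) * W + w;
                    const std::size_t dstIdx =
                        ((((n * blocks + c / X) * H + h) * W + w) * X) + c % X;
                    dst[dstIdx] = src[srcIdx];
                }
    return dst;
}
```

In the blocked layout, the X channel values for one pixel are contiguous, which is convenient for loading them into a single vector register.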
Improving
- Performance measurement in Convolution and MergedConvolution frameworks.
Bug fixing
- Error in function Convert (in file SimdFrame.hpp).
- Error in function MergedConvolutionForward.
Test framework
New features
- Tests for verifying functionality of function SynetAddBias for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetScaleLayerForward for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward0 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward1 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward2 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward3 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward4 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward8 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetFusedLayerForward9 for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetLrnLayerCrossChannels for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetPreluLayerForward for NCHW4c, NCHW8c, NCHW16c tensor formats.
- Tests for verifying functionality of function SynetElu32f.
Infrastructure
Renaming
- Parameter from AVX512 to SIMD_AVX512 in CMakeLists.txt.
- Parameter from PRINT_INFO to SIMD_INFO in CMakeLists.txt.
August 1, 2019 (version 4.3.80)
Algorithms
New features
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetFusedLayerForward8.
- Partial batch merging in Convolution algorithm (Winograd and GemmNN methods).
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd3x3SetFilter.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd3x3SetInput.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd3x3SetOutput.
- Winograd3x3 method in Convolution algorithm.
- Runtime choice of best micro kernel in Convolution Framework (GemmNN and Winograd methods).
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetFusedLayerForward9.
- SimdTensorFormatType enumeration.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetConvertImage.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function SynetConvertFilter.
Improving
- Performance profiling.
- SSE, AVX, AVX2, AVX-512F and NEON optimizations of MergedConvolution framework.
- SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution Framework (GemmNN and Winograd methods).
Bug fixing
- Error in Convolution Framework (GemmNN method).
- Low performance of NEON optimization in Convolution Framework (GemmNN and Winograd methods).
- Crash in base implementation of functions FillPixel, FillBgra, FillUv (GCC, -O3).
Test framework
New features
- Tests for verifying functionality of function SynetFusedLayerForward8.
- Tests for verifying functionality of function Winograd3x3SetFilter.
- Tests for verifying functionality of function Winograd3x3SetInput.
- Tests for verifying functionality of function Winograd3x3SetOutput.
- Special complex tests for verifying functionality of functions Winograd2x3SetFilter, Winograd2x3SetInput and Winograd2x3SetOutput.
- Special complex tests for verifying functionality of functions Winograd3x3SetFilter, Winograd3x3SetInput and Winograd3x3SetOutput.
- Special complex tests for verifying functionality of functions Winograd4x3SetFilter, Winograd4x3SetInput and Winograd4x3SetOutput.
- Tests for verifying functionality of function SynetFusedLayerForward9.
- Tests for verifying functionality of function SynetConvertImage.
- Tests for verifying functionality of function SynetConvertFilter.
Infrastructure
New features
- SIMD_PERF parameter in CMakeLists.txt.
Bug fixing
- Visual Studio project build error (in file GetVersion.cmd).
July 2, 2019 (version 4.3.79)
Algorithms
New features
- Additional macros for performance profiling.
- Function SimdPerformanceStatistic.
- Base implementation, SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution framework (NhwcDirect mode).
Improving
- SSE, AVX, AVX2, AVX-512F and NEON optimizations of MergedConvolution framework.
Bug fixing
- Error in function MergedConvolution::SetSize (Merged Convolution Framework).
June 3, 2019 (version 4.3.78)
Algorithms
New features
- SimdConvolutionParameters structure.
- Base implementation, SSE, AVX, AVX2, AVX-512F and NEON optimizations of MergedConvolution framework (version 2).
- Base implementation, AVX2 optimizations of function AbsDifference.
- SSSE3 and NEON optimizations of function TransformImage (TransformTransposeRotate0 transformation).
Bug fixing
- Error in Convolution framework (group != 1, NHWC mode).
Test framework
New features
- Tests for verifying functionality of function AbsDifference.
Bug fixing
- Compiler error in file TestResize.cpp (aarch64 toolchain).
May 2, 2019 (version 4.3.77)
Algorithms
New features
- Base implementation, SSE2, AVX2, AVX-512F and NEON optimizations of function SynetLrnLayerCrossChannels (NHWC mode).
- Base implementation, SSSE3, AVX2, AVX-512BW and NEON optimizations of function BgrToRgb.
- Pixel::Rgb24 structure.
- Base implementation of Resizer framework (area method, byte type).
- SSE2, SSSE3, AVX2, AVX-512BW and NEON optimizations of Resizer framework (bilinear method, byte type).
- SSE2, SSE4.1, AVX2, AVX-512BW and NEON optimizations of Resizer framework (area method, byte type).
- Simd::Resize function.
- Base implementation, SSE, AVX, AVX2 and AVX-512F optimizations of MergedConvolution framework.
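To illustrate what a BGR to RGB conversion does, the following scalar sketch swaps the first and third channel of every 24-bit pixel. The name BgrToRgbReference and its parameter list (including the row strides in bytes) are assumptions made for this example and should be checked against the library headers before relying on them.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative scalar sketch: convert a 24-bit BGR image to 24-bit RGB.
// Strides are given in bytes and may be larger than width * 3.
void BgrToRgbReference(const std::uint8_t* bgr, std::size_t width, std::size_t height,
                       std::size_t bgrStride, std::uint8_t* rgb, std::size_t rgbStride)
{
    for (std::size_t row = 0; row < height; ++row)
    {
        const std::uint8_t* src = bgr + row * bgrStride;
        std::uint8_t* dst = rgb + row * rgbStride;
        for (std::size_t col = 0; col < width; ++col)
        {
            dst[col * 3 + 0] = src[col * 3 + 2]; // R
            dst[col * 3 + 1] = src[col * 3 + 1]; // G
            dst[col * 3 + 2] = src[col * 3 + 0]; // B
        }
    }
}
```

The same routine also converts RGB back to BGR, since swapping the two outer channels is its own inverse.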
Improving
- AVX-512F optimization of Convolution framework.
Bug fixing
- Error in SSE, AVX, AVX-512F and NEON optimizations of function Fill32f.
- Out of range in SSE4.1, AVX2, AVX-512BW and NEON optimizations of functions DetectionHaarDetect32fp and DetectionHaarDetect32fi.
- Out of range in SSE4.1, AVX2, AVX-512BW and NEON optimizations of functions DetectionLbpDetect32fp, DetectionLbpDetect32fi, DetectionLbpDetect16ip and DetectionLbpDetect16ii.
- Error in AVX2, AVX-512BW and NEON optimizations of function CosineDistancesMxNa16f.
- Error in AVX-512F optimization of Convolution framework.
- Error in SSE, AVX, AVX2, AVX-512F and NEON optimizations of Convolution framework (NHWC mode, depthwise convolution).
- Error in AVX-512F optimization of Convolution framework (NHWC mode, winograd2x3 method).
- Error in AVX-512F optimization of Convolution framework (function KernelHwcDefaultBody8).
Test framework
New features
- Tests for verifying functionality of function SynetLrnLayerCrossChannels (NHWC mode).
- Tests for verifying functionality of function BgrToRgb.
- Tests for verifying functionality of MergedConvolution framework.
Infrastructure
Bug fixing
- Compiler warning for GCC >= 7.0 (ARM target).
April 1, 2019 (version 4.3.76)
Algorithms
New features
- Base implementation, AVX2, AVX-512BW and NEON optimizations of function CosineDistancesMxNa16f.
- Macro SIMD_FUTURE_DISABLE.
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd4x3SetInput (NHWC mode).
- Base implementation, SSE, AVX, AVX-512F and NEON optimizations of function Winograd4x3SetOutput (NHWC mode).
- Support of Winograd4x3 for Convolution framework (NHWC mode).
- Parameter 'batch' in Convolution framework.
- Function ConvolutionInternalBufferSize.
- Use parameter trans instead of parameters srcT and dstT in function ConvolutionInit.
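As a rough illustration of what an M x N cosine distance computation produces, the sketch below uses 32-bit floats and the usual definition 1 - (a·b) / (|a| |b|). It is a reference sketch under stated assumptions: the helper names are hypothetical, the 16-bit float storage implied by the 16f suffix is not modelled, and all vectors are assumed to be non-zero and of equal length.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine distance between two vectors of the same length (assumed non-zero).
float CosineDistance(const float* a, const float* b, std::size_t size)
{
    float ab = 0.0f, aa = 0.0f, bb = 0.0f;
    for (std::size_t i = 0; i < size; ++i)
    {
        ab += a[i] * b[i];
        aa += a[i] * a[i];
        bb += b[i] * b[i];
    }
    return 1.0f - ab / std::sqrt(aa * bb);
}

// Hypothetical helper: distances between every vector in A (M vectors) and
// every vector in B (N vectors), returned as a row-major M x N matrix.
std::vector<float> CosineDistancesMxN(const std::vector<std::vector<float> >& A,
                                      const std::vector<std::vector<float> >& B)
{
    std::vector<float> distances(A.size() * B.size());
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < B.size(); ++j)
            distances[i * B.size() + j] = CosineDistance(A[i].data(), B[j].data(), A[i].size());
    return distances;
}
```

Computing the whole matrix in one call lets an optimized implementation reuse loaded vectors across several pairs instead of reloading them for every distance.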
Improving
- ConvolutionGemmNN method of Convolution framework (NHWC mode).
- ConvolutionWinograd method of Convolution framework (NHWC mode).
Renaming
- Function from GetFlushToZero to GetFastMode.
- Function from SetFlushToZero to SetFastMode.
- Function from ConvolutionBufferSize to ConvolutionExternalBufferSize.
Bug fixing
- Compiler error (use of the name 'small', which can be a system macro) in file SimdSse2Statistic.cpp.
- Compiler warning (unused variable) in function Neon::SetFlushToZero.
- Compiler warning (unused variable) in function Base::ConvolutionBiasAndActivation.
- Compiler error (Visual Studio for Android) in file SimdSsse3Transform.cpp.
- Compiler error (Visual Studio for Android) in function SimdCosineDistance16f.
- Low performance of function SimdSquaredDifferenceSum16f.
- Compiler warning (unused variable) in function Neon::AlphaFilling.
- Compiler warning (unused variable) in function Neon::Fill32f.
- Compiler warning (wrong initialization order) in file SimdNeonGemm32f.cpp.
- Compiler warning (unused variable) in function Neon::SynetInnerProductLayerForward.
- Compiler warning (unused variable) in file TestConvolution.cpp.
- Compiler internal error (G++ 6.3.0) in function Neon::BgrToBgra.
- Compiler error (aarch64) in functions Neon::GetFlushToZero and Neon::SetFlushToZero.
- Error in NEON optimization of function HogLiteFindMax7x7.
- Denormals performance bug.
- Error in NEON optimization of function ReduceGray2x2.
Test framework
New features
- Tests for verifying functionality of function CosineDistancesMxNa16f.
- Tests for verifying functionality of function Winograd4x3SetInput (NHWC mode).
- Tests for verifying functionality of function Winograd4x3SetOutput (NHWC mode).
Bug fixing
- Compiler error (Visual Studio for Android) in file TestFloat16.cpp.
- Compiler warning (wrong initialization order) in file SimdNeonGemm32f.cpp.
- Compiler internal error (G++ 4.9).
March 7, 2019 (version 4.3.75)
Algorithms
New features
- Base implementation, SSE2, SSSE3, AVX2 and AVX-512BW optimizations of function BgraToYuva420p.
- NEON optimization of function NeuralSigmoid.
- NEON optimization of function NeuralTanh.
- NEON optimization of function NeuralPow.
- NEON version of functions GetFlushToZero and SetFlushToZero.
- NEON optimization of function Fill32f.
- NEON optimization of function AlphaFilling.
- NEON optimization of function CosineDistance16f.
- NEON optimization of function CosineDistance32f.
- NEON optimization of function Gemm32fNN.
- NEON optimization of function Gemm32fNT.
- NEON optimization of function FillPixel.
- NEON optimization of function ReduceColor2x2.
- NEON optimization of function BayerToBgra.
- NEON optimization of function BayerToBgr.
- NEON optimization of function TransformImage.
- NEON optimization of function BgraToYuva420p.
- NEON optimization of function Yuva420pToBgra.
- NEON optimization of function Resizer.
- NEON optimization of function HogLiteFindMax7x7.
- NEON optimization of function HogLiteCreateMask.
- NEON optimization of function HogLiteFilterSeparable.
- NEON optimization of function HogLiteCompressFeatures.
- NEON optimization of function HogLiteResizeFeatures.
- NEON optimization of function HogLiteFilterFeatures.
- NEON optimization of function HogLiteExtractFeatures.
- NEON optimization of function Winograd2x3SetFilter.
- NEON optimization of function Winograd4x3SetFilter.
- NEON optimization of function Winograd2x3SetInput.
- NEON optimization of function Winograd2x3SetOutput.
- NEON optimization of function SynetAddBias.
- NEON optimization of function SynetEltwiseLayerForward.
- NEON optimization of function SynetPoolingForwardMax.
- NEON optimization of function SynetFusedLayerForward0.
- NEON optimization of function SynetFusedLayerForward1.
- NEON optimization of function SynetFusedLayerForward2.
- NEON optimization of function SynetFusedLayerForward3.
- NEON optimization of function SynetFusedLayerForward4.
- NEON optimization of function SynetInnerProductLayerForward.
- NEON optimization of function SynetLrnLayerCrossChannels.
- NEON optimization of function SynetPreluLayerForward.
- NEON optimization of function SynetRestrictRange.
- NEON optimization of function SynetScaleLayerForward.
- NEON optimization of function SynetSoftmaxLayerForward.
- NEON optimization of function ConvolutionForward.
Improving
- AVX, AVX2 and AVX-512F optimizations of function ConvolutionForward.
- SSE, AVX, AVX2 and AVX-512F optimizations of function Resizer.
Bug fixing
- Error in AVX-512BW optimization of function ChangeColors.
- Error in AVX-512BW optimization of function NormalizeHistogram.
- Error in AVX-512F optimization of function NeuralConvolutionForward.
- Error in NEON optimization of function Uint8ToFloat32.
- Error in NEON optimization of function SquaredDifferenceSum16f.
- Error in SSE version of function GetFlushToZero.
- Error in Base implementation of function SynetFusedLayerForward0.
Test framework
New features
- Tests for verifying functionality of function BgraToYuva420p.
- Tests for verifying NEON optimization of function NeuralSigmoid.
- Tests for verifying NEON optimization of function NeuralTanh.
- Tests for verifying NEON optimization of function NeuralPow.
- Tests for verifying NEON optimization of function Fill32f.
- Tests for verifying NEON optimization of function AlphaFilling.
- Tests for verifying NEON optimization of function CosineDistance16f.
- Tests for verifying NEON optimization of function CosineDistance32f.
- Tests for verifying NEON optimization of function Gemm32fNN.
- Tests for verifying NEON optimization of function Gemm32fNT.
- Tests for verifying NEON optimization of function FillPixel.
- Tests for verifying NEON optimization of function ReduceColor2x2.
- Tests for verifying NEON optimization of function BayerToBgra.
- Tests for verifying NEON optimization of function BayerToBgr.
- Tests for verifying NEON optimization of function TransformImage.
- Tests for verifying NEON optimization of function BgraToYuva420p.
- Tests for verifying NEON optimization of function Yuva420pToBgra.
- Tests for verifying NEON optimization of function Resizer.
- Tests for verifying NEON optimization of function HogLiteFindMax7x7.
- Tests for verifying NEON optimization of function HogLiteCreateMask.
- Tests for verifying NEON optimization of function HogLiteFilterSeparable.
- Tests for verifying NEON optimization of function HogLiteCompressFeatures.
- Tests for verifying NEON optimization of function HogLiteResizeFeatures.
- Tests for verifying NEON optimization of function HogLiteFilterFeatures.
- Tests for verifying NEON optimization of function HogLiteExtractFeatures.
- Tests for verifying NEON optimization of function Winograd2x3SetFilter.
- Tests for verifying NEON optimization of function Winograd4x3SetFilter.
- Tests for verifying NEON optimization of function Winograd2x3SetInput.
- Tests for verifying NEON optimization of function Winograd2x3SetOutput.
- Tests for verifying NEON optimization of function SynetAddBias.
- Tests for verifying NEON optimization of function SynetEltwiseLayerForward.
- Tests for verifying NEON optimization of function SynetPoolingForwardMax.
- Tests for verifying NEON optimization of function SynetFusedLayerForward0.
- Tests for verifying NEON optimization of function SynetFusedLayerForward1.
- Tests for verifying NEON optimization of function SynetFusedLayerForward2.
- Tests for verifying NEON optimization of function SynetFusedLayerForward3.
- Tests for verifying NEON optimization of function SynetFusedLayerForward4.
- Tests for verifying NEON optimization of function SynetInnerProductLayerForward.
- Tests for verifying NEON optimization of function SynetLrnLayerCrossChannels.
- Tests for verifying NEON optimization of function SynetPreluLayerForward.
- Tests for verifying NEON optimization of function SynetRestrictRange.
- Tests for verifying NEON optimization of function SynetScaleLayerForward.
- Tests for verifying NEON optimization of function SynetSoftmaxLayerForward.
- Tests for verifying NEON optimization of function ConvolutionForward.
Bug fixing
- Error (on 32-bit OS) in test of function HogLiteFindMax7x7.
February 1, 2019 (version 4.2.74)
Algorithms
New features
- Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd2x3SetFilter (NHWC mode).
- Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd4x3SetFilter (NHWC mode).
- Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd2x3SetInput (NHWC mode).
- Base implementation, SSE, AVX and AVX-512F optimizations of function Winograd2x3SetOutput (NHWC mode).
- Parameter gemm (a pointer to external function of matrix multiplication) in function ConvolutionInit.
- Choice of the best gemm function in runtime.
- SIMD_RUNTIME_GEMM_STATISTIC macro (annotation of runtime choice of gemm).
- Base implementation, SSE, AVX, AVX2 and AVX-512F optimizations of function SynetPoolingForwardMax.
- Base implementation, SSE, AVX and AVX-512F optimizations of function FusedLayerForward4.
- Base implementation, SSE2, AVX2 and AVX-512F optimizations of function SynetSoftmaxForward.
- Base implementation, SSE2, AVX2 and AVX-512BW optimizations of function Yuva420pToBgra.
- Base implementation, SSSE3 optimization of function TransformImage.
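The Winograd2x3 functions above split the algorithm into a filter transform, an input transform and an output transform. Assuming the name follows the usual F(2x2, 3x3) convention, the one-dimensional building block F(2, 3) can be sketched as below: it produces two outputs of a 3-tap filter from four inputs using four multiplications instead of six. The function WinogradF23 is a hypothetical illustration and not the library's code.

```cpp
#include <array>

// One-dimensional Winograd F(2, 3): two outputs of a 3-tap correlation
// computed from four inputs with four multiplications.
std::array<float, 2> WinogradF23(const std::array<float, 4>& d,
                                 const std::array<float, 3>& g)
{
    // Filter transform (can be precomputed once per filter).
    const float u0 = g[0];
    const float u1 = (g[0] + g[1] + g[2]) * 0.5f;
    const float u2 = (g[0] - g[1] + g[2]) * 0.5f;
    const float u3 = g[2];

    // Input transform.
    const float v0 = d[0] - d[2];
    const float v1 = d[1] + d[2];
    const float v2 = d[2] - d[1];
    const float v3 = d[1] - d[3];

    // Element-wise products: the only multiplications that involve the data.
    const float m0 = v0 * u0;
    const float m1 = v1 * u1;
    const float m2 = v2 * u2;
    const float m3 = v3 * u3;

    // Output transform:
    //   y[0] = d0*g0 + d1*g1 + d2*g2,  y[1] = d1*g0 + d2*g1 + d3*g2
    std::array<float, 2> y = {{ m0 + m1 + m2, m1 - m2 - m3 }};
    return y;
}
```

The two-dimensional variant applies the same transforms along rows and columns, which is why the filter, input and output steps can be optimized as separate functions.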
Improving
- SSE, AVX, AVX2 and AVX-512F optimizations of function ConvolutionForward.
Removing
- Function Winograd2x3iSetInput.
- Function Winograd2x3iSetOutput.
Bug fixing
- Error in AVX-512F optimization of function ConvolutionDirectHwcConvolutionBiasActivationDefault.
Test framework
New features
- Tests for verifying functionality of function Winograd2x3SetFilter (NHWC mode).
- Tests for verifying functionality of function Winograd4x3SetFilter (NHWC mode).
- Tests for verifying functionality of function Winograd2x3SetInput (NHWC mode).
- Tests for verifying functionality of function Winograd2x3SetOutput (NHWC mode).
- Printing of internal performance statistic.
- Tests for verifying functionality of function SynetPoolingForwardMax.
- Tests for verifying functionality of function FusedLayerForward4.
- Tests for verifying functionality of function SynetSoftmaxForward.
- Tests for verifying functionality of function Yuva420pToBgra.
- Tests for verifying functionality of function TransformImage.
Infrastructure
Bug fixing
- The input variable CMAKE_CXX_FLAGS can contain invalid options (-mtune=native, -march=haswell, -mavx, etc.).
January 2, 2019 (version 4.2.73)
Algorithms
New features
- Base implementation, SSE, AVX and AVX-512F optimizations of function FusedLayerForward3.
- Base implementation, SSE, AVX and AVX-512F optimizations of function ConvolutionBiasAndActivation (NHWC mode).
Improving
- SSE, AVX, AVX2 and AVX-512F optimizations of function Gemm32fNN.
- Add output parameter 'internal' to function ConvolutionSetWeight.
Bug fixing
- Wrong assert condition in AVX-512F optimization of function NeuralRelu.
- Visual Studio 2017 compiler error (intrinsic _mm512_maskz_loadu_epi8 in Release mode).
- Crash: reading of unaligned memory in AVX-512BW optimization of function HogLiteFilterFeatures.
- Performance bug in functions SynetAddBias, SynetFusedLayerForwardX, SynetPreluLayerForward and SynetScaleLayerForward when (count = 1, trans = 1).
Test framework
New features
- Tests for verifying functionality of function FusedLayerForward3.