LIBXSMM: A High Performance Library for Small Matrix Multiplications. LIBXSMM is a library for small dense and small sparse matrix-matrix multiplications, as well as for deep learning primitives such as small convolutions, targeting Intel Architecture. Small matrix multiplication kernels are generated for the following instruction set extensions: Intel SSE, Intel AVX, Intel AVX2, IMCI (KNCni) for Intel Xeon Phi coprocessors ("KNC"), and Intel AVX-512 as found in the Intel Xeon Phi processor family (Knights Landing "KNL", Knights Mill "KNM") and Intel Xeon processors (Skylake-SP "SKX"). Historically, small matrix multiplications were optimized only for the Intel Many Integrated Core Architecture ("MIC") using intrinsic functions. Optimized assembly code now targets all of the aforementioned instruction set extensions (static code generation), and Just-In-Time (JIT) code generation targets Intel AVX and beyond. Optimized code for small convolutions is JIT-generated for Intel AVX2 and Intel AVX-512.
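
For orientation, the operation that the small dense kernels accelerate can be written as a plain triple loop. The sketch below is only a scalar reference, not LIBXSMM code: the function name `small_gemm_ref`, the column-major layout, and the omission of scaling factors are assumptions made here for illustration. The generated kernels described above would compute the same kind of product using vector instructions.

```c
#include <stddef.h>

/* Scalar reference for a small dense product C = A * B.
 * Assumptions (not taken from LIBXSMM): column-major storage,
 * A is m x k, B is k x n, C is m x n, and C is overwritten.
 * small_gemm_ref is a hypothetical helper name. */
void small_gemm_ref(size_t m, size_t n, size_t k,
                    const double *a, const double *b, double *c)
{
    for (size_t j = 0; j < n; ++j) {          /* column of C and B */
        for (size_t i = 0; i < m; ++i) {      /* row of C and A */
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p) {  /* inner (reduction) dimension */
                acc += a[p * m + i] * b[j * k + p];
            }
            c[j * m + i] = acc;
        }
    }
}
```

For small sizes the loop bounds are known and tiny, so call overhead and loop control can take up a noticeable share of the run time. That is the usual motivation for generating a specialized kernel per shape.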

