# openmp
Here are 1,194 public repositories matching this topic...
oneAPI Deep Neural Network Library (oneDNN)
library
performance
deep-neural-networks
deep-learning
cpp
processor
opencl
x64
x86-64
openmp
avx2
amx
sse41
tbb
aarch64
avx512
intel-openmp-runtime
bfloat16
oneapi
onednn
dpcpp
xe-architecture
Updated Jul 22, 2021 - C++
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
iot
machine-learning
nim
deep-learning
opencl
linear-algebra
automatic-differentiation
openmp
parallel-computing
cuda
autograd
gpgpu
neural-networks
high-performance-computing
ndarray
tensor
gpu-computing
multidimensional-arrays
cudnn
matrix-library
Updated Jul 21, 2021 - Nim
Kratos Multiphysics (a.k.a. Kratos) is a framework for building parallel multi-disciplinary simulation software. Modularity, extensibility and HPC are the main objectives. Kratos is BSD-licensed and written in C++ with an extensive Python interface.
python
c-plus-plus
multi-platform
openmp
mpi
parallel-computing
fem
bsd-license
numerical-methods
multiphysics
dem
kratos
kratos-multiphysics
Updated Jul 25, 2021 - C++
stdgpu: Efficient STL-like Data Structures on the GPU
cpp
gpu
modern-cpp
cpp14
openmp
cuda
stl
data-structures
gpgpu
gpu-acceleration
cpp17
stl-containers
hip
gpu-computing
rocm
cpp20
stl-like
Updated Jul 23, 2021 - C++
Extended Memory Semantics - Persistent shared object memory and parallelism for Node.js and Python
javascript
python
json
json-data
parallel
openmp
multithreading
persistent-data-structure
non-volatile-memory
persistent-memory
persistent-data
shared-memory
ems
extended-memory-semantics
Updated Aug 7, 2020 - JavaScript
High-performance stateful serverless runtime based on WebAssembly
Updated Jul 23, 2021 - C++
OptimLib: a lightweight C++ library of numerical optimization methods for nonlinear functions
newton
cpp
optimization
eigen
openmp
cpp11
evolutionary-algorithms
armadillo
gradient-descent
optim
differential-evolution
optimization-algorithms
particle-swarm-optimization
bfgs
lbfgs
openmp-parallelization
numerical-optimization-methods
Updated Sep 14, 2020 - C++
C++ library for solving large sparse linear systems with the algebraic multigrid method
c-plus-plus
cpp
opencl
openmp
mpi
cuda
gpgpu
scientific-computing
amg
sparse-linear-systems
multigrid
linear-solvers
Updated Jul 21, 2021 - C++
Armadillo: fast C++ library for linear algebra & scientific computing - http://arma.sourceforge.net
machine-learning
statistics
hpc
matlab
vector
matrix
linear-algebra
solver
openmp
cpp11
matrix-factorization
matrix-functions
scientific-computing
expression-template
gaussian-mixture-models
armadillo
blas
lapack
linear-algebra-library
sparse-matrix
Updated Jul 22, 2021
A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead
runtime
scheduler
openmp
multithreading
parallelism
task-scheduler
message-passing
threadpool
data-parallelism
fork-join
work-stealing
task-parallelism
Updated Jul 4, 2021 - Nim
A C++ header-only library of statistical distribution functions.
statistics
cpp
constexpr
probability
eigen
openmp
quantile
cpp11
stats
armadillo
cdf
numerical-methods
blaze
distributions
eigen3
armadillo-library
density-functions
quantile-functions
Updated Mar 14, 2021 - C++
This is a set of simple programs that can be used to explore the features of a parallel platform.
c
c-plus-plus
travis-ci
julia
opencl
boost
openmp
mpi
parallel-computing
python3
pgas
coarray-fortran
threading
tbb
kokkos
shmem
charmplusplus
sycl
parallel-programming
fortran2008
Updated Jun 28, 2021 - C
A modern C++ BVH construction and traversal library
Updated Jun 22, 2021 - C++
Updated Jul 21, 2021 - C++
Dive into machine learning systems, starting by reinventing the wheel.
Updated Jun 18, 2018 - C++
Crack legacy zip encryption with Biham and Kocher's known plaintext attack.
Updated May 22, 2021 - C++
Updated Jun 26, 2021 - C++
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
deep-learning
assembler
parallel
openmp
jit
simd
matrix-multiplication
high-performance-computing
blas
convolution
tensor
compiler-optimization
gemm
runtime-cpu-detection
Updated Feb 26, 2021 - Nim
Fast inference engine for Transformer models
deep-neural-networks
cpp
neon
openmp
parallel-computing
cuda
avx
intrinsics
avx2
neural-machine-translation
opennmt
quantization
gemm
mkl
thrust
transformer-models
onednn
Updated Jul 22, 2021 - C++
Efficient monocular visual odometry for ground vehicles on ARM processors
Updated Mar 25, 2021 - C++
python
c
openmp
avx
simd
cosmology
astrophysics
galaxies
large-scale-structure
pair-counting
intrinsics
avx2
avx512
sse42
correlation-functions
Updated Jul 22, 2021 - C
monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
cpu
hpc
gpu
matrix
linear-algebra
cpp14
openmp
cuda
scientific-computing
blas
lapack
linear-algebra-library
sparse-matrix
mkl
cpp17-library
matrix-structures
Updated Jul 9, 2021 - C++
Ytk-mp4j is a fast, user-friendly, cross-platform, multi-process, multi-thread collective message passing Java library that provides gather, scatter, allgather, reduce-scatter, broadcast, reduce, and allreduce communications for distributed machine learning.
Updated Jun 14, 2017 - Java
A C++ library to compute neighborhood information for point clouds within a fixed radius. Suitable for many applications, e.g. neighborhood search for SPH fluid simulations.
Updated Nov 19, 2020 - C++
Our users are often confused by the output from programs such as zip2john sometimes being very large (multi-gigabyte). Maybe we should identify and enhance these programs to output a message to stderr to explain to users that it's normal for the output to be very large - maybe always or maybe only when the output size is above a threshold (e.g., 1 million bytes?)