0 votes
0 answers
9 views

BLAS/LAPACK compatibility

I've been trying to figure out whether the newer versions of BLAS/LAPACK are backward compatible with the older releases, but I can't find anything on the Netlib website or in the docs. Are they compatible ...
lll's user avatar
  • 19
1 vote
1 answer
63 views

"Invalid read of size 8" warning from Valgrind when calling zhemv blas function in C++

I'm computing the product of a Hermitian (self-adjoint) matrix and a complex vector by means of ZHEMV in BLAS, calling the function from a C++ interface. The problem I see is getting an "...
Dimorga's user avatar
  • 11
1 vote
0 answers
92 views

Ifx cannot find modern generic MKL routines like GEMM_F95

I am compiling Fortran code with the ifx compiler (version 2025.0.4) on Windows. I have the Intel MKL library downloaded as well and I am trying to compile a program using it, like this: ifx test.f90 ...
FusRoDah's user avatar
  • 149
1 vote
1 answer
185 views

MKL and OpenBLAS interactions - a question about linking

I'm using a binary (R) that dynamically links to a generic version of BLAS; in a lot of cases this is OpenBLAS. Now, inside R, I'm dynamically loading another shared library (...
Daniel Falbel's user avatar
1 vote
2 answers
68 views

Undefined reference to cblas_* with cmake on windows

I'm working on a project that uses SAF (Spatial Audio Framework), which has OpenBLAS and LAPACK as dependencies. (The project includes a lot of libraries, so I only show the code that relates to my problem:...
TheBaum's user avatar
  • 164
1 vote
0 answers
37 views

Confused about cblas_dgemm arguments

Say I want to calculate x^T * Y, where x is an n by 1 matrix and Y is an n by n matrix: cblas_dgemm(const enum CBLAS_ORDER Order, const enum CBLAS_TRANSPOSE TransA, const enum CBLAS_TRANSPOSE TransB, const ...
hansoko's user avatar
  • 389
5 votes
2 answers
188 views

crossprod(m1, m2) is running slower than t(m1) %*% m2 on my machine

Why does t(mat1) %*% mat2 run faster than crossprod(mat1, mat2)? Isn't the whole point of the latter that it calls a more efficient low-level routine? r$> mat1 <- array(rnorm(100 * 600), dim = ...
Turdle's user avatar
  • 53
5 votes
1 answer
94 views

How to control (BLAS?) parallelization when using mgcv::gam

I am running some fairly large gam models and don't want to parallelize the computations, or at least want to be able to control the degree of parallelization. (Besides not wanting to fry my machine ...
Ben Bolker's user avatar
2 votes
1 answer
66 views

Parallelize operations on arrays and merge results into one array using OpenMP

I am trying to speed up a function that, given a complex-valued array arr with n entries, calculates the sum of m operations on that array using BLAS routines. Finally, it replaces the values of arr. ...
Timo59's user avatar
  • 33
0 votes
0 answers
83 views

Unexpected behaviour of matmul when compiled with blas in Fortran

I am trying to benchmark the BLAS routines dgemv and dgemm in Fortran. For that I have written these simple programs: matmul.f90: program test ...
pablo's user avatar
  • 69
1 vote
0 answers
88 views

How to use BLAS in C, using gcc on Linux?

On Linux, in the file a.c, I do #include <cblas.h> and later I do cblas_sgemm(...). Compiling with gcc -O2 -march=native -fopenmp a.c or with gcc -O2 -march=native -lblas -fopenmp a.c results in ...
Sasha's user avatar
  • 371
0 votes
0 answers
26 views

Is BLAS interface of cvxopt different from standard ones?

According to the official documentation, the description of the BLAS routine tbmv seems no different from the standard routines, for example those found in Intel's MKL manual. However, running the ...
PLE's user avatar
  • 109
0 votes
0 answers
14 views

Reason behind transposition restrictions in the BLAS interface

I wonder if there is some reason behind the restrictions in the BLAS interface regarding transposition. Unlike gemm, not all routines allow all combinations of transpositions of the input matrices. ...
Rasmus's user avatar
  • 161
0 votes
1 answer
138 views

Problems evaluating CUDNN for SGEMM

I used cudnn to test sgemm for C[stride x stride] = A[stride x stride] x B[stride x stride] below, Configuration GPU: T1000/SM_75 cuda-12.0.1/driver-535 installed (via the multiverse repos on ubuntu-...
sof's user avatar
  • 9,689
0 votes
0 answers
88 views

How can I use multithreaded BLAS from a single-threaded Eigen C++ application?

I'm trying to speed up Eigen dense matrix * matrix operations by using multithreaded BLAS library calls. I've achieved a 100% speed increase using the AMD AOCL-BLAS library from within Eigen. But I seem ...
Pavel Fantys's user avatar
