Newest 'blas performance linear-algebra' Questions

0 votes

1 answer

234 views

Avoid blas when involving temporary memory allocation?

I have a program that computes the matrix product x'Ay repeatedly. Is it better practice to compute this by making calls to MKL's BLAS, i.e. cblas_dgemv and cblas_ddot, which requires allocating ...

Agrim Pathak

3,207

asked Jul 3, 2016 at 1:47
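
To make the trade-off in the x'Ay question concrete, here is a minimal pure-Python sketch (not MKL, and not taken from the question or its answer). It compares a two-step evaluation, which builds a temporary vector t = Ay and then takes the dot product with x, against a fused loop that keeps only scalars. The function names `bilinear_two_step` and `bilinear_fused` are hypothetical, and the sketch only illustrates the arithmetic; it says nothing about which approach is faster with a tuned BLAS.

```python
def bilinear_two_step(x, A, y):
    """Compute x'Ay with an explicit temporary (gemv-then-dot pattern)."""
    # Temporary vector t = A y, one entry per row of A.
    t = [sum(a_ij * y_j for a_ij, y_j in zip(row, y)) for row in A]
    # Dot product x . t
    return sum(x_i * t_i for x_i, t_i in zip(x, t))


def bilinear_fused(x, A, y):
    """Compute x'Ay in one pass, holding only scalar accumulators."""
    total = 0.0
    for x_i, row in zip(x, A):
        row_dot = 0.0
        for a_ij, y_j in zip(row, y):
            row_dot += a_ij * y_j
        total += x_i * row_dot
    return total


if __name__ == "__main__":
    x = [1.0, 2.0]
    A = [[1.0, 2.0], [3.0, 4.0]]
    y = [5.0, 6.0]
    print(bilinear_two_step(x, A, y), bilinear_fused(x, A, y))
```

Both functions return the same value; the difference is only whether the intermediate vector is materialized.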

24 votes

2 answers

21k views

Link ATLAS/MKL to an installed Numpy

TL;DR: how to link ATLAS/MKL to an existing NumPy without rebuilding. I have used NumPy for calculations on a large matrix and found it very slow because NumPy uses only 1 core for the calculation....

tndoan

653

asked Feb 10, 2014 at 7:12

0 votes

1 answer

505 views

Efficient implementation of indirect daxpy operation

_axpy is a BLAS level-1 operation that implements the following: for i = 1:n, a[i] = a[i] - $\alpha$ b[i]. Efficient implementations of this regular daxpy are available through various BLAS ...

arbitUser1401

575

asked Dec 19, 2013 at 17:34
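
As a rough illustration of the update written in the daxpy excerpt, the sketch below implements a[i] = a[i] - alpha * b[i] as a plain loop. It also shows one possible reading of "indirect", which is an assumption here because the excerpt is truncated: the entries of `a` to update are selected through an index array. The names `axpy_like`, `indirect_axpy_like`, and `idx` are hypothetical and do not come from any BLAS interface.

```python
def axpy_like(alpha, b, a):
    """In-place update a[i] = a[i] - alpha * b[i], as written in the excerpt."""
    for i in range(len(a)):
        a[i] = a[i] - alpha * b[i]


def indirect_axpy_like(alpha, b, a, idx):
    """Assumed indirect variant: a[idx[k]] = a[idx[k]] - alpha * b[k].

    The index-array form is a guess at what "indirect" means in the
    question title; it is not confirmed by the excerpt.
    """
    for k, i in enumerate(idx):
        a[i] = a[i] - alpha * b[k]


if __name__ == "__main__":
    a = [10.0, 20.0, 30.0]
    axpy_like(2.0, [1.0, 2.0, 3.0], a)
    print(a)

    a = [10.0, 20.0, 30.0]
    indirect_axpy_like(2.0, [1.0, 2.0], a, [2, 0])
    print(a)
```

The indirect form reads and writes `a` through `idx`, so its memory access is no longer contiguous, which is the kind of pattern a regular daxpy routine does not cover directly.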

2 votes

0 answers

300 views

Strange performance issue with AMD's ACML BLAS/LAPACK library

I asked this question over at the AMD developers forum a few days ago, but haven't gotten an answer. Maybe someone here has some insight. http://devgurus.amd.com/thread/167492 I am running ACML ...

mrip

15.2k

asked Sep 24, 2013 at 16:23
