All Questions
Tagged with blas linear-algebra
53 questions
1
vote
0
answers
251
views
Change the BLAS version used by R
I am currently having issues with the 'Eigen()' function in R.
It was mentioned that I should try using 'OpenBLAS'. How should I go about installing it and making R use this version of BLAS?
I ...
1
vote
1
answer
243
views
How to force Julia to use multiple threads for matrix multiplication?
I want to find powers of a relatively small matrix, but this matrix consists of rational numbers of type Rational{BigInt}. By default, Julia utilizes only a single thread for such computations. I want ...
0
votes
1
answer
105
views
Why is multiplying wide matrices slower than multiplying square matrices?
I have noticed the following while trying to increase the performance of my code:
>>> a, b = torch.randn(1000,1000), torch.randn(1000,1000)
>>> c, d = torch.randn(10000, 100), torch....
0
votes
1
answer
172
views
What is wrong with my sparse matrix-multiple vectors (SpMM) product function for CSR?
I have the following code for the sparse matrix-vector (SpMV) product in C assuming a CSR storage format:
void dcsrmv(SparseMatrixCSR *A, double *x, double *y) {
for (int i=0; i<A->m; i++) {
...
-1
votes
1
answer
136
views
LAPACK gives different outputs when using solvers for banded matrices
I've been stuck on this for hours and am hoping someone can figure out what I am missing. I am solving Ax=B, first using DGESV, which I am 99% sure is correct. Then I am putting A into a banded form and ...
1
vote
0
answers
37
views
Optimize a weighted sum of multiple matrix multiplications
I have to calculate the linear sum of multiple matrix multiplications, i.e.,
F = A1 * X@Y + A2 * X@Z + A3 * Y@Z
where X, Y and Z are large matrices and A1, A2 and A3 are scalars.
I can calculate them ...
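One way to read the expression above: since X@Y and X@Z share the left factor X, the first two terms can be merged into a single product, X@(A1*Y + A2*Z), so only two matrix multiplications are needed instead of three. The following is a minimal pure-Python sketch of that rearrangement, not the asker's code. The helper names `matmul`, `axpby` and `weighted_sum` are hypothetical, and the matrices are assumed to be square and of equal size so that all three products have the same shape.

```python
def matmul(P, Q):
    """Plain triple-loop product of two matrices given as lists of rows."""
    inner = len(Q)
    if any(len(row) != inner for row in P):
        raise ValueError("inner dimensions do not match")
    cols = len(Q[0])
    return [[sum(P[i][k] * Q[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(P))]


def axpby(a, P, b, Q):
    """Elementwise a*P + b*Q for two matrices of the same shape."""
    return [[a * p + b * q for p, q in zip(prow, qrow)]
            for prow, qrow in zip(P, Q)]


def weighted_sum(a1, a2, a3, X, Y, Z):
    """Compute a1*X@Y + a2*X@Z + a3*Y@Z using two products instead of three."""
    combined = axpby(a1, Y, a2, Z)          # a1*Y + a2*Z, elementwise
    return axpby(1, matmul(X, combined),    # X @ (a1*Y + a2*Z)
                 a3, matmul(Y, Z))          # + a3 * (Y @ Z)


# Small integer example (hypothetical data, chosen so the arithmetic is exact).
X = [[1, 2], [3, 4]]
Y = [[0, 1], [1, 0]]
Z = [[2, 0], [0, 2]]
F = weighted_sum(2, 3, 5, X, Y, Z)

# Reference value computed the direct way, with three products.
F_direct = axpby(1, axpby(2, matmul(X, Y), 3, matmul(X, Z)), 5, matmul(Y, Z))
```

With a BLAS-backed library the same idea would likely map onto two GEMM calls, the second one accumulating into the output, but whether that beats three separate calls depends on the matrix sizes and the library.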
0
votes
1
answer
582
views
What is the time complexity of Trsm and other BLAS operations?
I am accelerating a model by replacing all its linear algebra operations with cuBLAS functions. I want to get the time complexity or FLOPs of the model to evaluate its performance in roofline ...
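For a roofline estimate, the usual nominal operation counts can be derived from the textbook algorithms. The sketch below is an illustration, not a statement about how cuBLAS implements these routines: it counts the floating-point operations of a plain forward substitution for one right-hand side, which comes to m² for an m x m non-unit lower-triangular matrix. Under that assumption a left-sided TRSM with n right-hand sides costs about m²·n flops, compared with 2·m·n·k for a GEMM. The function names are hypothetical.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular L and count the flops used.

    L is a list of rows, b is a list. Returns (x, flops).
    """
    m = len(L)
    x = [0.0] * m
    flops = 0
    for i in range(m):
        s = b[i]
        for j in range(i):
            s -= L[i][j] * x[j]   # one multiply and one subtract
            flops += 2
        x[i] = s / L[i][i]        # one divide
        flops += 1
    return x, flops


def trsm_flops(m, n):
    """Nominal flop count for a left-sided triangular solve with n right-hand sides."""
    return m * m * n


def gemm_flops(m, n, k):
    """Nominal flop count for C = A @ B with A of size m x k and B of size k x n."""
    return 2 * m * n * k


# Small example with hypothetical data.
L = [[2.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [4.0, 2.0, 4.0]]
b = [2.0, 3.0, 14.0]
x, flops = forward_substitution(L, b)
```

The counted value matches `trsm_flops(3, 1)`, which is the sense in which TRSM is often quoted as roughly half the work of a GEMM of the same dimensions.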
1
vote
0
answers
212
views
Do TensorFlow and PyTorch have any specialized matrix multiplication operation for symmetric or triangular matrices?
Do TensorFlow and PyTorch have any specialized functions for multiplying matrices with special properties?
For example, consider the matrix multiplication:
C := AB where A and B are n x n
The cost ...
0
votes
1
answer
54
views
How can I call dgemm in Excel VBA?
How can I call BLAS dgemm in Excel VBA? My BLAS is Rblas.dll from an R installation.
If I put a breakpoint before the subroutine ends and go to the immediate window
?c(0,0),c(0,1),c(1,0),c(1,1)
...
0
votes
1
answer
110
views
Numpy optimisation - C and F_Contiguous Matrices
I got intrigued by the discussion in http://scipy.github.io/old-wiki/pages/PerformanceTips on how to get faster dot computations.
It concludes that dot products of C_contiguous matrices should be faster, and ...
3
votes
1
answer
806
views
BLAS routine to compute diagonal elements only of a matrix product?
Say I have two matrices A and B. I want to compute the diagonal elements of the matrix product A * B and place them in a pre-allocated vector result.
Is there a BLAS (or similar) routine to do this ...
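The i-th diagonal entry of A * B is the dot product of row i of A with column i of B, so the diagonal can be computed without forming the full product. The sketch below shows this in pure Python. The function name `diag_of_product` is hypothetical, and the matrices are assumed to be stored as lists of rows with A of size n x k and B of size k x n.

```python
def diag_of_product(A, B):
    """Return the diagonal of A @ B without forming the full product.

    A is n x k and B is k x n, both given as lists of rows.
    Cost is n*k multiply-adds instead of n*n*k for the full product.
    """
    n, k = len(A), len(B)
    if any(len(row) != k for row in A) or any(len(row) != n for row in B):
        raise ValueError("A must be n x k and B must be k x n")
    # Entry i is the dot product of row i of A with column i of B.
    return [sum(A[i][j] * B[j][i] for j in range(k)) for i in range(n)]


# Small example with hypothetical data.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
B = [[7.0, 8.0],
     [9.0, 10.0],
     [11.0, 12.0]]
result = diag_of_product(A, B)
```

In BLAS terms each entry corresponds to one strided dot product, and in NumPy the same computation can be written as `np.einsum('ij,ji->i', A, B)`.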
1
vote
1
answer
315
views
Is there a LAPACK function for zeroing out the upper / lower corner of a matrix?
Some LAPACK functions (like dgqrf) return a matrix where the answer is upper triangular, but with some auxiliary information stored below the diagonal. I'm wondering if there's a function that ...
2
votes
0
answers
169
views
Triangular matrix matrix multiply `trmm` in TensorFlow
I need to get the fastest possible matmul operation in TF for the case when one of the matrices is lower triangular. cuBLAS and BLAS have trmm functions, but it looks like TensorFlow doesn't benefit ...
2
votes
2
answers
877
views
Why use MKL's Zgemm when gemm3m is the same but faster?
According to MKL's documentation:
The ?gemm3m routines perform a matrix-matrix operation with general
complex matrices. These routines are similar to the ?gemm routines,
but they use fewer ...
5
votes
3
answers
2k
views
What value of alignment should I use with mkl_malloc?
The function mkl_malloc is similar to malloc but has an extra alignment argument. Here's the prototype:
void* mkl_malloc (size_t alloc_size, int alignment);
I've noticed different performance with ...