All Questions

1 vote
0 answers
251 views

Change the BLAS version used by R

I am currently having issues with the 'eigen()' function in R. It was suggested that I try using 'OpenBLAS'. How should I go about installing it and making R use this version of BLAS? I ...
Dylan Dijk
1 vote
1 answer
243 views

How to force Julia to use multiple threads for matrix multiplication?

I want to find powers of a relatively small matrix, but this matrix consists of rational numbers of type Rational{BigInt}. By default, Julia utilizes only a single thread for such computations. I want ...
Yrogirg
  • 2,373
0 votes
1 answer
105 views

Why is multiplying wide matrices slower than multiplying square matrices?

I have noticed the following while trying to increase the performance of my code: >>> a, b = torch.randn(1000,1000), torch.randn(1000,1000) >>> c, d = torch.randn(10000, 100), torch....
Fırat Kıyak
0 votes
1 answer
172 views

What is wrong with my sparse matrix-multiple vectors (SpMM) product function for CSR?

I have the following code for the sparse matrix-vector (SpMV) product in C assuming a CSR storage format: void dcsrmv(SparseMatrixCSR *A, double *x, double *y) { for (int i=0; i<A->m; i++) { ...
Nicolas Venkovic
-1 votes
1 answer
136 views

LAPACK different outputs when using solvers for banded matrices

I've been stuck on this for hours and am hoping someone can figure out what I am missing. I am solving Ax=B, first using DGESV, which I am 99% sure is correct. Then I am putting A into a banded form and ...
William Dennis
1 vote
0 answers
37 views

Optimize a weighted sum of multiple matrix multiplications

I have to calculate the linear sum of multiple matrix multiplications, i.e., F = A1 * X@Y + A2 * X@Z + A3 * Y@Z, where X, Y and Z are large matrices and A1, A2 and A3 are scalars. I can calculate them ...
Niteya Shah
  • 1,824
0 votes
1 answer
582 views

What is the time complexity of Trsm and other BLAS operations?

I am accelerating a model by replacing all its linear algebra operations with cuBLAS functions, and I want to get the time complexity or FLOPs of the model to evaluate its performance in roofline ...
TherLF
  • 13
1 vote
0 answers
212 views

Do TensorFlow and PyTorch have any specialized matrix multiplication operation for Symmetric or Triangular matrices?

Do TensorFlow and PyTorch have any specialized functions for multiplying matrices with special properties? For example, consider the matrix multiplication: C := AB where A and B are n x n The cost ...
Aravind Sankaran
0 votes
1 answer
54 views

How can I call dgemm in Excel VBA?

How can I call BLAS dgemm in Excel VBA? My BLAS is Rblas.dll from an R installation. If I put a breakpoint before the subroutine ends and go to the immediate window ?c(0,0),c(0,1),c(1,0),c(1,1) ...
Actuary David
0 votes
1 answer
110 views

NumPy optimisation - C and F_Contiguous Matrices

I got intrigued by the discussion in http://scipy.github.io/old-wiki/pages/PerformanceTips on how to get faster dot computations. It is concluded that dotting C_contiguous matrices should be faster, and ...
user37292
  • 414
3 votes
1 answer
806 views

BLAS routine to compute diagonal elements only of a matrix product?

Say I have two matrices A and B. I want to compute the diagonal elements of the matrix product A * B and place them in a pre-allocated vector result. Is there a BLAS (or similar) routine to do this ...
a06e
  • 20.9k
1 vote
1 answer
315 views

Is there a LAPACK function for zeroing out the upper / lower corner of a matrix?

Some LAPACK functions (like dgeqrf) return a matrix where the answer is upper triangular but then there's some auxiliary information stored below the diagonal. I'm wondering if there's a function that ...
John
  • 2,719
2 votes
0 answers
169 views

Triangular matrix matrix multiply `trmm` in TensorFlow

I need to get the fastest possible matmul operation in TF for the case when one of the matrices is lower triangular. cuBLAS and BLAS have trmm functions, but it looks like TensorFlow doesn't benefit ...
Artem Artemev
2 votes
2 answers
877 views

Why use MKL's Zgemm when gemm3m is the same but faster?

According to MKL's documentation: The ?gemm3m routines perform a matrix-matrix operation with general complex matrices. These routines are similar to the ?gemm routines, but they use fewer ...
avgn
  • 1,002
5 votes
3 answers
2k views

What value of alignment should I use with mkl_malloc?

The function mkl_malloc is similar to malloc but has an extra alignment argument. Here's the prototype: void* mkl_malloc (size_t alloc_size, int alignment); I've noticed different performance with ...
avgn
  • 1,002
