All Questions
140 questions
0 votes, 0 answers, 26 views
Is the BLAS interface of cvxopt different from the standard ones?
According to the official documentation, the description of the BLAS routine tbmv seems no different from that of the standard routines, for example those found in Intel's MKL manual.
However, running the ...
0 votes, 0 answers, 46 views
Installation of C++ libraries 'Boost' and 'BLAS' for Python project fails on Windows
I'm working on a Python project. After cloning a remote git repository I followed the instructions in the README file, executing multiple pip install commands in my VSCode PowerShell terminal to set ...
2 votes, 1 answer, 143 views
Why is libopenblas from numpy so big?
We are deploying an open source application based on numpy that includes libopenblas.{cryptic string}.gfortran-win32.dll. It is part of the Python numpy package. This dll is over 27MB in size. I'm ...
1 vote, 0 answers, 114 views
Why is BLAS cblas_sgemm in C slower than np.dot?
I made a simple benchmark between Python NumPy and C OpenBLAS to multiply two 500x500 matrices. It seems that np.dot performs almost 9 times faster than cblas_sgemm. Is there anything I'm doing wrong?
...
0 votes, 1 answer, 168 views
How do I make np.multiply use more than one core?
The title says it already. I am currently parallelizing my code and a major bottleneck is posed by element-wise multiplication of two three-dimensional ndarrays. My system monitor reveals that only ...
1 vote, 1 answer, 160 views
How Does NumPy Internally Handle Matrix Multiplication with Non-contiguous Slices?
Hello Stack Overflow community,
I'm working with NumPy for matrix operations and I have a question regarding how NumPy handles matrix multiplication, especially when dealing with non-contiguous slices ...
1 vote, 0 answers, 109 views
NumPy built against locally built BLIS does not use multithreading
I'm looking for help with an issue I'm having building NumPy against a locally built BLIS for zen3.
I've configured BLIS to enable threading using OpenMP (it is installed and working on my machine, ...
0 votes, 0 answers, 109 views
Performance of DGEMM with f2py
I tried wrapping dgemm in Fortran via f2py and comparing the timings. It looks like dgemm is much slower (by a factor of 10) than NumPy's einsum for small matrices. The timer for dgemm is inside Fortran ...
1 vote, 1 answer, 256 views
Why doesn't linalg.solve use all available threads?
I would like to know why linalg.solve from NumPy isn't using all available threads to do its calculations.
I'm using it to solve for a multidimensional system, in a way that it should solve to find a ...
0 votes, 1 answer, 854 views
Where can I find the numpy.matmul() source code?
I do not obtain the same results when I use np.matmul(A, b) in Python and when I use xtensor-blas's xt::linalg::dot(A, b) in C++.
I am investigating the reasons, as when saved and read from disk, A ...
2 votes, 1 answer, 2k views
NumPy vectorization and algorithmic complexity
I have read many times about vectorized code in NumPy. I know for a fact that a Python for loop can be ~100x slower than an equivalent NumPy operation. However, I thought that the power of NumPy ...
0 votes, 0 answers, 29 views
Find BLAS calls happening under the hood when running inference
I have a trained model saved with tf.saved_model.save and loaded back with tf.saved_model.load. Now I'd like to know which BLAS routines are called when I run inference on this model. I'm curious about ...
4 votes, 0 answers, 473 views
Why is scipy.linalg.eigh much slower when using np.complex64?
I have a large Hermitian matrix of which I need to calculate the eigenvalues and eigenvectors. For this I use scipy.linalg.eigh. Above a certain size of the matrix, scipy is much faster if np....
0 votes, 1 answer, 616 views
LINK : fatal error LNK1181: cannot open input file 'lapack.lib'
LINK : fatal error LNK1181: cannot open input file 'lapack.lib'
error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2022\\BuildTools\\VC\\Tools\\MSVC\\14.34.31933\\bin\\HostX86\\x64\\...
4 votes, 0 answers, 511 views
Importing NumPy fails after building from source against AMD BLIS
I'm trying to build a local version of NumPy from source against BLIS (for BLAS and CBLAS) and against OpenBLAS for LAPACK.
I started with building BLIS locally for zen3 with CBLAS enabled, like so:
./...