cuda
Here are 2,548 public repositories matching this topic...
Problem:
catboost version: 0.23.2
Operating System: all
Tutorial: https://github.com/catboost/tutorials/blob/master/custom_loss/custom_metric_tutorial.md
It is impossible to use a custom metric (C++).
Code example (truncated in the original):
from catboost import CatBoost
train_data = [[1, 4, 5, 6],
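A minimal completion of the truncated snippet so it runs end to end; the remaining rows, the labels, and the parameters below are assumptions for illustration, not part of the original report:

from catboost import CatBoost

# Rows, labels, and parameters are illustrative assumptions.
train_data = [[1, 4, 5, 6],
              [4, 5, 6, 7],
              [30, 40, 50, 60]]
train_labels = [10, 20, 30]
model = CatBoost({"iterations": 2})
model.fit(train_data, train_labels)  # the custom C++ metric from the tutorial is what fails for the reporter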
Is your feature request related to a problem? Please describe.
For time series analysis, cuDF needs to support Spearman correlation matrix calculation.
Describe the solution you'd like
Something similar to pandas.DataFrame().corr(method='spearman').
Additional context
Is it possible to share the roadmap for adding this feature? Big thanks!
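A minimal sketch of the requested behaviour, using pandas as the reference implementation. Since Spearman correlation is just Pearson correlation computed on per-column ranks, ranking first could serve as an interim workaround (whether cuDF supports the same rank-then-corr chain is an assumption here):

import pandas as pd

df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 1.0, 4.0, 3.0]})
print(df.corr(method="spearman"))        # the requested API
print(df.rank().corr(method="pearson"))  # equivalent: Pearson on ranks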
The current implementation of join can be improved by performing the operation in a single call to the backend kernel instead of multiple calls. This is a fairly easy kernel and may be a good issue for someone getting to know CUDA/ArrayFire internals. Ping me if you want additional info.
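A NumPy analogy (not ArrayFire code) of why a single call is cheaper: joining n arrays pairwise issues n-1 allocate-and-copy steps, while one batched call does a single allocation and one pass over the inputs.

import numpy as np

parts = [np.random.rand(3, 2) for _ in range(4)]

# Pairwise joins: each step re-allocates and re-copies the running result.
out = parts[0]
for p in parts[1:]:
    out = np.concatenate([out, p], axis=0)

# Single batched call: one allocation, one copy per input.
out_single = np.concatenate(parts, axis=0)
assert np.array_equal(out, out_single)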
The names map and input are mistakenly exchanged. Judging by the Preconditions paragraph, they should be swapped, I suppose, because there is no problem when map and result coincide (in the current context).
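Assuming this concerns a gather-style operation, where result[i] = input[map[i]], a small sketch of why the aliasing precondition matters (the names mirror the documentation's parameters, not a specific library's API):

import numpy as np

inp = np.array([10, 20, 30, 40])
map_idx = np.array([3, 0, 2])
result = inp[map_idx]  # gather: result[i] = inp[map_idx[i]] -> [40, 10, 30]
# result overlapping inp is unsafe: an element could be overwritten before
# another position reads it. result coinciding exactly with map is harmless:
# position i reads map[i] and then writes result[i] at the same location,
# with no cross-element hazard.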
The marker no_bad_cuml_array_check was necessary in 0.16 to avoid asserting on bad uses of CumlArray in specific tests. As of 0.17, this is no longer necessary. The marker should be removed from pytest.ini as well as from any tests that used it. A quick search shows it is used in 2 tests in test_incremental_pca.py.
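A hypothetical sketch of what the cleanup looks like in a test file (the test name below is illustrative, not from the repo); the matching markers entry in pytest.ini would be deleted as well:

import pytest

@pytest.mark.no_bad_cuml_array_check  # obsolete as of 0.17: delete this line
def test_incremental_pca_partial_fit():
    ...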
I often use -v
just to see that something is going on, but a progress bar (enabled by default) would serve the same purpose and be more concise.
We can just factor out the code from futhark bench
for this.
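The actual implementation would be factored out of futhark bench (Haskell); this Python sketch only illustrates the desired terminal behaviour of redrawing a single line in place:

import sys
import time

total = 100
for done in range(1, total + 1):
    bar = "#" * (20 * done // total)
    sys.stdout.write(f"\r[{bar:<20}] {done}/{total}")
    sys.stdout.flush()
    time.sleep(0.01)  # stand-in for real work
sys.stdout.write("\n")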
Thank you for this fantastic work!
Would it be possible for the fit_transform() method to return the KL divergence of the run?
Thanks!
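For comparison, scikit-learn's TSNE records the final KL divergence as an attribute after fitting, which could be one model for exposing it here (this assumes the request concerns a t-SNE implementation):

import numpy as np
from sklearn.manifold import TSNE

X = np.random.rand(50, 10)
tsne = TSNE(n_components=2, perplexity=10.0, init="random")
embedding = tsne.fit_transform(X)
print(tsne.kl_divergence_)  # KL divergence of the finished run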
PR #6447 adds a public API to get the maximum number of registers per thread (numba.cuda.Dispatcher.get_regs_per_thread()). There are other attributes that might be nice to provide: shared memory per block, local memory per thread, const memory usage, and maximum block size. These are all available in the FuncAttr named tuple: https://github.com/numba/numba/blob/master/numba/cuda/cudadrv/drive