
PyTorch

PyTorch is an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook's AI Research lab.

Here are 20,072 public repositories matching this topic...

transformers
NielsRogge commented Jan 2, 2022

Related to #5142, AlbertTokenizer (which uses SentencePiece) doesn't decode special tokens (like [CLS], [MASK]) properly. This issue was discovered when adding the Nystromformer model (#14659), which uses this tokenizer.

To reproduce (Transformers v4.15 or below):

!pip install -q transformers sentencepiece

from transformers import AlbertTokenizer

tokenizer = AlbertTokenizer.from
hyq12358 commented Dec 28, 2021

I want to train a detector on the Objects365 dataset, but the dataset is very large and causes an out-of-memory error on my computer.
I want to split the annotation file into 10 parts (ann1, ann2, ..., ann10), build 10 datasets, and concatenate them, but I'm not sure whether this will work.
Is there a better suggestion?
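
The splitting idea in this question can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the annotation file uses a COCO-style layout (top-level `images`, `annotations`, and `categories` keys), and `split_coco_annotations` is a hypothetical helper name, not part of any library. The full file still has to be loaded once to be split, so the sketch does not by itself remove the memory cost of that first load.

```python
def split_coco_annotations(coco, num_shards):
    """Split a COCO-style annotation dict into `num_shards` smaller dicts.

    Assumption: `coco` has "images", "annotations", and "categories" keys,
    and every annotation refers to an image through "image_id".
    """
    shards = [
        {"images": [], "annotations": [], "categories": coco["categories"]}
        for _ in range(num_shards)
    ]

    # Assign images to shards round-robin and remember where each one went.
    shard_of_image = {}
    for index, image in enumerate(coco["images"]):
        shard_index = index % num_shards
        shard_of_image[image["id"]] = shard_index
        shards[shard_index]["images"].append(image)

    # Route every annotation to the shard that holds its image.
    for annotation in coco["annotations"]:
        shards[shard_of_image[annotation["image_id"]]]["annotations"].append(annotation)

    return shards


# Tiny made-up example: 5 images, 8 annotations, 2 shards.
coco = {
    "images": [{"id": i} for i in range(5)],
    "annotations": [{"id": 10 + i, "image_id": i % 5} for i in range(8)],
    "categories": [{"id": 1, "name": "thing"}],
}
shards = split_coco_annotations(coco, 2)
```

Each shard could then be written to its own file (for example with `json.dump`), and datasets built from the shards could be combined with `torch.utils.data.ConcatDataset`. Whether this lowers peak memory depends on how the dataset class loads its annotations.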

pytorch-lightning
TidalPaladin commented Feb 5, 2022

🐛 Bug

If apply_to_collections is called on two dataclasses, the passed function will be called on only one of the input dataclasses.

Dataclass inputs for apply_to_collection are handled by a conditional:

https://github.com/PyTorchLightning/pytorch-lightning/blob/8ddf9f996d4a98e7be5fc591b7734b6f78313771/pytorch_lightning/utilities/apply_func.py#L128

apply_to_collections does
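
To illustrate the behavior the report expects, here is a minimal sketch of applying a function pairwise across two dataclass instances. It is not the PyTorch Lightning implementation; `apply_to_pair` and `Point` are hypothetical names used only for this example, and the sketch assumes both inputs are instances of the same dataclass type.

```python
from dataclasses import dataclass, fields, is_dataclass


def apply_to_pair(data1, data2, function):
    """Apply `function(a, b)` to matching fields of two dataclass instances.

    Non-dataclass inputs are treated as leaves and passed to `function` directly.
    """
    if is_dataclass(data1) and not isinstance(data1, type):
        if type(data1) is not type(data2):
            raise TypeError("both inputs must be instances of the same dataclass")
        # Recurse field by field so that values from BOTH inputs reach `function`.
        return type(data1)(**{
            field.name: apply_to_pair(
                getattr(data1, field.name), getattr(data2, field.name), function
            )
            for field in fields(data1)
            if field.init
        })
    return function(data1, data2)


@dataclass
class Point:
    x: int
    y: int


result = apply_to_pair(Point(1, 2), Point(10, 20), lambda a, b: a + b)
```

Here the lambda sees one value from each input for every field, which is the behavior the bug report says is missing when two dataclasses are passed.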

AnirudhDagar commented Jan 24, 2022

Although the results look good in all the TensorFlow plots and are consistent across frameworks, there is a small difference (more of a consistency issue). The training loss/accuracy plots appear to be sampled at fewer points: the curves look straighter, smoother, and less wiggly than the PyTorch or MXNet ones.

It can be clearly seen in chapter 6([CNN Lenet](ht

datasets
ck37 commented Jan 20, 2022

Is your feature request related to a problem? Please describe.

I am uploading our dataset and models for the "Constructing interval measures" method we've developed, which uses item response theory to convert multiple discrete labels into a continuous spectrum for hate speech. Once we have this outcome our NLP models conduct regression rather than classification, so binary metrics are not r

chan4cc commented Apr 26, 2021

New Operator

Describe the operator

Why is this operator necessary? What does it accomplish?

This is a frequently used operator in TensorFlow/Keras.

Can this operator be constructed using existing onnx operators?

If so, why not add it as a function?

I don't know.

Is this operator used by any model currently? Which one?

Are you willing to contribute it?

nni
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file itself and reads lines for the Predictor, which fails when it tries to load data from my compressed files.
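
The kind of reader described here might look like the following sketch, which uses only the Python standard library. It says nothing about how AllenNLP's predict command or Predictor reads its input; `open_maybe_compressed` and `read_lines` are hypothetical helper names, and detecting compression from the `.gz` suffix is an assumption made for simplicity.

```python
import gzip
import os
import tempfile


def open_maybe_compressed(path, encoding="utf-8"):
    """Open `path` as text, decompressing transparently if it ends in .gz."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode="rt", encoding=encoding)
    return open(path, mode="r", encoding=encoding)


def read_lines(path):
    """Yield non-empty lines from a plain or gzipped text file."""
    with open_maybe_compressed(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line


# Small demonstration with made-up JSON-lines content.
example_lines = ['{"text": "first"}', '{"text": "second"}']
with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "inputs.jsonl")
    gzip_path = os.path.join(tmp_dir, "inputs.jsonl.gz")
    with open(plain_path, "w", encoding="utf-8") as f:
        f.write("\n".join(example_lines) + "\n")
    with gzip.open(gzip_path, "wt", encoding="utf-8") as f:
        f.write("\n".join(example_lines) + "\n")
    plain_result = list(read_lines(plain_path))
    gzip_result = list(read_lines(gzip_path))
```

Both calls yield the same lines, so code that consumes `read_lines` does not need to know whether the input file was compressed.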

Created by Facebook's AI Research lab (FAIR)

Released September 2016

Latest release about 2 months ago

Repository: pytorch/pytorch
Website: pytorch.org
Wikipedia: Wikipedia

Related Topics

python pytorch-tutorial