
datasets

Here are 1,468 public repositories matching this topic...

datasets
dlwh commented Mar 16, 2022

Describe the bug

Streaming Datasets can't be pickled, so any interaction between them and multiprocessing results in a crash.
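The failure mode can be illustrated without the library itself. The sketch below is not the `datasets` implementation; `StreamingSource` and `can_pickle` are hypothetical names, and the assumption is only that a streaming object holds a live generator. Python's `pickle` refuses generators, and `multiprocessing` relies on pickling to hand objects to worker processes, which is presumably why the combination crashes.

```python
import pickle


class StreamingSource:
    """Hypothetical stand-in for a streaming dataset that wraps a live generator."""

    def __init__(self, n):
        # A generator holds execution state that pickle cannot serialize.
        self._it = (i for i in range(n))

    def __iter__(self):
        return self._it


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, AttributeError, pickle.PicklingError):
        return False


print(can_pickle([1, 2, 3]))           # a plain list pickles fine
print(can_pickle(StreamingSource(3)))  # the generator attribute blocks pickling
```

Checking `can_pickle` on an object before passing it to a worker pool is one way to surface the problem early instead of deep inside a multiprocessing traceback.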

Steps to reproduce the bug

import transformers
from transformers import Trainer, AutoModelForCausalLM, TrainingArguments
import datasets

ds = datasets.load_dataset('oscar', "unshuffled_deduplicated_en", split='train', streaming=True).with_format("
bug good first issue
label-studio
omishali commented Jan 3, 2022

Describe the bug
I am trying to label Hebrew text (an RTL language). When labels are attached to the text, the words are scrambled and no longer appear in their original order.

To Reproduce
Steps to reproduce the behavior:

  1. Create a project with the attached dataset.json and dataset.txt
  2. Choose NER template
  3. Start
bug good first issue text editor
ljades commented Feb 19, 2021

How to reproduce the behaviour

The error occurs in Step 5/9 of the docker build process:

fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.11/community/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.11/main/x86_64/APKINDEX.tar.gz: BAD signature
WARNING: Ignoring http
good first issue
AbhinavTuli commented Mar 22, 2022

🚨🚨 Feature Request

  • A new implementation (Improvement, Extension)

Is your feature request related to a problem?

Currently, if a user tries to access an index that is larger than the dataset length or tensor length, an internal error is thrown which is not easy to understand.

Description of the possible solution

We can catch the error and throw a more descriptive e
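A minimal sketch of the catch-and-rethrow idea described in this request, under stated assumptions: `get_sample` is a hypothetical helper, not part of the project's API, and a plain Python sequence stands in for the dataset or tensor.

```python
def get_sample(samples, index):
    """Return samples[index], replacing the raw error with a descriptive one.

    Hypothetical helper for illustration; `samples` is any sequence.
    """
    try:
        return samples[index]
    except IndexError as exc:
        # Re-raise with the offending index and the actual length in the message,
        # keeping the original exception chained for debugging.
        raise IndexError(
            f"index {index} is out of range for a dataset of length {len(samples)}"
        ) from exc


data = [10, 20, 30]
print(get_sample(data, 1))

try:
    get_sample(data, 5)
except IndexError as err:
    print(err)
```

Chaining with `from exc` keeps the internal traceback available while the user-facing message states what went wrong.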

enhancement good first issue
colour
atomotic commented May 31, 2022

Does the pre-built binary not support databases?

roapi -t "vocabs=sqlite:///data/vocabulary.sqlite"
[2022-05-31T06:48:11Z INFO  roapi::context] loading `uri(sqlite:///data/vocabulary.sqlite)` as table `vocabs`
Error: Database error: Enable 'database' feature flag to support this

Could you explain in the README how to enable it?

I'm new to Rust; after some searching I got it workin

bug good first issue help wanted
