data-processing

🚨🚨 Feature Request

We need description, citation, license, and version meta info to be added to the dataset.

Is your feature request related to a problem?

Some datasets need this info inside them for legal reasons.

If your feature will improve `HUB`

Easy to implement, won't hurt for sure.

Description of the possible solution

Currently, we have all metadata store
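As an illustration of the request above, here is a minimal sketch of how the four fields (description, citation, license, version) could travel with a dataset. It is not Hub's API: `DatasetMeta`, `dump_meta`, and `load_meta` are hypothetical names, and storing the fields as JSON is an assumption made only for this example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DatasetMeta:
    """Hypothetical container for the four requested metadata fields."""
    description: str = ""
    citation: str = ""
    license: str = ""
    version: str = ""


def dump_meta(meta: DatasetMeta) -> str:
    # Serialize to JSON so the fields can be stored next to the data.
    return json.dumps(asdict(meta), sort_keys=True)


def load_meta(text: str) -> DatasetMeta:
    # Rebuild the metadata object from its JSON form.
    return DatasetMeta(**json.loads(text))


meta = DatasetMeta(
    description="Example dataset",
    citation="Example citation",
    license="MIT",
    version="1.0.0",
)
restored = load_meta(dump_meta(meta))
```

Keeping the fields in one immutable object makes it harder to ship a dataset with, say, a license but no citation by accident.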

In this file, the kwargs of the optimizer do not match those of the PyTorch API. This part seems to have been copied from the TF version.

Describe the bug
`pa.errors.SchemaErrors.failure_cases` only returns the first 10 failure cases.

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandera (0.6.5).
(optional) I have confirmed this bug exists on the master branch of pandera.

Note: Please read [this guide](https://matthewrocklin.c

Hello Benito,

For a specific task I need a "bitwise exclusive or"-function, but I realized xidel doesn't have one. So I created a function for that.

I was wondering if, in addition to the EXPath File Module, you'd be interested in integrating the EXPath Binary Module as well. Then I can use bin:xor() instead (although for
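For readers unfamiliar with the operation discussed above, here is a small Python sketch of a bitwise exclusive or, for integers and for byte strings. It is not the xidel function from the message and not the EXPath `bin:xor()` implementation. `xor_ints` and `xor_bytes` are hypothetical helpers, and requiring equal-length operands is an assumption of this sketch.

```python
def xor_ints(a: int, b: int) -> int:
    """Bitwise XOR of two integers: a bit is set where exactly one input has it set."""
    return a ^ b


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two byte strings (assumed here to be of equal length)."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# 0b0101 XOR 0b0011 == 0b0110
int_result = xor_ints(5, 3)
bytes_result = xor_bytes(b"\x0f\xf0", b"\xff\xff")
```

XOR is its own inverse, so applying `xor_bytes` twice with the same second operand returns the original input.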

Write unit test coverage for SafeDataset and SafeDataLoader, along with the functions in utils.py.

The exception in subject is thrown by the following code:

from datetime import date
from pysparkling.sql.session import SparkSession
from pysparkling.sql.functions import collect_set

spark = SparkSession.Builder().getOrCreate()

dataset_usage = [
    ('steven', 'UUID1', date(2019, 7, 22)),
]
dataset_usage_schema = 'id: string, datauid: string, access_date: date'

df = spa

Is your feature request related to a problem? Please describe.
To prepare medical NER detection, we need to create a reader for the BC5CDR in the BLUE Benchmark: https://github.com/ncbi-nlp/BLUE_Benchmark

Describe the solution you'd like

Develop a reader for BC5CDR
Annotate the Entity Mentions from the dataset.

Describe alternatives you've considered
A clear and concise

Here are 544 public repositories matching this topic...

lorien / awesome-web-scraping

NVIDIA / DALI

activeloopai / Hub

johnkerl / miller

asyml / texar

dashbitco / broadway

onceupon / Bash-Oneliner

python-bonobo / bonobo

microsoft / DialoGPT

TomWright / dasel

GoogleCloudPlatform / data-science-on-gcp

GoogleCloudPlatform / DataflowJavaSDK

asyml / texar-pytorch

pandera-dev / pandera

infoslack / awesome-kafka

benibela / xidel

constellation-rs / amadeus

kousun12 / eternal

msamogh / nonechucks

alttch / rapidtables

Yord / pxi

SebKrantz / collapse

maykulkarni / Machine-Learning-Notebooks

svenkreiss / pysparkling

streamnative / pulsar-flink

PytLab / VASPy

iTechArt / convtools

lithops-cloud / lithops

matousc89 / padasip

asyml / forte
