data-processing

Here are 533 public repositories matching this topic...

edogrigqv2 commented Dec 13, 2020

🚨🚨 Feature Request

We need description, citation, license, and version metadata to be added to the dataset.

Is your feature request related to a problem?

Some datasets need this info inside them for legal reasons.

If your feature will improve HUB

Easy to implement, won't hurt for sure.

Description of the possible solution

Currently, we have all metadata store

pysparkling
svaningelgem commented Jan 27, 2021

The exception named in the subject line is thrown by the following code:

from datetime import date
from pysparkling.sql.session import SparkSession
from pysparkling.sql.functions import collect_set

spark = SparkSession.Builder().getOrCreate()

dataset_usage = [
    ('steven', 'UUID1', date(2019, 7, 22)),
]
dataset_usage_schema = 'id: string, datauid: string, access_date: date'

df = spa
hunterhector commented Sep 10, 2020

Is your feature request related to a problem? Please describe.
To prepare medical NER detection, we need to create a reader for the BC5CDR in the BLUE Benchmark: https://github.com/ncbi-nlp/BLUE_Benchmark

Describe the solution you'd like

  1. Develop a reader for BC5CDR
  2. Annotate the Entity Mentions from the dataset.

Describe alternatives you've considered
A clear and concise
