Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 21,258 public repositories matching this topic...

reshamas
reshamas commented Aug 6, 2021

Describe the issue linked to the documentation

The "20 newsgroups text" dataset can be accessed within scikit-learn using defined functions. The dataset contains some text which is considered culturally insensitive.

Suggest a potential alternative/fix

Add a section in the dataset documentation, possibly above the "Recommendation" section called "Data Considerations".
https://

superset
nguyenluongky
nguyenluongky commented Aug 26, 2021

Currently, we use the Native filter on Superset version 1.2, but it looks like the actual time range is not shown correctly under SIP-15 (in SIP-15 the time range must be [inclusive, exclusive)). That means the actual time range and the tooltip must show the label as: from_date <= col < to_date.
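The half-open convention described above can be illustrated with a minimal sketch. This is not Superset code; the helper names `in_time_range` and `time_range_label` are hypothetical and only show how a [inclusive, exclusive) check and its matching label might look.

```python
from datetime import datetime


def in_time_range(value, from_date, to_date):
    """Half-open check: the start is included, the end is excluded."""
    return from_date <= value < to_date


def time_range_label(column, from_date, to_date):
    """Build a label that matches the half-open check above."""
    return f"{from_date.isoformat()} <= {column} < {to_date.isoformat()}"


# Example bounds chosen only for illustration.
start = datetime(2021, 8, 1)
end = datetime(2021, 8, 26)

label = time_range_label("col", start, end)
start_included = in_time_range(start, start, end)  # start bound is inside the range
end_included = in_time_range(end, start, end)      # end bound is outside the range
```

With this convention, two adjacent ranges that share a boundary never count the same row twice, which is why the label should use `<=` on the left and `<` on the right.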

Expected results

![image](https://user-images.githubusercontent.com/37523968/130939207-7ff847a

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
pytorch-lightning
dash
anntzer
anntzer commented Aug 26, 2021

Problem

3d axes don't support the data kwarg:

gcf().add_subplot(projection="3d").scatter("a", "b", "c", data={"a": [0], "b": [1], "c": [2]})

results in

ValueError: could not convert string to float: 'a'

Proposed solution

I think it's "mostly" a matter of adding a bunch of @_preprocess_data decorators to 3D plotting methods, similarly to what's done for 2D plots.
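To make the idea concrete, here is a simplified sketch of what a data-resolving decorator does conceptually: string arguments that are keys in `data` are replaced by the corresponding values before the wrapped function runs. This is an assumption-laden stand-in, not Matplotlib's actual `_preprocess_data` implementation; `preprocess_data_sketch` and `scatter3d` are hypothetical names.

```python
import functools


def preprocess_data_sketch(func):
    """Hypothetical stand-in for a data-kwarg decorator (not Matplotlib's own)."""

    @functools.wraps(func)
    def wrapper(*args, data=None, **kwargs):
        if data is not None:
            # Replace any string argument that names a key in `data`.
            args = tuple(
                data[a] if isinstance(a, str) and a in data else a for a in args
            )
            kwargs = {
                k: (data[v] if isinstance(v, str) and v in data else v)
                for k, v in kwargs.items()
            }
        return func(*args, **kwargs)

    return wrapper


@preprocess_data_sketch
def scatter3d(xs, ys, zs):
    """Toy 3D 'plotting' function that just returns the resolved points."""
    return list(zip(xs, ys, zs))


points = scatter3d("a", "b", "c", data={"a": [0], "b": [1], "c": [2]})
```

Without the decorator, the strings "a", "b", and "c" would reach the function unchanged, which mirrors the conversion error reported above.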

gensim
danieldeutsch
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.
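One way the requested behavior could work is to pick the opener from the file extension before reading lines. The sketch below uses only the Python standard library; it is not AllenNLP code, and `open_maybe_compressed` and `read_lines` are hypothetical helper names.

```python
import gzip
import os
import tempfile


def open_maybe_compressed(path):
    """Open a text file, transparently decompressing when it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_lines(path):
    """Return the non-blank lines of a plain or gzipped text file."""
    with open_maybe_compressed(path) as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


# Write a small gzipped JSON-lines file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "inputs.jsonl.gz")
    with gzip.open(gz_path, "wt", encoding="utf-8") as f:
        f.write('{"sentence": "first"}\n{"sentence": "second"}\n')
    lines = read_lines(gz_path)
```

Dispatching on the extension keeps the calling code unchanged for uncompressed inputs; checking the gzip magic bytes instead would be an alternative if file names are unreliable.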

nni