Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 21,476 public repositories matching this topic...

reshamas
reshamas commented Aug 6, 2021

Describe the issue linked to the documentation

The "20 newsgroups text" dataset can be accessed within scikit-learn using defined functions. The dataset contains some text which is considered culturally insensitive.

Suggest a potential alternative/fix

Add a section called "Data Considerations" to the dataset documentation, possibly above the "Recommendation" section.
https://

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
edoakes
edoakes commented Sep 8, 2021

From a slack message:

Hi, so I observed that if you deploy a deployment with more replicas than the available resources allow, Serve keeps trying to allocate them while waiting for the autoscaler.

(pid=125021) 2021-09-07 20:52:42,899    INFO http_state.py:75 -- Starting HTTP proxy with name 'pfaUeM:SERVE_CONTROLLER_ACTOR:SERVE_PROXY_ACTOR-node:192.168.1.13-0' on node 'node:192.168.1.13-0' listening on '12
pytorch-lightning
dash
MrMino
MrMino commented Sep 8, 2021

Minor, non-breaking issue found during review of #13094.

If the path of the active virtualenv is a substring of another virtualenv's path, IPython started from the second one will not emit any warning.

Example:

virtualenv aaa
virtualenv aaaa
. aaaa/bin/activate
python -m pip install ipython
. aaa/bin/activate
aaaa/bin/ipython
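
The confusion in the commands above can be illustrated with a minimal sketch. It assumes the virtualenv check is a plain string-prefix comparison. That is an assumption made for illustration, not IPython's actual code. The function names `naive_inside` and `component_inside` and the `/home/user` paths are hypothetical.

```python
from pathlib import PurePosixPath


def naive_inside(executable: str, venv: str) -> bool:
    """Prefix check on raw strings: '/x/aaaa/...' also starts with '/x/aaa'."""
    return executable.startswith(venv)


def component_inside(executable: str, venv: str) -> bool:
    """Compare whole path components, so 'aaa' never matches 'aaaa'."""
    exe_parts = PurePosixPath(executable).parts
    venv_parts = PurePosixPath(venv).parts
    return exe_parts[:len(venv_parts)] == venv_parts


# Hypothetical locations mirroring the aaa / aaaa example.
active_venv = "/home/user/aaa"
other_exe = "/home/user/aaaa/bin/python"

print(naive_inside(other_exe, active_venv))      # True: wrongly treated as inside 'aaa'
print(component_inside(other_exe, active_venv))  # False: 'aaaa' is a different directory
```

Comparing path components rather than raw strings is one way to avoid the false match. Other approaches, such as appending a path separator before the prefix test, would work as well.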

Expected behavior after executing aaaa/bin/ipython:

anntzer
anntzer commented Aug 26, 2021

Problem

3d axes don't support the data kwarg:

from matplotlib.pyplot import gcf
gcf().add_subplot(projection="3d").scatter("a", "b", "c", data={"a": [0], "b": [1], "c": [2]})

results in

ValueError: could not convert string to float: 'a'

Proposed solution

I think it's "mostly" a matter of adding a bunch of @_preprocess_data decorators to 3D plotting methods, similarly to what's done for 2D plots.
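
To make the idea concrete, here is a simplified sketch of what a data-resolving decorator does. It is a hypothetical stand-in, not matplotlib's `_preprocess_data`. The names `preprocess_data_sketch` and `scatter3d_sketch` are invented for illustration, and no plotting takes place.

```python
import functools


def preprocess_data_sketch(func):
    """Replace string arguments with data[name] when a data mapping is given."""
    @functools.wraps(func)
    def wrapper(*args, data=None, **kwargs):
        if data is not None:
            # Only strings that are keys of `data` are substituted.
            args = tuple(
                data[a] if isinstance(a, str) and a in data else a for a in args
            )
            kwargs = {
                k: (data[v] if isinstance(v, str) and v in data else v)
                for k, v in kwargs.items()
            }
        return func(*args, **kwargs)
    return wrapper


@preprocess_data_sketch
def scatter3d_sketch(xs, ys, zs):
    """Stand-in for a 3D plotting method: returns the points as float triples."""
    return list(zip(map(float, xs), map(float, ys), map(float, zs)))


points = scatter3d_sketch("a", "b", "c", data={"a": [0], "b": [1], "c": [2]})
print(points)  # [(0.0, 1.0, 2.0)]
```

Without the decorator, the strings would reach the float conversion unchanged, which is the kind of failure reported above.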

danieldeutsch
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.
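
One possible shape for the requested behavior is a line reader that picks its opener from the file extension. This is a sketch under assumptions, not AllenNLP code: the helper name `iter_lines` is hypothetical, and only the `.gz` suffix is handled.

```python
import gzip
from typing import Iterator


def iter_lines(path: str) -> Iterator[str]:
    """Yield text lines from a plain or gzip-compressed file."""
    # Assumption: compression is signalled by a '.gz' suffix.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

A caller would iterate over `iter_lines("inputs.jsonl.gz")` exactly as over an uncompressed file. The file name here is only an example.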

nni