Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 19,560 public repositories matching this topic...

jnothman commented May 12, 2021

We should be using pkg_resources (or importlib.resources if our min Python version is 3.7) instead of relying on __file__.

$ git grep '__file__' sklearn/
sklearn/__check_build/__init__.py:    local_dir = os.path.split(__file__)[0]
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    
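
As a rough illustration of the alternative proposed above, here is a minimal sketch that reads a bundled file through importlib.resources rather than through dirname(__file__). It assumes Python 3.9+ (for resources.files()); the helper name read_package_text is hypothetical, and the standard-library json package merely stands in for a package that ships data files. It is not taken from the scikit-learn code base.

```python
from importlib import resources  # resources.files() needs Python 3.9+


def read_package_text(package: str, resource: str) -> str:
    """Read a text file bundled inside an importable package.

    The resource is resolved through the import system rather than through
    os.path.dirname(__file__), so it also works for zipped packages.
    """
    return (
        resources.files(package)
        .joinpath(resource)
        .read_text(encoding="utf-8")
    )


# Illustration only: the standard-library "json" package stands in for a
# package that ships data files.
text = read_package_text("json", "__init__.py")
```

The call site names the package rather than a filesystem path, which is the property the issue is after.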
superset
suddjian commented Jun 17, 2021

While loading, cards link to an undefined page, causing a 404.

Expected results

There should not be an active link while the cards are in a loading state

Actual results

The cards attempt to link to their relevant resource, but instead link to e.g. /chart/list/undefined. If you click a card while it's loading you get a 404 page.

How to reproduce the bug

  1. Go to the

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
dash
pytorch-lightning
juergspaak commented Jun 16, 2021

Calling plt.plot(np.ones(10), np.ones((10, 0))) raises a ZeroDivisionError, which I found very confusing.

Code for reproduction

import matplotlib.pyplot as plt
import numpy as np

plt.plot(np.ones(10), np.ones((10,0)))

This raises the error:

ZeroDivisionError: integer division or modulo by zero

Expected outcome

I think however, it should either r

gensim
gojomo commented Jun 12, 2021

(triggered by SO question: https://stackoverflow.com/questions/67944732/using-my-own-stopword-list-with-gensim-corpora-textcorpus-textcorpus/67951592#67951592)

Gensim has two remove_stopwords() functions with similar but slightly different behavior, which risks confusing users.

gensim.parsing.preprocessing.remove_stopwords takes a space-delimited string, and always consults the current
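
To make the string-based variant concrete, here is a minimal pure-Python sketch of a stopword filter that takes a space-delimited string and an explicit stopword collection. The function name remove_stopwords_from_string and its signature are hypothetical; this is not Gensim's implementation, only an illustration of passing the stopword list explicitly rather than consulting a shared one.

```python
from typing import Iterable


def remove_stopwords_from_string(text: str, stopwords: Iterable[str]) -> str:
    """Drop every whitespace-delimited token that appears in `stopwords`."""
    stopword_set = frozenset(stopwords)  # O(1) membership checks
    return " ".join(tok for tok in text.split() if tok not in stopword_set)


# Hypothetical usage with a caller-supplied stopword list.
cleaned = remove_stopwords_from_string("the quick brown fox", {"the", "a"})
```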

danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.
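
One possible shape for the requested behavior, sketched with the standard library only: detect gzip input by its two-byte magic number and read lines transparently either way. The helper names open_maybe_gzipped and read_lines are hypothetical and are not part of AllenNLP. The sketch assumes UTF-8 text with one record per line.

```python
import gzip
from typing import Iterator


def open_maybe_gzipped(path: str):
    """Open `path` for text reading, decompressing if it is gzipped."""
    with open(path, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"  # gzip magic number
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def read_lines(path: str) -> Iterator[str]:
    """Yield lines (without trailing newlines) from a plain or gzipped file."""
    with open_maybe_gzipped(path) as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Checking the file contents rather than the extension means a compressed file is handled even when its name lacks a .gz suffix.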

nni