Data Science
Data science is an inter-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Here are 19,512 public repositories matching this topic...
We should be using pkg_resources (or importlib.resources if our min Python version is 3.7) instead of uses of __file__.
$ git grep '__file__' sklearn/
sklearn/__check_build/__init__.py: local_dir = os.path.split(__file__)[0]
sklearn/datasets/_base.py: module_path = dirname(__file__)
sklearn/datasets/_base.py: module_path = dirname(__file__)
sklearn/datasets/_base.py:
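As a rough illustration of the suggestion above, the following sketch reads a packaged data file through importlib.resources rather than building a path from __file__. The package name demo_pkg, the file data.txt, and the helper read_packaged_text are hypothetical and exist only for this example; they are not part of scikit-learn. The sketch uses importlib.resources.files(), which needs Python 3.9 or later; on 3.7 and 3.8 the older importlib.resources.read_text(package, resource) call plays the same role.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_packaged_text(package: str, resource: str) -> str:
    """Read a text resource shipped inside `package` without using __file__."""
    # files() returns a Traversable rooted at the package; this also works
    # when the package is imported from a zip archive, where __file__-based
    # paths would not point at a real file on disk.
    return (
        importlib.resources.files(package)
        .joinpath(resource)
        .read_text(encoding="utf-8")
    )


def demo() -> str:
    # Build a throwaway package (hypothetical name: demo_pkg) with one data file.
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("", encoding="utf-8")
        (pkg_dir / "data.txt").write_text("hello", encoding="utf-8")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return read_packaged_text("demo_pkg", "data.txt")
        finally:
            # Undo the import-path changes made for the demo.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)


if __name__ == "__main__":
    print(demo())
```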
While loading, cards link to an undefined page causing a 404.
Expected results
There should not be an active link while the cards are in a loading state
Actual results
The cards attempt to link to their relevant resource, but instead link to e.g. /chart/list/undefined. If you click a card while it's loading you get a 404 page.
How to reproduce the bug
- Go to the
Trying out a simple example using TuneSearchCV with LGBMClassifier and it fails on start.
Environment:
Python 3.8.3
tune-sklearn 0.3.0
ray 1.3.0
macOS Mojave 10.14.6
Code:
from ray.tune.sklearn import TuneSearchCV
from lightgbm import LGBMClassifier
lgmb_param_dists = dict(
boosting_type=['gbdt','dart','rf'],
num_leaves=(10,500),
Summary
The grayish background oval indicating a selected st.radio label has too much padding on the right hand side by a few pixels. Here's an example:
(Notice how the background rounded rectangle extends further to the right past "Notion" than it does to the left of the sel
The docs for IPython.core.interactiveshell.InteractiveShell.set_custom_exc have horribly mangled a warning message into a list of arguments. I can't work out at a glance why this is happening; it might be a sphinx.ext.napoleon bug, or a sphi
In recent versions (can't say from exactly when), there seems to be an off-by-one error in dcc.DatePickerRange. I set max_date_allowed = datetime.today().date(), but in the calendar, yesterday is the maximum date allowed. I see it in my apps, and it is also present in the first example on the DatePickerRange documentation page.
E
🐛 Bug
This is a fairly important bug report that I've been meaning to make for a while.
In general, it is incorrect to try to do testing with a distributed sampler. This is because the distributed sampler is either going to mix in already processed samples or drop samples in order to make the number of batches divide evenly on the number of GPUs.
This is fine when you're doing tra
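To make the duplication concrete, here is a minimal pure-Python sketch of the padding arithmetic described above: the index list is padded to a multiple of the number of replicas and then dealt out round-robin. The function shard_indices is a hypothetical stand-in written for this example, not any library's sampler, and it assumes no shuffling and that padding (rather than dropping) is used.

```python
import math
from typing import Dict, List


def shard_indices(dataset_size: int, num_replicas: int) -> Dict[int, List[int]]:
    """Pad the index list to a multiple of num_replicas, then split it by rank."""
    num_samples = math.ceil(dataset_size / num_replicas)  # samples per replica
    total_size = num_samples * num_replicas
    indices = list(range(dataset_size))
    # Padding step: reuse indices from the front so every rank receives
    # exactly num_samples items (assumes the padding is smaller than the dataset).
    indices += indices[: total_size - dataset_size]
    return {
        rank: indices[rank:total_size:num_replicas]
        for rank in range(num_replicas)
    }


# 10 test samples spread over 4 replicas: 12 slots must be filled.
shards = shard_indices(dataset_size=10, num_replicas=4)
seen = [i for shard in shards.values() for i in shard]
duplicates = sorted({i for i in seen if seen.count(i) > 1})

if __name__ == "__main__":
    for rank, shard in shards.items():
        print(rank, shard)
    print("evaluated twice:", duplicates)
```

With 10 samples and 4 replicas, two samples end up being evaluated twice, so any metric averaged over all ranks is computed on 12 items rather than 10.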
Calling plt.plot(np.ones(10), np.ones((10,0))) raises a ZeroDivisionError, which I found confusing.
Code for reproduction
import matplotlib.pyplot as plt
import numpy as np
plt.plot(np.ones(10), np.ones((10,0)))
This raises the error:
ZeroDivisionError: integer division or modulo by zero
Expected outcome
I think however, it should either r
(triggered by SO question: https://stackoverflow.com/questions/67944732/using-my-own-stopword-list-with-gensim-corpora-textcorpus-textcorpus/67951592#67951592)
Gensim has two remove_stopwords() functions with similar, but slightly different, behavior that risks confusing users.
gensim.parsing.preprocessing.remove_stopwords takes a space-delimited string, and always consults the current
Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.
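One way the requested behavior could look, sketched with the standard library only: pick the opener from the file extension so that gzipped and plain inputs are read line by line in the same way. The helpers open_maybe_compressed and read_json_lines are hypothetical names for this example and are not AllenNLP APIs; the JSON-lines input format is likewise an assumption.

```python
import gzip
import json
import os
import tempfile
from typing import Iterator, List


def open_maybe_compressed(path: str):
    """Open `path` as text, decompressing on the fly if it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, mode="rt", encoding="utf-8")
    return open(path, mode="r", encoding="utf-8")


def read_json_lines(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of the input file."""
    with open_maybe_compressed(path) as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def demo() -> List[dict]:
    # Write a small gzipped JSON-lines file, then read it back.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "inputs.jsonl.gz")
        with gzip.open(path, mode="wt", encoding="utf-8") as handle:
            handle.write('{"sentence": "a"}\n{"sentence": "b"}\n')
        return list(read_json_lines(path))


if __name__ == "__main__":
    print(demo())
```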
(e.g. for links and images), because some of these examples are now being rendered in the docs.
Added by @fchollet in requests for contributions.