Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

@fchollet

(e.g. for links and images), because some of these examples are now being rendered in the docs.

Added by @fchollet in requests for contributions.

We should be using pkg_resources (or importlib.resources if our min Python version is 3.7) instead of uses of __file__.

$ git grep '__file__' sklearn/
sklearn/__check_build/__init__.py:    local_dir = os.path.split(__file__)[0]
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:    module_path = dirname(__file__)
sklearn/datasets/_base.py:
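
As a rough illustration of the suggestion above, the sketch below reads a bundled data file through importlib.resources rather than building a path from __file__. It assumes Python 3.9 or newer, where importlib.resources.files() is available. The package name demo_pkg, the file data/iris.csv, and both helper functions are hypothetical and exist only for this example; they are not part of scikit-learn.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def read_package_text(package, *parts):
    """Read a text resource shipped inside `package` without using __file__."""
    resource = importlib.resources.files(package)
    for part in parts:
        # Walk into the package one path component at a time.
        resource = resource / part
    return resource.read_text(encoding="utf-8")


def make_demo_package(root):
    """Create a throwaway package (hypothetical, for demonstration only)."""
    pkg = Path(root) / "demo_pkg"
    (pkg / "data").mkdir(parents=True)
    (pkg / "__init__.py").write_text("", encoding="utf-8")
    (pkg / "data" / "iris.csv").write_text("a,b\n1,2\n", encoding="utf-8")
    return "demo_pkg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        name = make_demo_package(tmp)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        print(read_package_text(name, "data", "iris.csv"))
        sys.path.remove(tmp)
```

One reason to prefer this style is that the lookup goes through the import system, so it does not depend on the package being laid out as plain files next to the module.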

Keyboard navigation in the control panel of the Explore view is difficult.

Expected results

You should be able to move focus between adjacent controls in the control panel with a single Tab key press
and visually distinguish which element has focus. You should be able to interact with controls using the keyboard
(Enter or space bar for button-like things).

Actual results

Several tab

What is the problem?

After running tune.run, the experiment results are missing from progress.csv but are in result.json.
A possible solution is written by mannyv: https://discuss.ray.io/t/saving-checkpoints-with-good-custom-metric-using-tune-run/2109/12

Ray version and other system information (Python version, TensorFlow version, OS):

Ray version 1.2.0.
Tensorflow 1.15.4.
Python

Summary

The grayish background oval indicating a selected st.radio label has a few pixels too much padding on the right-hand side. Here's an example:

(Notice how the background rounded rectangle extends further to the right past "Notion" than it does to the left of the sel

Steps to reproduce

run %autocall random

Expected result

ERROR:root:Valid modes: (0->Off, 1->Smart, 2->Full

Observed result

ValueError was raised due to parsing the argument "random" as an integer.

System info

Manjaro Linux, Python 3.9.1, IPython 7.22.0.
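
The expected behaviour in the %autocall report above could be sketched roughly as follows: validate the argument and log the list of valid modes instead of letting int() raise. The function parse_autocall_mode and the VALID_MODES table are hypothetical names for this illustration and are not IPython's actual implementation.

```python
import logging

# Mode numbers and labels taken from the error message quoted in the report.
VALID_MODES = {0: "Off", 1: "Smart", 2: "Full"}


def parse_autocall_mode(arg):
    """Return the requested mode as an int, or None if `arg` is not valid.

    Hypothetical helper: it logs the valid modes rather than raising
    ValueError when the argument is not an integer.
    """
    try:
        mode = int(arg)
    except ValueError:
        mode = None
    if mode not in VALID_MODES:
        logging.error("Valid modes: (0->Off, 1->Smart, 2->Full)")
        return None
    return mode
```

With this shape, an argument such as "random" and an out-of-range number such as "3" both take the same error path.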

In recent versions (can't say from exactly when), there seems to be an off-by-one error in dcc.DatePickerRange. I set max_date_allowed = datetime.today().date(), but in the calendar, yesterday is the maximum date allowed. I see it in my apps, and it is also present in the first example on the DatePickerRange documentation page.

E

🐛 Bug

If accumulate_grad_batches is enabled, we don't call on_after_backward until we step the optimizers

https://github.com/PyTorchLightning/pytorch-lightning/blob/d209b689796719d1ab4fcc8e1c26b8b57cd348c4/pytorch_lightning/trainer/training_loop.py#L757-L763

This means on_after_backward is acting like on_before_optimizer_step.

So we should add that and always run `on_after_b
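
To make the ordering described above concrete, here is a small pure-Python simulation of a training loop with gradient accumulation. It is only a sketch of one possible reading of the proposal (on_after_backward after every backward pass, on_before_optimizer_step only when the optimizer actually steps); the function run_hooks and the event tuples are hypothetical and do not reflect PyTorch Lightning's real training loop.

```python
def run_hooks(num_batches, accumulate_grad_batches):
    """Record the order in which hooks would fire under the proposed behaviour."""
    events = []
    for batch_idx in range(num_batches):
        events.append(("backward", batch_idx))
        # Proposed: this hook fires after every backward pass.
        events.append(("on_after_backward", batch_idx))

        is_last_batch = batch_idx == num_batches - 1
        should_step = (batch_idx + 1) % accumulate_grad_batches == 0 or is_last_batch
        if should_step:
            # Proposed: this hook fires only right before the optimizer steps.
            events.append(("on_before_optimizer_step", batch_idx))
            events.append(("optimizer_step", batch_idx))
    return events
```

For example, with four batches and accumulate_grad_batches set to 2, on_after_backward is recorded four times while on_before_optimizer_step is recorded twice.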

Describe the issue

Currently, we have 0 test coverage for widgets like TextBox.
Other widgets are tested here: https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/tests/test_widgets.py

Summary
Following up from matplotlib/matplotlib#20367 (comment), there might exist more widgets that aren't tested at all, but TextBox seems likel

(triggered by SO question: https://stackoverflow.com/questions/67944732/using-my-own-stopword-list-with-gensim-corpora-textcorpus-textcorpus/67951592#67951592)

Gensim has two remove_stopwords() functions with similar, but slightly-different behavior that risks confusing users.

gensim.parsing.preprocessing.remove_stopwords takes a space-delimited string, and always consults the current

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file itself and reads lines for the Predictor, which fails when it tries to load data from my compressed files.
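
One way such a request might be handled is to pick the file opener from the file extension, so that plain and gzipped inputs yield the same lines. The helper iter_lines below is a hypothetical sketch using only the standard library; it is not AllenNLP's API, and detecting compression by the ".gz" suffix is an assumption made for brevity.

```python
import gzip
from pathlib import Path
from typing import Iterator


def iter_lines(path) -> Iterator[str]:
    """Yield non-empty lines from a plain-text or gzip-compressed file."""
    path = Path(path)
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line
```

A caller that previously iterated over open(path) could iterate over iter_lines(path) instead, leaving the rest of the line-by-line processing unchanged.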

Here are 19,401 public repositories matching this topic...

keras-team / keras

scikit-learn / scikit-learn

apache / superset

GokuMohandas / MadeWithML

CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

donnemartin / data-science-ipython-notebooks

explosion / spaCy

eriklindernoren / ML-From-Scratch

ray-project / ray

academic / awesome-datascience

streamlit / streamlit

ipython / ipython

plotly / dash

PyTorchLightning / pytorch-lightning

matplotlib / matplotlib

virgili0 / Virgilio

AMAI-GmbH / AI-Expert-Roadmap

fastai / fastbook

afshinea / stanford-cs-229-machine-learning

RaRe-Technologies / gensim

bharathgs / Awesome-pytorch-list

rasbt / python-machine-learning-book

hangtwenty / dive-into-machine-learning

eugeneyan / applied-ml

microsoft / recommenders

allenai / allennlp

d2l-ai / d2l-en

0xnr / awesome-bigdata

microsoft / nni

tflearn / tflearn

Related Topics