
Data Science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.

Here are 27,522 public repositories matching this topic...

lesteve
lesteve commented Jun 15, 2022

Below is the list of broken links in the documentation from a make linkcheck run, together with the file each link appears in and the error message.

If you want to work on this, please:

  • do one Pull Request per link
  • add a comment in this issue saying which link you want to tackle so that different people can work on this issue in parallel
  • **mention this issue (#23631) in yo
Easy Documentation good first issue Meta-issue
superset
rumbin
rumbin commented Jan 31, 2022

The Mixed Time-Series chart type allows for configuring the title of the primary and the secondary y-axis.
However, while only the title of the primary axis is shown next to the axis, the title of the secondary one is placed at the upper end of the axis where it gets hidden by bar values and zoom controls.

How to reproduce the bug

  1. Create a mixed time-series chart
  2. Configure axi
good first issue #bug validation:validated preset:cares

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Apr 3, 2022
  • Python
VeronikaPolakova
VeronikaPolakova commented Jun 15, 2022

Reporters do not apply sort_by_metric if metric is passed through tune.run.

I tried to pass the metric argument to CLIReporter; however, I got this error:

raise ValueError(
                "You passed a `metric` or `mode` argument to `tune.run()`, but "
                "the reporter you are using was already instantiated with their "
                "own `metric` and `mode
bug good first issue tune P2
asaini
asaini commented Oct 1, 2021

Problem

See #3856. Developers would like the ability to configure whether the developer menu or the viewer menu is displayed while they are developing on cloud IDEs like Gitpod or GitHub Codespaces.

Solution

Create a config option

showDeveloperMenu: true | false | auto

where

  • true: always shows the developer menu locally and while deployed
  • false: always sho
enhancement good first issue
lightning
Keiku
Keiku commented Jun 20, 2022

README.md contains an execution example for the program in mnist_examples, but mnist_examples does not exist. If the code exists in any branch or existed in the past, it should be recovered.

I wanted to fix the broken link in lightning_lite.rst together with this problem, but PR Lightning-AI/lightning#13331 has already been opened and fixes only the link.

cc @Borda @rohitgr7

good first issue docs
dash
delsuc
delsuc commented Jun 22, 2022

Bug summary

(This bug was originally reported in ipympl: matplotlib/ipympl#471)

I want to use a matplotlib.widgets.RectangleSelector selector, but it is incompatible with zoom and pan.
So I try to call widgetlock to prevent the user from using zoom when the selector is active, hoping that the zoom or pan tools will not be blocked.
This does not work; it crashes instead.

status: confirmed bug Difficulty: Medium Good first issue
ethanfurman
ethanfurman commented Apr 25, 2022

The warnings at

https://ipython.readthedocs.io/en/stable/config/extensions/autoreload.html

do not mention the issues with reloading modules with enums:

  • Enum and Flag members are compared by identity (is), even if == is used (similarly to None)
  • reloading a module, or importing the same module by a different name, creates new enums (they look the same, but are not the same)
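
The two points above can be reproduced with a minimal sketch using only the standard library. The module name colors_demo, the Color enum, and the reload_demo helper are made up for illustration; the snippet writes a throwaway module to a temporary directory, imports it, reloads it with importlib.reload, and compares the member obtained before the reload with the one obtained after.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module source, used only for this demonstration.
MODULE_SOURCE = "import enum\n\n\nclass Color(enum.Enum):\n    RED = 1\n"


def reload_demo():
    """Return (same_identity, equal, same_value) for Color.RED across a reload."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "colors_demo.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("colors_demo")
            old_red = module.Color.RED

            # Reloading re-executes the module body, which builds a new Color class.
            module = importlib.reload(module)
            new_red = module.Color.RED

            return (
                old_red is new_red,              # identity comparison
                old_red == new_red,              # Enum does not override ==, so this is identity too
                old_red.value == new_red.value,  # the underlying values still match
            )
        finally:
            # Clean up so the demo leaves no trace in the interpreter state.
            sys.path.remove(tmp)
            sys.modules.pop("colors_demo", None)


same_identity, equal, same_value = reload_demo()
print(same_identity, equal, same_value)
```

Code that holds on to a member from before the reload will therefore fail both is and == checks against the reloaded enum; comparing .value is one possible workaround, at the cost of losing the type safety that enums provide.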
Data-Science-For-Beginners
soubhikmandal2000
soubhikmandal2000 commented Oct 31, 2021
  • Base README.md
  • Quizzes
  • Introduction base README
    • Defining Data Science README
    • Defining Data Science assignment
    • Ethics README
    • Ethics assignment
    • Defining Data README
    • Defining Data assignment
    • Stats and Probability README
    • Stats and Probability assignment
  • Working with Data base README
    • Rel
good first issue help wanted translations
AnirudhDagar
AnirudhDagar commented Jan 24, 2022

Although the results look nice and ideal in all TensorFlow plots and are consistent across all frameworks, there is a small difference (more of a consistency issue). The resulting training loss/accuracy plots look like they are sampled at fewer points. The curves look straighter, smoother, and less wiggly compared to PyTorch or MXNet.

It can be clearly seen in chapter 6([CNN Lenet](ht

tensorflow-adapt-track good first issue
gensim
mpenkov
mpenkov commented Jun 22, 2021

In gensim/models/fasttext.py:

    model = FastText(
        vector_size=m.dim,
        window=m.ws,
        epochs=m.epoch,
        negative=m.neg,
        # FIXME: these next 2 lines read in unsupported FB FT modes (loss=3 softmax or loss=4 onevsall,
        # or model=3 supervi
bug difficulty easy good first issue fasttext
nni
danieldeutsch
danieldeutsch commented Jun 2, 2021

Is your feature request related to a problem? Please describe.
I typically use compressed datasets (e.g. gzipped) to save disk space. This works fine with AllenNLP during training because I can write my dataset reader to load the compressed data. However, the predict command opens the file and reads lines for the Predictor. This fails when it tries to load data from my compressed files.
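
For illustration, reading lines from a file that may or may not be gzipped could look like the sketch below. This is not AllenNLP code: open_maybe_gzipped and read_lines are hypothetical helper names, and detecting compression by the .gz suffix is an assumption (checking the file's magic bytes would be an alternative).

```python
import gzip
from pathlib import Path


def open_maybe_gzipped(path, encoding="utf-8"):
    """Open a text file for reading, decompressing it when the name ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        # "rt" makes gzip.open return decoded text lines instead of bytes.
        return gzip.open(path, "rt", encoding=encoding)
    return open(path, "r", encoding=encoding)


def read_lines(path):
    """Yield non-empty lines from a plain or gzipped text file."""
    with open_maybe_gzipped(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                yield line
```

A caller would iterate over read_lines("inputs.jsonl.gz") exactly as it would over read_lines("inputs.jsonl"), so the code that consumes the lines does not need to know whether the input was compressed.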

Good First Issue Contributions welcome Feature request