apache-spark

Apache Spark is an open source distributed general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 1,082 public repositories matching this topic...
This is to track implementation of the ML-Features: https://spark.apache.org/docs/latest/ml-features
Bucketizer has been implemented in dotnet/spark#378, but more features still need to be implemented.
- Feature Extractors
  - TF-IDF
  - Word2Vec (dotnet/spark#491)
  - CountVectorizer (https://github.com/dotnet/spark/p
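As a rough illustration of what a Bucketizer-style feature transformer does, the sketch below maps a continuous value to a bucket index given a list of split points. It is plain Python, not the Spark or dotnet/spark implementation. The function name `bucketize` is hypothetical, and the boundary convention (each bucket is half-open, except the last, which also includes its upper bound) is an assumption based on how the Spark documentation describes Bucketizer.

```python
import math
from bisect import bisect_right


def bucketize(value, splits):
    """Return the index of the bucket that contains `value`.

    Hypothetical helper for illustration only. With n+1 split points
    there are n buckets; bucket i covers [splits[i], splits[i+1]),
    and the last bucket also includes its upper bound (assumed convention).
    """
    if len(splits) < 2 or any(a >= b for a, b in zip(splits, splits[1:])):
        raise ValueError("splits must be strictly increasing with at least two points")
    if math.isnan(value):
        raise ValueError("NaN cannot be assigned to a bucket in this sketch")
    if value < splits[0] or value > splits[-1]:
        raise ValueError("value lies outside the range covered by splits")
    if value == splits[-1]:
        # The top boundary belongs to the last bucket.
        return len(splits) - 2
    # bisect_right counts the split points <= value; subtract one for the index.
    return bisect_right(splits, value) - 1


if __name__ == "__main__":
    splits = [float("-inf"), 0.0, 10.0, float("inf")]
    for v in (-5.0, 0.0, 9.9, 10.0):
        print(v, "->", bucketize(v, splits))
```

Using infinite outer split points, as in the usage above, is one way to make sure every finite value falls into some bucket.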
Currently, lakeFS registers the OpenAPI handlers and handles all specific routes.
For a call to /api/v1/test, an unknown path under the API prefix, the mux serves the request through the UI handler and returns a valid HTML (UI) page.
The expected behaviour is to return a non-2xx status code with a JSON error, preferably in the internal error format, so the developer will handle an error and not fai
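The expected behaviour can be sketched with a small routing function. This is a minimal Python illustration, not lakeFS code (lakeFS itself is written in Go). The route table, the `/api/v1/healthcheck` path, and the `{"message": ...}` error shape are all hypothetical stand-ins, since the internal error format is not shown here.

```python
import json

API_PREFIX = "/api/v1/"

# Hypothetical table of known API routes; the real handlers are not shown.
API_ROUTES = {
    "/api/v1/healthcheck": lambda: {"status": "ok"},
}

UI_PAGE = "<!doctype html><html><body>UI</body></html>"


def route(path):
    """Return (status_code, content_type, body) for a request path."""
    if path.startswith(API_PREFIX):
        handler = API_ROUTES.get(path)
        if handler is None:
            # Unknown path under the API prefix: answer with a JSON error
            # instead of falling through to the UI handler.
            body = json.dumps({"message": "unknown API path: " + path})
            return 404, "application/json", body
        return 200, "application/json", json.dumps(handler())
    # Everything outside the API prefix is served by the UI handler.
    return 200, "text/html", UI_PAGE


if __name__ == "__main__":
    for p in ("/api/v1/healthcheck", "/api/v1/test", "/repositories"):
        status, content_type, _ = route(p)
        print(p, status, content_type)
```

The point of the design is that the API prefix is checked before the catch-all UI handler, so a client calling a mistyped endpoint gets a machine-readable error rather than an HTML page with a 200 status.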
Created by Matei Zaharia
Released May 26, 2014
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia
URLs with the issue:
Description of proposal:
Document the maximum value and legal characters for log_param, log_metric and set_tag. Note that log_metric's value i