apache-spark

Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 964 public repositories matching this topic...
This is more a question than a feature request.
When parsing JSON files, I need to sanitize the field names so that "field with spaces" becomes field_with_spaces.
I want to preserve the original name as well, as metadata about the column, if you like :)
There is a metadata field on StructField, but it is internal.
Why is this internal? Is it possible or desirable to expose it?
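To make the question concrete, here is a minimal sketch of the sanitizing step in plain Python. It is not tied to any particular Spark binding: the helper names `sanitize_field_name` and `sanitize_fields` and the `original_name` metadata key are hypothetical, chosen only for illustration. It assumes that any run of non-alphanumeric characters should collapse to a single underscore.

```python
import re


def sanitize_field_name(name):
    """Collapse each run of non-alphanumeric characters into one underscore."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_")


def sanitize_fields(names):
    """Sanitize field names and remember where each one came from.

    Returns (sanitized_names, metadata), where metadata maps each sanitized
    name to a dict holding the original name under the hypothetical key
    "original_name".
    """
    sanitized = []
    metadata = {}
    for original in names:
        clean = sanitize_field_name(original)
        if not clean:
            raise ValueError(f"field {original!r} has no usable characters")
        if clean in metadata:
            # Two different originals would collide after sanitizing.
            raise ValueError(f"duplicate sanitized name {clean!r}")
        sanitized.append(clean)
        metadata[clean] = {"original_name": original}
    return sanitized, metadata


names, meta = sanitize_fields(["field with spaces", "id"])
print(names)
print(meta["field_with_spaces"]["original_name"])
```

The per-field dictionaries are the kind of value one would want to attach to a column if the metadata field were exposed. In PySpark, for comparison, `Column.alias` accepts a `metadata` keyword argument, though whether the library this question was asked about offers something similar is not stated here.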
Created by Matei Zaharia
Released May 26, 2014
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia
Willingness to contribute
The MLflow Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the MLflow code base?