big-data

Here are 2,535 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
kloczek commented Jun 9, 2021

After adding the patch that fixes #4209, I found that Sphinx emits some warnings.

+ /usr/bin/python3 setup.py build_sphinx -b man --build-dir build/sphinx
Unable to find pgen, not compiling formal grammar.
running build_sphinx
Running Sphinx v4.0.2
making output directory... done
loading intersphinx inventory from https://docs.python.org/3/objects.inv...
building [mo]: targets for 0 po

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Jul 24, 2021
  • Jupyter Notebook
cpsnowden commented Jun 3, 2021

When you change a query state filter, e.g. to show 'Finished' queries or 'User Error' queries, the Show Limit is not respected immediately, so all queries are rendered. If a re-order interval is set, the Show Limit is applied on the next query refresh. If the history contains a large number of queries, this can crash the browser on the first render.

The following two c
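
The behaviour this issue asks for can be sketched as a single function that applies the state filter and the Show Limit together, so that no render ever sees the unbounded list. This is only an illustrative Python sketch under stated assumptions: the names `Query`, `visible_queries`, `state_filter`, and `show_limit` are hypothetical and are not taken from the project's UI code.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Query:
    # Hypothetical stand-in for one entry in the query history.
    query_id: str
    state: str


def visible_queries(queries: Iterable[Query],
                    state_filter: Set[str],
                    show_limit: int) -> List[Query]:
    """Filter by state, then cap the result at show_limit.

    Doing both steps in one place means a filter change cannot
    produce an unbounded list for the first render.
    """
    matching = [q for q in queries if q.state in state_filter]
    return matching[:show_limit]


# Hypothetical history: 1000 queries alternating between two states.
history = [
    Query(f"q{i}", "FINISHED" if i % 2 == 0 else "USER_ERROR")
    for i in range(1000)
]

# Changing the filter to FINISHED still yields at most show_limit rows.
rendered = visible_queries(history, {"FINISHED"}, show_limit=100)
```

Calling the same function from both the filter-change handler and the periodic refresh would keep the two code paths consistent, which is one possible way to avoid the first-render spike described above.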

vespa
kkraune commented Apr 2, 2021

... to make it easier to read Vespa documentation on an e-reader / offline

Vespa documentation is generated using Jekyll from .md and .html files. Look into options for generating the artifact as part of site generation (there might be plugins we can use here).

jaceklaskowski commented Jun 15, 2021
  • Delta Lake 1.0.0
  • Spark 3.1.2
  • Scala 2.12
  • AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)

The following code throws a NullPointerException. It targets a directory-based Delta table that does not exist and uses a generated column.

import io.delta.tables.DeltaTable
DeltaTable.create
  .addColumn(
    DeltaTable.columnBuilder("value")
      .generatedAlwaysAs("true")
      .nullab
seut commented Jun 22, 2021

Use case:

1.) A user may want to back up all tables, but no metadata such as users or privileges, without explicitly listing each table in the CREATE SNAPSHOT statement.

2.) A user may want to transfer users and privileges, custom analyzers, or user-defined functions from one cluster to another without backing up the complete cluster, including all data (tables).

Feature description
