#

big-data

Here are 2,614 public repositories matching this topic...

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated May 13, 2021
  • Python
Bluenix2 commented Aug 7, 2021

Is your feature request related to a problem? Please describe.
Many static type checkers have trouble finding Cython's stubs.
Here is the output from running mypy on my current project:

error: Skipping analyzing "cython": found module but no type hints or library stubs

The same issue can be seen when using import Cython as cython:

error: Skipping analyzing "Cython": found module but 

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Sep 18, 2021
  • Jupyter Notebook
findepi commented Sep 6, 2021

With Hive connector

trino:default> CREATE TABLE one (a varchar);
            -> CREATE VIEW two AS SELECT * FROM one;
CREATE TABLE
CREATE VIEW

DROP TABLE is rejected on a view:

trino:default> DROP TABLE two;
Query 20210906_150832_00015_id3y3 failed: line 1:1: Table 'hive.default.two' does not exist, but a view with that name exists. Did you mean DROP VIEW hive.default.t
jovanpop-msft commented Aug 18, 2021

Could we clarify that delta-log files are JSON line-delimited files in https://github.com/delta-io/delta/blob/master/PROTOCOL.md#delta-log-entries ?

In the PROTOCOL.md file it is not clear what the JSON format is. Every delta-log entry file is a "newline-delimited JSON file", but this is not specified in the file. The protocol does not explicitly specify that every action is stored as a single-lin
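
To illustrate what "newline-delimited JSON" means in the comment above, here is a minimal Python sketch: each non-empty line is parsed as its own JSON document, rather than the whole file being parsed as one. The helper name `parse_ndjson` and the sample content are hypothetical and only for illustration; they are not taken from PROTOCOL.md or from a real table's log.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical sample content, invented for this example.
sample = (
    '{"commitInfo": {"operation": "WRITE"}}\n'
    '{"add": {"path": "part-0000.parquet", "size": 10}}\n'
)

actions = parse_ndjson(sample)

# Each parsed line is a dict with a single top-level key naming the action.
action_types = [next(iter(action)) for action in actions]
```

Passing the whole of `sample` to `json.loads` would raise a `json.JSONDecodeError`, because the file as a whole is not a single JSON document. That is the ambiguity the comment asks the protocol to spell out.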

vespa
kkraune commented Apr 2, 2021

... to make it easier to read Vespa documentation on an e-reader / offline

Vespa documentation is generated using Jekyll from .md and .html files. Look into options for generating the artifact as part of site generation (there might be plugins we can use here).

seut commented Jun 22, 2021

Use case:

1.) A user may want to back up all tables but no metadata (users, privileges, etc.) without explicitly listing each table in the CREATE SNAPSHOT statement.

2.) A user may want to transfer users and privileges, custom analyzers, or user-defined functions from one cluster to another without backing up the complete cluster, including all data (tables).

Feature description
