data-engineering

The Mixed Time-Series chart type allows configuring the title of both the primary and the secondary y-axis.
However, only the title of the primary axis is shown next to its axis. The title of the secondary axis is placed at the upper end of the axis, where it is hidden by bar values and zoom controls.

How to reproduce the bug

Create a mixed time-series chart
Configure axi

Current behavior

wait_for_flow_run currently times out after 12 hours. This value is hardcoded in watch_flow_run here.

So for subflows that run longer than 12 hours, this task will exit unexpectedly.

Proposed behavior

Make it configurable by passing a value of
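
The idea of replacing a hardcoded timeout with a parameter can be sketched as a generic polling loop. This is not Prefect's implementation: `wait_until_done`, `max_duration`, `poll_interval`, and the injectable `clock` and `sleep` arguments are hypothetical names chosen for illustration, and the 12-hour default only mirrors the value mentioned above.

```python
import time
from datetime import timedelta
from typing import Callable, Optional


def wait_until_done(
    is_done: Callable[[], bool],
    max_duration: Optional[timedelta] = timedelta(hours=12),
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_done() until it returns True or max_duration elapses.

    Returns True when the work finished and False on timeout.
    Passing max_duration=None waits indefinitely.
    """
    deadline = None
    if max_duration is not None:
        deadline = clock() + max_duration.total_seconds()

    while True:
        if is_done():
            return True
        # Give up only when a deadline exists and has been reached.
        if deadline is not None and clock() >= deadline:
            return False
        sleep(poll_interval)
```

A caller with a long-running subflow could then pass, for example, `max_duration=timedelta(hours=48)` or `None`. The clock and sleep functions are parameters only so the loop can be exercised without real waiting.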

Tasks:

Port the content from GH readme to Docusaurus (main Docs website)
Incorporate relevant CLI content into the Getting Started with Airbyte OSS guide
Identify other places in Docs where we can incorporate CLI content

Describe the bug
Data Docs columns shrink to a width of one character when the query is long.

To Reproduce
Steps to reproduce the behavior:

make a batch from a long query string
run validation
render result to data docs
See screenshot
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4

Under the hood, the Benthos csv input uses the csv.Reader struct from the standard encoding/csv package.

The current implementation of csv input doesn't allow setting the LazyQuotes field.

We have a use case where we need to set the LazyQuotes field in order to make things work correctly.

Expected Behavior

Feast should allow users to create feature views with .csv data sources and retrieve features from offline store without any issues.

Current Behavior

At present, I have a .csv file in an S3 bucket and can create a feature view from it, but fetching features from the offline store fails with the error below

-------------------------

When there are not enough results, we tell the user that the experiment just started, so come back later. When the experiment dates are set to a future time, this language doesn't fit very well. We should adjust the language to take this future state into account when figuring out the message.

<img width="875" alt="CleanShot 2022-04-10 at 21 23 22@2x" src="https://user-images.githubusercontent

On more advanced versions of LakeFS (probably >= v1.0.0), we would like to remove the logic that tries to fill the generation field in the DB when loading old dumps. This means we will no longer support loading dumps made with a version lower than v0.61.0.

When expanding the left bar in the docs, the bottom sections are hidden - even if you scroll down, you won't see the Community section:

(1) Add docstrings to methods
(2) Convert .format() calls to f-strings for readability
(3) Make sure we are using Python 3.8 throughout
(4) zip extract_all() in ingest_flights.py can be simplified with a Path parameter
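
Items (2) and (4) can be illustrated with a minimal sketch. The contents of ingest_flights.py are not shown here, so `describe` and `extract_zip` are hypothetical helpers rather than the repository's functions. The only library behavior relied on is that `zipfile.ZipFile.extractall` accepts a path-like object for its `path` argument. Type hints use `typing.List` so the code also runs on Python 3.8, per item (3).

```python
import zipfile
from pathlib import Path
from typing import List


def describe(year: int, month: int) -> str:
    # Before: "Ingesting {}-{:02d}".format(year, month)
    # After: the same formatting expressed as an f-string.
    return f"Ingesting {year}-{month:02d}"


def extract_zip(zip_path: Path, dest_dir: Path) -> List[Path]:
    """Extract every member of zip_path into dest_dir.

    Returns the paths of the extracted members inside dest_dir.
    """
    with zipfile.ZipFile(zip_path) as zf:
        # extractall takes a Path directly; no str() or os.path.join needed.
        zf.extractall(path=dest_dir)
        return [dest_dir / name for name in zf.namelist()]
```

Passing a `Path` through unchanged keeps the calling code free of manual string joins when building the destination directory.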

Let's prepare a mixin for interacting with Roles and Policies with the Python client, in case users want to use the API directly.

Include not only list, get, etc., but also utility methods, such as updating a default role. It should wrap the following logic:

import requests
import json

# Get the ID
data_consumer = requests.get("http://localhost:8585/api/v1/roles/name/DataCo

Background

This thread is born out of the discussion in #968, in an effort to make the documentation more beginner-friendly and easier to understand.
One of the subtasks mentioned in that thread was to go through the function docstrings and include a minimal working example in each of the public functions in pyjanitor.

Criteria reiterated here for the benefit of discussion:

It sh

If they are not class methods, then the method would be invoked for every test, and a session would be created for each of those tests.

class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp

Here are 1,252 public repositories matching this topic...

apache / superset

eugeneyan / applied-ml

andkret / Cookbook

datastacktv / data-engineer-roadmap

PrefectHQ / prefect

airbytehq / airbyte

great-expectations / great_expectations

dagster-io / dagster

benthosdev / benthos

DataTalksClub / data-engineering-zoomcamp

feast-dev / feast

growthbook / growthbook

awslabs / aws-data-wrangler

treeverse / lakeFS

kestra-io / kestra

ploomber / ploomber

adilkhash / Data-Engineering-HowTo

kantord / just-dashboard

metarank / metarank

quiltdata / quilt

benthecoder / yt-channels-DS-AI-ML-CS

GoogleCloudPlatform / data-science-on-gcp

san089 / goodreads_etl_pipeline

open-metadata / OpenMetadata

pyjanitor-devs / pyjanitor

AlexIoannides / pyspark-example-project

abhishek-ch / around-dataengineering

sodadata / soda-core

san089 / Udacity-Data-Engineering-Projects

oleg-agapov / data-engineering-book
