data-engineering

The Mixed Time-Series chart type allows for configuring the title of the primary and the secondary y-axis.
However, only the title of the primary axis is shown next to its axis. The title of the secondary axis is placed at the upper end of the axis, where it is hidden by bar values and zoom controls.

How to reproduce the bug

Create a mixed time-series chart
Configure axi

Hello!

I've found an issue here: [Bitbucket Storage](https://docs.prefect.io/api/latest/storage.html#github) is a storage option that uploads flows to a Bitbucket repository as .py files.

Page reference: https://docs.prefect.io/orchestration/flow_config/storage.html#bitbucket

First, the link is incorrect. Second, should the line read more like the GitHub one, where it references the repository

Tasks:

Port the content from GH readme to Docusaurus (main Docs website)
Incorporate relevant CLI content into the Getting Started with Airbyte OSS guide
Identify other places in Docs where we can incorporate CLI content

Describe the bug
Data Docs columns shrink to a width of one character when the query is long.

To Reproduce
Steps to reproduce the behavior:

make a batch from a long query string
run validation
render result to data docs
See the screenshot "Data_documentation_compiled_by_Great_Expectations".

Is your feature request related to a problem? Please describe.
The current Feast online store for GCP implementation requires Firestore in Datastore mode. Firestore can only be in one mode at a time per GCP account. You cannot use native mode for some applications and Datastore mode for others within the same account. Adding a feature store to an existing GCP account that uses native mode wou

Currently, when you sort the list of features (or experiments), the sort order is only kept for that session and is not saved. We would like to persist this sort state to local storage.

webui/src/lib/api/index.js

I added a small tutorial (triggered with python -m ploomber.onboard) but it doesn't have tests yet

(1) Add docstrings to methods
(2) Convert .format() methods to f-strings for readability
(3) Make sure we are using Python 3.8 throughout
(4) zip extract_all() in ingest_flights.py can be simplified with a Path parameter
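
Items (2) and (4) could look roughly like the sketch below. It is not taken from ingest_flights.py: the function name extract_archive and its parameters are hypothetical, and it assumes the existing helper wraps the standard library's zipfile module. ZipFile.extractall() accepts a path-like object, so a Path can be passed directly, and typing.List keeps the annotation valid on Python 3.8.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_archive(zip_path: Path, dest_dir: Path) -> List[str]:
    """Extract every member of zip_path into dest_dir and return the member names.

    Hypothetical helper for illustration; not the project's actual function.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # extractall() takes a path-like object, so no str() conversion is needed.
        archive.extractall(path=dest_dir)
        names = archive.namelist()
    # An f-string replaces the older "...".format(...) call.
    print(f"Extracted {len(names)} file(s) from {zip_path.name} to {dest_dir}")
    return names
```

Passing Path objects end to end avoids manual string joining when building the destination directory.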

Let's prepare a mixin for interacting with Roles and Policies with the Python client, in case users want to use the API directly.

Do not only provide list, get, etc., but also utility methods, such as updating a default role. It should wrap the following logic:

import requests
import json

# Get the ID
data_consumer = requests.get("http://localhost:8585/api/v1/roles/name/DataCo

Hi,

I am using some basic functions from pyjanitor, such as clean_names() and collapse_levels(), in code that I want to productionise.
There are limits on the size of the production code base.
Currently, the requirements.txt for pyjanitor alone is huge.
I don't think I need all of those dependencies in my code.
How can I remove the unnecessary ones?

If they are not class methods, then the method would be invoked for every test and a session would be created for each of those tests.

class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
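
The point about class methods can be illustrated without Spark. The following is a minimal sketch under stated assumptions: SharedResourceTest, sessions_created, and run_suite are hypothetical names, and a plain object() stands in for a real SparkSession. It relies only on the standard unittest behaviour that setUpClass runs once per test class, whereas setUp would run before every test method.

```python
import unittest


class SharedResourceTest(unittest.TestCase):
    """Stand-in for a test case that shares one expensive session."""

    sessions_created = 0  # counts how often the placeholder session is built

    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, not once per test method.
        cls.sessions_created += 1
        cls.session = object()  # placeholder for a real SparkSession

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def run_suite() -> unittest.TestResult:
    """Run both tests and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SharedResourceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

After run_suite() completes, sessions_created is 1 even though two tests ran. Moving the same logic into an instance-level setUp would build the session once per test.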

Here are 1,294 public repositories matching this topic...

apache / superset

eugeneyan / applied-ml

andkret / Cookbook

datastacktv / data-engineer-roadmap

PrefectHQ / prefect

airbytehq / airbyte

great-expectations / great_expectations

dagster-io / dagster

DataTalksClub / data-engineering-zoomcamp

benthosdev / benthos

feast-dev / feast

growthbook / growthbook

awslabs / aws-data-wrangler

treeverse / lakeFS

kestra-io / kestra

ploomber / ploomber

adilkhash / Data-Engineering-HowTo

kantord / just-dashboard

metarank / metarank

quiltdata / quilt

benthecoder / yt-channels-DS-AI-ML-CS

GoogleCloudPlatform / data-science-on-gcp

open-metadata / OpenMetadata

san089 / goodreads_etl_pipeline

pyjanitor-devs / pyjanitor

AlexIoannides / pyspark-example-project

abhishek-ch / around-dataengineering

sodadata / soda-core

san089 / Udacity-Data-Engineering-Projects

oleg-agapov / data-engineering-book
