data-engineering
Here are 984 public repositories matching this topic...
Opened from the Prefect Public Slack Community
michael.ball: Hey there. I’ve been playing around with Docker storage today, trying to get all source code packaged together with the flows each time they are registered, and am using the files and env_vars attributes as outlined in the Docs. But it seems that my .dockerignore file (in the directory from whic
Describe the bug
data docs columns shrink to 1 character width with long query
To Reproduce
Steps to reproduce the behavior:
- make a batch from a long query string
- run validation
- render result to data docs
- See screenshot
<img width="1525" alt="Data_documentation_compiled_by_Great_Expectations" src="https://user-images.githubusercontent.com/928247/103230647-30eca500-4
Tell us about the problem you're trying to solve
We can probably reduce the Docker image size of our Java-based connectors by using the ADD command instead of COPYing the tar archive. See this PR for an example.
Describe the solution you’d like
Use the ADD command to reduce the size of the Docker images.
Expected Behavior
Feature views should have the creation time (i.e., created_timestamp) at the first feast apply
Current Behavior
Feature views do not have a creation time at feature view creation
Steps to reproduce
feast init fs
cd fs
feast apply
feast registry-dump
{
"spec": {
"name": "driver_id",
"valueType": "INT64",
"description": "driver
Steps to reproduce:
- From the UI, create a repository.
- Upload a file.
- From the uncommitted tab, commit the change.
- From the Objects tab, click the "branch: main" drop-down.
- Click the arrow on the right.
- Select the first commit with the "Repository created" message.
Result: the "get started" screen appears.
Expected: screen should be empty, because this is a past commit.
If they are not class methods, the method is invoked for every test and a session is created for each of those tests.
`class PySparkTest(unittest.TestCase):
    @classmethod
    def suppress_py4j_logging(cls):
        logger = logging.getLogger('py4j')
        logger.setLevel(logging.WARN)

    @classmethod
    def create_testing_pyspark_session(cls):
        return Sp
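The point about class methods can be sketched with plain unittest, no Spark required. This is only an illustration: FakeSession is a hypothetical stand-in for an expensive resource such as a Spark session, and sessions_created is a helper invented here to count how many get built.

```python
import unittest


class FakeSession:
    """Hypothetical stand-in for an expensive resource (e.g. a Spark session)."""
    created = 0

    def __init__(self):
        FakeSession.created += 1


class SharedSessionTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, so both tests share one session.
        cls.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


class PerTestSessionTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test builds its own session.
        self.session = FakeSession()

    def test_first(self):
        self.assertIsNotNone(self.session)

    def test_second(self):
        self.assertIsNotNone(self.session)


def sessions_created(test_case):
    """Run one TestCase class and return how many FakeSession objects it built."""
    FakeSession.created = 0
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    suite.run(unittest.TestResult())
    return FakeSession.created
```

With two tests per class, the class-level fixture builds one session and the per-test fixture builds two, which is the cost the comment above is warning about.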
Hi,
I am using some basic functions from pyjanitor, such as clean_names() and collapse_levels(), in code that I want to productionise.
There are limitations on the size of the production code base.
Currently, if I look at the requirements.txt for just "pyjanitor", it's huge.
I don't think I require all the dependencies in my code.
How can I remove the unnecessary ones?
The load_dotted_path function raises the following error if it is unable to load the module:
Traceback (most recent call last):
  File "/Users/Edu/Desktop/import-error/script.py", line 4, in <module>
    load_dotted_path('tests.quality.fn')
  File "/Users/Edu/dev/ploomber/src/ploomber/util/dotted_path.py", line 128, in load_dotted_path
    module = importlib.import_module(mod)
  File "/Users/
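For readers unfamiliar with dotted-path loading, here is a minimal sketch of the general idea. It is not ploomber's implementation: load_dotted_path_sketch is a hypothetical helper, and since the traceback above is cut off, the choice to catch ModuleNotFoundError and re-raise it with a shorter message is an assumption about one way such a failure could be reported.

```python
import importlib


def load_dotted_path_sketch(dotted_path):
    """Hypothetical helper: import 'pkg.module.attr' and return attr."""
    # Split "pkg.module.attr" into the module part and the final attribute.
    module_name, _, attribute = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {dotted_path!r}")
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Assumption: re-raise with a message that names the dotted path,
        # keeping the original exception chained for debugging.
        raise ModuleNotFoundError(
            f"could not import module {module_name!r} "
            f"while loading {dotted_path!r}"
        ) from exc
    return getattr(module, attribute)
```

For example, load_dotted_path_sketch("math.sqrt") returns the standard library's math.sqrt, while a path whose module does not exist raises ModuleNotFoundError with the dotted path in the message.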
When using Ubuntu out of the box ('ootb'), both natively and within Windows WSL2, the asset consumer FVT tends to fail with:
[INFO] --- maven-compiler-plugin:3.8.1:compile (default-compile) @ asset-consumer-fvt ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 7 source files to /home/nigel/src/egeria/open-metadata-test/open-metadata-fvt/access-services-fvt/asset-consumer-fvt/tar
Currently, the funnel report percentage is calculated using:
The number at a given funnel step / Sum(everything in the funnel)
Example from blog:

Here, the Discussed Pricing (900) gets divided by 11900 (sum of all ev
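The current calculation described above can be sketched as follows. Only the Discussed Pricing count (900) and the total (11900) come from the example; the other step names and counts are invented so that the sum works out, and funnel_share_of_total is a hypothetical name.

```python
def funnel_share_of_total(step_counts):
    """Return each step's count divided by the sum of all step counts."""
    total = sum(step_counts.values())
    if total == 0:
        raise ValueError("funnel has no events")
    return {step: count / total for step, count in step_counts.items()}


# Only "Discussed Pricing" (900) and the total (11900) are from the example;
# the remaining steps are hypothetical values chosen to sum to 11900.
steps = {
    "Visited Site": 6000,      # hypothetical
    "Signed Up": 3500,         # hypothetical
    "Requested Demo": 1500,    # hypothetical
    "Discussed Pricing": 900,  # from the example
}

shares = funnel_share_of_total(steps)
pricing_percent = round(shares["Discussed Pricing"] * 100, 2)  # 900 / 11900
```

Under this formula Discussed Pricing comes out at roughly 7.56%, because the denominator is the sum of every step in the funnel.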