feature-engineering
Here are 1,225 public repositories matching this topic...
When on-demand feature views are specified at retrieval time (e.g. get_X_features), the output feature vectors include extra values, such as request data or dependent feature vectors, even if the user did not request those features.
Expected Behavior
Non-specified dependent feature values are not returned in output
Current Behavior
Non-specified dependent feature values are in output
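The expected behavior can be sketched as a filter applied after retrieval. This is only an illustration, not the feature store's actual code: `filter_requested_features` and all feature names below are hypothetical, and the response is assumed to be a plain dict mapping feature names to lists of values.

```python
from typing import Any, Dict, Iterable, List


def filter_requested_features(
    response: Dict[str, List[Any]],
    requested: Iterable[str],
) -> Dict[str, List[Any]]:
    """Keep only the features the caller explicitly asked for.

    Hypothetical helper: request data and dependent features that were
    fetched only to compute an on-demand feature are dropped.
    """
    requested_set = set(requested)
    missing = requested_set - response.keys()
    if missing:
        raise KeyError(f"Requested features not in response: {sorted(missing)}")
    # Preserve the order in which features appear in the response.
    return {name: values for name, values in response.items() if name in requested_set}


# Hypothetical raw output: the on-demand feature plus the inputs it depends on.
raw_output = {
    "driver_id": [1001],
    "val_to_add": [5],             # request data (not requested by the user)
    "conv_rate": [0.25],           # dependent feature (not requested by the user)
    "conv_rate_plus_val": [5.25],  # on-demand feature the user asked for
}

filtered = filter_requested_features(raw_output, ["driver_id", "conv_rate_plus_val"])
print(filtered)
```

Raising on a missing name, rather than silently skipping it, is a design choice made for this sketch so that typos in requested feature names surface early.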
Steps to reprodu
Currently we use the default Spark catalog to load tables from the Hive metastore.
We should test and use a non-default Spark catalog for this, and make sure all user tables can be loaded for the OpenMLDB session.
Problem
Some of our transformers & estimators are not thoroughly tested or not tested at all.
Solution
Use the OpTransformerSpec and OpEstimatorSpec base test specs to provide tests for all existing transformers & estimators.
Several evaluation metrics would be particularly beneficial for (binary) imbalanced classification problems and would be greatly appreciated additions. I will rank-order these by implementation priority (and likely ease of implementation):
- AUCPR - helpful in the event that class labels are needed and the positive class is of greater importance.
- **F2 Scor
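For reference, the two metrics named above can be sketched in plain Python. This is a minimal illustration and not the library's implementation: `f_beta` and `average_precision` are hypothetical helper names, labels are assumed to be 0/1, scores are assumed to have no ties, and average precision is used here as one common step-wise estimate of the area under the precision-recall curve.

```python
from typing import Sequence


def f_beta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 2.0) -> float:
    """F-beta score for binary labels; beta=2 weights recall above precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision: mean of the precision values at each positive's rank.

    Assumes distinct scores; tied scores would need to be grouped.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    # Rank examples from highest to lowest score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    tp = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at this recall step
    return total / n_pos


# Toy imbalanced example: 3 positives out of 8.
y_true = [1, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0, 0, 1]
y_score = [0.9, 0.2, 0.4, 0.6, 0.1, 0.3, 0.05, 0.8]

f2 = f_beta(y_true, y_pred, beta=2.0)
ap = average_precision(y_true, y_score)
print(f"F2 = {f2:.4f}, AP = {ap:.4f}")
```

In practice, scikit-learn's `fbeta_score` and `average_precision_score` cover the same ground and handle tied scores.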
At the moment, the categorical tree encoder and the tree discretiser take an is_regression argument, which the user must set to indicate whether they are performing classification or regression.
Sklearn automates this with is_classification (see the decision tree source code).
Can we bring this functionality to feature-engine?
I think we can :p
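One possible direction, sketched under stated assumptions: infer the task from the target values instead of asking the user. `infer_is_regression` is a hypothetical helper, not part of feature-engine or scikit-learn, and the rule it applies (any float with a fractional part means regression) is a simplification, loosely similar in spirit to scikit-learn's `type_of_target` utility.

```python
import math
from typing import Any, Sequence


def infer_is_regression(y: Sequence[Any]) -> bool:
    """Guess whether a target is continuous (regression) or discrete (classification).

    Hypothetical heuristic: a float with a fractional part signals regression.
    Ints, bools, strings, and whole-number floats are treated as class labels.
    """
    for value in y:
        if isinstance(value, float) and not math.isnan(value) and not value.is_integer():
            return True
    return False


print(infer_is_regression([0, 1, 1, 0]))        # integer labels
print(infer_is_regression(["cat", "dog"]))      # string labels
print(infer_is_regression([1.5, 2.0, 3.7]))     # continuous values
```

A heuristic like this will misclassify integer-valued regression targets (counts, for example), so an explicit override argument would probably still be needed.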
While reviewing the docs, I found this under the AutoML User Guide:
We should figure out a way to deal with this kind of thing. I think a couple of options here are:
- Modifying the cell to only show the first few keys or so of the output.
- Modifying the output cell so that
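The first option could look roughly like the sketch below. `preview_dict` and the `results` dictionary are hypothetical stand-ins for the long notebook output; nothing here is taken from the actual docs.

```python
from itertools import islice
from typing import Any, Dict


def preview_dict(data: Dict[str, Any], max_items: int = 3) -> str:
    """Render only the first few items of a dict, noting how many were hidden."""
    shown = dict(islice(data.items(), max_items))
    hidden = len(data) - len(shown)
    suffix = f" ... ({hidden} more keys)" if hidden > 0 else ""
    return f"{shown}{suffix}"


# Hypothetical long output, standing in for the real notebook cell result.
results = {f"pipeline_{i}": {"score": round(0.9 - 0.01 * i, 2)} for i in range(50)}

preview = preview_dict(results)
print(preview)
```

Because dicts preserve insertion order, the preview shows the keys in the order the cell produced them.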