ENH Monotonic Constraints for Tree-based models #13649
base: main
Conversation
DecisionTrees RandomForests GradientBoosting
Very interested in this, please ping me when you have something!
I added a few tests to check that the implementation actually enforces the monotonicity constraints, but they are failing...
Two issues:
@samronsin LMK when you need reviews. I've implemented monotonic constraints for the hist-GBDTs in #15582. It's still under review, so your comments are more than welcome there. I think you'll also find some tests worth using; they helped me a great deal in debugging my code.
Yes. Maybe this could be done by just reversing the constraint? Not sure.
You don't seem to be constraining the values of the children based on the average value of a pair of siblings. You'll need to introduce lower and upper bounds on the node values for monotonic constraints to be properly enforced (see https://github.com/scikit-learn/scikit-learn/pull/15582/files#diff-9bd2ee07fb6817a0aee54f959d32de8aR139).
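For context, here is a minimal sketch of the bound propagation idea described above, assuming a monotonic increase constraint on the split feature; the function and variable names are illustrative and do not come from either PR:

def propagate_bounds(parent_lower, parent_upper, left_value, right_value):
    # Midpoint of the two sibling values: with an increase constraint, the
    # left (lower) child must stay below it and the right (upper) child above
    # it, and both children also inherit the bounds of their parent.
    mid = (left_value + right_value) / 2
    left_bounds = (parent_lower, mid)
    right_bounds = (mid, parent_upper)
    return left_bounds, right_bounds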
Some tests aren't passing for reasons given in this thread: https://github.com/scikit-learn/scikit-learn/pull/13649/files#r973630373 Olivier and I agreed that this support can be postponed and integrated in another pull request. Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Prefer:
- "monotonicity constraints" to refer to all of the constraints (encoded by -1, 0, 1), which replaces "monotonic constraints"
- "monotonic increase constraints" (encoded by 1), which replaces "positive constraints", "monotonically increasing" and "monotonically increase"
- "monotonic decrease constraints" (encoded by -1), which replaces "negative constraints", "monotonically decreasing" and "monotonically decrease"

Note: "monotonically {de,in}creasing" is valid when using continuous tenses but was improperly used before.
See: scikit-learn#13649 (comment)
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
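As an illustration of this encoding (one value per feature; the assignment below is only an example):

monotonic_cst = [1, -1, 0]  # feature 0: increase, feature 1: decrease, feature 2: unconstrained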
Note that I have just
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…!= 0" This reverts commit bfe94af.
This does not reallocate placeholder null'ed numpy arrays, and performs operations only when needed. Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Also simplify the logic and its comment. Note: we could factor this duplicated logic into a private function or a private method, but we would need to use pointers to modify variables on the program stack because we can't simply return several values. This factorization might be more complex than the current duplication, hence we propose leaving it as is. See: https://github.com/scikit-learn/scikit-learn/pull/13649/files#r973633867 Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
I think there are just two tasks to perform before merging this PR:
1. testing assert_1d_reg_tree_children_monotonic_bounded and assert_nd_reg_tree_children_monotonic_bounded on cases violating constraints
2. checking for any significant performance regression using the ASV benchmarking suite
@ogrisel: I am doing 2.; are you interested in doing 1. following your proposal in https://github.com/scikit-learn/scikit-learn/pull/13649/files#r1006829892?
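As an illustration of task 1., here is a sketch of the kind of end-to-end monotonicity check such a test could include; it assumes the monotonic_cst parameter added by this PR and does not use the PR's helper functions, whose exact signatures are not shown in this thread:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-5, 5, size=(100, 1))
y = X.ravel() ** 3 + rng.normal(scale=1.0, size=100)

# Fit with a monotonic increase constraint on the single feature.
reg = DecisionTreeRegressor(monotonic_cst=[1], random_state=0).fit(X, y)

# Predictions evaluated on sorted inputs must be non-decreasing.
X_test = np.linspace(-6, 6, 200).reshape(-1, 1)
pred = reg.predict(X_test)
assert np.all(np.diff(pred) >= 0)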
No performance regression observed with asv. Full asv report
Working on a test (that fails). Here is a reproducer:
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import HistGradientBoostingRegressor
import matplotlib.pyplot as plt
X = np.linspace(-5, 5, 5).reshape(-1, 1)
y = X.ravel() ** 3
X_test = np.linspace(-6, 6, 100).reshape(-1, 1)
reg = DecisionTreeRegressor(max_depth=None, random_state=0).fit(X, y)
plt.plot(X_test.ravel(), reg.predict(X_test))

The plot looks good, and is naturally monotonic as expected. Let's add the monotonicity constraint:
reg = DecisionTreeRegressor(max_depth=None, random_state=0, monotonic_cst=[1]).fit(X, y)
plt.plot(X_test.ravel(), reg.predict(X_test))

This seems overly constrained! The previous decision function should be admissible on this training set. Let's see if we have the same problem with the implementation in the hist gradient boosting trees:
reg = HistGradientBoostingRegressor(min_samples_leaf=1, max_iter=1, random_state=0).fit(
X, y
)
plt.plot(X_test.ravel(), reg.predict(X_test))

All good. Let's add the constraint:

reg = HistGradientBoostingRegressor(
min_samples_leaf=1, max_iter=1, monotonic_cst=[1], random_state=0
).fit(X, y)
plt.plot(X_test.ravel(), reg.predict(X_test))

I think the bound propagation logic of this PR is too restrictive, as discovered in the failing test I just pushed. I need more time to investigate how the 2 implementations differ.
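To quantify the difference, here is a small follow-up sketch reusing X, y, X_test and the imports from the reproducer above (this is not part of the original comment):

# Fit both constrained models and compare their predictions on the test grid.
pred_tree = DecisionTreeRegressor(
    max_depth=None, monotonic_cst=[1], random_state=0
).fit(X, y).predict(X_test)
pred_hgb = HistGradientBoostingRegressor(
    min_samples_leaf=1, max_iter=1, monotonic_cst=[1], random_state=0
).fit(X, y).predict(X_test)

print("tree predictions monotonic:", np.all(np.diff(pred_tree) >= 0))
print("hgb predictions monotonic: ", np.all(np.diff(pred_hgb) >= 0))
print("max absolute difference:   ", np.max(np.abs(pred_tree - pred_hgb)))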
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Thanks a lot @jjerphan and @ogrisel for all the work on this PR! Regarding your latest findings about the overly constrained implementation, @ogrisel: what is currently implemented guarantees that the trees are monotonic, i.e. it is a sufficient condition for the monotonicity of the trees. However, this condition is not necessary, as your example shows, so it is expected that some valid trees are discarded. I don't think that a necessary and sufficient condition can be implemented while reusing the current structures of the tree building, as argued here: #13649 (comment). I hope to have a bit of time in the coming weeks to think about it!
I agree with your analysis that the constraints implemented for trees are not guaranteed to be minimal to ensure monotonicity, but I have the feeling that on such toy 1d problems we could do better (as is the case in hist-gradient boosted trees). I started to investigate with the following debug print statements. For the trees in this PR, with the code above, I get (both with depth-first and best-first):
while for the HGBRT I get:
The meaning of the discretized thresholds for the HGBRT can be recovered with:

>>> reg._bin_mapper.bin_thresholds_[0]
array([-3.75, -1.25, 1.25, 3.75])

So in conclusion, the first 2 splits are identical and happen in the same order, but the computed bounds are vastly different. But it's true that they do not predict the same targets. I think a deeper investigation (logging) of when the constraints hold or not for each kind of model is needed. However, I need to work on something else today, so I will stop here for now. Feel free to have a look. In the meantime, I will remove the 1.2 milestone because I am not sure we can resolve this in time and I don't want to delay the release process.
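The debug output referenced above is not reproduced here; as a standalone illustration of how the per-node splits and values of the constrained tree can be inspected (not the actual debug statements used above), one can walk the public tree_ attribute:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.linspace(-5, 5, 5).reshape(-1, 1)
y = X.ravel() ** 3
reg = DecisionTreeRegressor(monotonic_cst=[1], random_state=0).fit(X, y)

tree = reg.tree_
for node_id in range(tree.node_count):
    if tree.children_left[node_id] != tree.children_right[node_id]:
        # Internal node: report the split feature and threshold.
        print(f"node {node_id}: split on feature {tree.feature[node_id]} "
              f"at threshold {tree.threshold[node_id]:.3f}")
    else:
        # Leaf node: report the predicted value.
        print(f"node {node_id}: leaf value {tree.value[node_id].ravel()[0]:.3f}")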
Looks like it was not resolved indeed :) Moving it to 1.3
Continuation of PR #7266 addressing issue #6656.
TODO:
- implement logic for new trees in [MRG+2] Faster Gradient Boosting Decision Trees with binned features #12807 (already done by [MRG] Monotonic constraints for GBDT #15582)
- decide on the API (separate increasing and decreasing parameters vs a single array) => let's go with the monotonic_cst array
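For illustration, a minimal sketch of how the chosen monotonic_cst array reads in practice, assuming the parameter as proposed in this PR (the data and constraint values below are examples only):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 3))
# Target increases with feature 0, decreases with feature 1 and ignores feature 2.
y = 2 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=0.1, size=200)

# One entry per feature: 1 = monotonic increase, -1 = monotonic decrease, 0 = unconstrained.
reg = DecisionTreeRegressor(monotonic_cst=[1, -1, 0], random_state=0).fit(X, y)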