Fix OpWorkflowModelLocalTest due to flaky XGBoost training #494

TuanNguyen27 · 2020-07-20T16:04:09Z

Source of flakiness: the default BinaryClassificationModelSelector.withTrainValidationSplit sometimes produces a training set that contains only positive or only negative labels, which causes XGBoost training to fail with the error below.

We address this flakiness by fixing the seed in the DataSplitter used by withTrainValidationSplit, so the same train-test split is produced every time the test runs.

ml.dmlc.xgboost4j.java.XGBoostError: [16:55:13] /xgboost/src/metric/rank_metric.cc:515: Check failed: !auc_error: AUC-PR: the dataset only contains pos or neg samples
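
To illustrate the idea behind the fix, here is a minimal sketch in plain Scala. It does not use TransmogrifAI: `split` and `hasBothClasses` are hypothetical stand-ins (assumptions for illustration, not the library's DataSplitter), and the label data is made up. It shows that a fixed seed reproduces the same split on every run, and that on a small, imbalanced dataset some seeds leave the training set with a single class, which is the condition the XGBoost error above complains about.

```scala
import scala.util.Random

object SeededSplitSketch {
  // Hypothetical stand-in for a train/validation splitter.
  // This is NOT TransmogrifAI's DataSplitter; it only mimics a seeded random split.
  def split(labels: Seq[Int], trainRatio: Double, seed: Long): (Seq[Int], Seq[Int]) = {
    val shuffled = new Random(seed).shuffle(labels)
    val nTrain = math.round(labels.size * trainRatio).toInt
    shuffled.splitAt(nTrain)
  }

  // A binary metric such as AUC-PR needs both classes to be present.
  def hasBothClasses(labels: Seq[Int]): Boolean = labels.distinct.size == 2

  def main(args: Array[String]): Unit = {
    // Made-up, imbalanced labels: six positives and two negatives.
    val labels = Seq(1, 1, 1, 1, 1, 1, 0, 0)

    // Same seed, same split: the test becomes deterministic.
    val (trainA, _) = split(labels, 0.75, seed = 42L)
    val (trainB, _) = split(labels, 0.75, seed = 42L)
    println(s"same split on repeat: ${trainA == trainB}")
    println(s"training set has both classes: ${hasBothClasses(trainA)}")

    // Count how many of 100 seeds would yield a single-class training set.
    val badSeeds = (0L until 100L).filter(s => !hasBothClasses(split(labels, 0.75, s)._1))
    println(s"seeds yielding a single-class training set: ${badSeeds.size} of 100")
  }
}
```

Note that a fixed seed only makes the outcome repeatable. Whether the chosen seed yields a training set with both classes still has to be checked once, for example with a guard like `hasBothClasses`.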

codecov · 2020-07-20T16:33:07Z

Codecov Report

Merging #494 into master will decrease coverage by 3.80%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #494      +/-   ##
==========================================
- Coverage   82.63%   78.83%   -3.81%     
==========================================
  Files         345      345              
  Lines       11702    11702              
  Branches      388      388              
==========================================
- Hits         9670     9225     -445     
- Misses       2032     2477     +445

Impacted Files	Coverage Δ
...scala/com/salesforce/op/utils/text/TextUtils.scala	`0.00% <0.00%> (-100.00%)`	⬇️
.../scala/com/salesforce/op/test/FeatureAsserts.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...ala/com/salesforce/op/readers/CSVAutoReaders.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...la/com/salesforce/op/test/TestFeatureBuilder.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...om/salesforce/op/stages/impl/feature/OpNGram.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...alesforce/op/stages/impl/feature/OpHashingTF.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...lesforce/op/stages/impl/feature/LangDetector.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...sforce/op/aggregators/CustomMonoidAggregator.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...sforce/op/stages/base/binary/BinaryEstimator.scala	`0.00% <0.00%> (-100.00%)`	⬇️
...e/op/stages/impl/feature/TextMapLenEstimator.scala	`0.00% <0.00%> (-100.00%)`	⬇️
... and 111 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f764842...2313d09.

nicodv

LGTM

gerashegalov · 2020-07-20T21:42:55Z

Add a description of flakiness, and how your fix addresses it

Jauntbox

LGTM

Update OpWorkflowModelLocalTest.scala

6a552c8

TuanNguyen27 requested review from gerashegalov, Jauntbox, leahmcguire, tovbinm and wsuchy as code owners Jul 20, 2020

salesforce-cla bot added the cla:signed label Jul 20, 2020

TuanNguyen27 requested a review from nicodv Jul 20, 2020

Merge branch 'master' into fixFlakyXGB

43e1915

nicodv approved these changes Jul 20, 2020

TuanNguyen27 added 2 commits Jul 20, 2020

force xgb to recognize 2 classes

8827ae2

Merge branch 'fixFlakyXGB' of https://github.com/salesforce/TransmogrifAI into fixFlakyXGB

5910192

Update OpWorkflowModelLocalTest.scala

ebb6761

Jauntbox approved these changes Jul 23, 2020

fix seed in dataSplitter to prevent training set from containing all of the same class

2313d09

TuanNguyen27 deleted the fixFlakyXGB branch Jul 23, 2020

TuanNguyen27 mentioned this pull request Jul 23, 2020

Fix flaky XGB take 2 #498

Closed

salesforce / TransmogrifAI
