- Updated Feb 9, 2022 - Python
distributed-training
Here are 70 public repositories matching this topic...
- Updated Feb 2, 2022 - Python
I have the same hardware environment and the same network, but I could not reproduce your results; I get only about half of what you report. Are there any best practices or experience you can share? Thanks very much! For BytePS with 1 instance and 8 GPUs, I get similar test results.
- Updated Feb 9, 2022 - Python
- Updated Nov 13, 2021 - Python
Simple mistakes trigger unclear error messages in the ALBERT example, namely:
- Absence of the unpacked data for the trainer (currently triggers requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/data/tokenizer)
- Running all peers in --client_mode (currently triggers AllReduce failed: could not find a group)
It would be great to
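One way the two mistakes listed above could be surfaced earlier is a small pre-flight check that runs before training starts. The following is only a sketch under stated assumptions, not code from the ALBERT example: the names ConfigError, check_tokenizer_dir and check_peer_modes are hypothetical, and it assumes a launcher script that knows the --client_mode setting of every peer it starts.

```python
from pathlib import Path
from typing import Sequence


class ConfigError(ValueError):
    """Hypothetical error type for setup mistakes caught before training."""


def check_tokenizer_dir(path: str) -> Path:
    """Fail early with a clear message if the unpacked data is missing.

    Assumption: the trainer expects a local directory such as data/tokenizer.
    """
    tokenizer_dir = Path(path)
    if not tokenizer_dir.is_dir():
        raise ConfigError(
            f"Tokenizer directory '{tokenizer_dir}' not found. "
            "Unpack the training data before starting the trainer."
        )
    return tokenizer_dir


def check_peer_modes(client_mode_flags: Sequence[bool]) -> None:
    """Fail early if every peer in the list runs with --client_mode.

    Assumption: the caller knows the flag of each peer it launches.
    """
    if client_mode_flags and all(client_mode_flags):
        raise ConfigError(
            "All peers run in --client_mode. "
            "Start at least one peer without this flag."
        )
```

A launcher could call check_tokenizer_dir("data/tokenizer") and check_peer_modes([...]) first, so the user sees a message naming the actual mistake instead of the 404 or AllReduce errors quoted above.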
- Updated Feb 9, 2022 - Python
torchtext (as of 0.4.0) adopts torch.utils.data.DataLoader, and the older iterator interface is deprecated. Ensure AdaptDL's AdaptiveDataLoader supports this new torchtext interface for data loading, and port the example transformer code to the new interface. Then, adaptdl.data.iterator can be deprecated/removed.
- Updated Mar 12, 2020 - Python
- Updated Jan 31, 2022 - Go
- Updated May 13, 2019
- Updated Feb 9, 2022 - Python
- Updated Aug 7, 2020 - Python
- Updated Nov 19, 2018 - Python
Does HyperGBM's make_experiment return the best model?
How does it handle parameter tuning? That is, what is its search space (e.g. for XGBoost)?
- Updated Aug 10, 2021 - Python
- Updated Feb 9, 2022 - C++
- Updated Feb 9, 2022 - Python
- Updated May 8, 2021 - C++
- Updated Jan 27, 2022 - Python
- Updated Sep 7, 2020 - Python
- Updated Feb 9, 2022 - Python
- Updated Jun 11, 2020 - Python
- Updated Dec 15, 2021 - C++
- Updated Jun 13, 2020 - Jupyter Notebook
- Updated Apr 28, 2021 - Python
- Updated Feb 7, 2022 - JavaScript
- Updated Sep 23, 2020 - Jupyter Notebook
We would like to forward a particular 'key' column, which is part of the features, so that it appears alongside the predictions. This lets us identify which set of features a particular prediction belongs to. Here is an example of predictions output using the tensorflow.contrib.estimator.multi_class_head: