kiwi.lib.predict
Module Contents
Classes
RunConfig | Base class for all pydantic configs. Used to configure base behaviour of configs. |
Configuration | Base class for all pydantic configs. Used to configure base behaviour of configs. |
Functions
load_system | Load a pretrained system (model) into a Runner object. |
predict_from_configuration | Run the entire prediction pipeline using the configuration options received. |
run | Run the prediction pipeline. |
make_predictions | Make predictions over the validation set using the best model created during training. |
setup_run | Prepare for running the prediction pipeline. |
kiwi.lib.predict.logger
class kiwi.lib.predict.RunConfig
Bases: kiwi.utils.io.BaseConfig
Base class for all pydantic configs. Used to configure base behaviour of configs.
seed: int = 42
Random seed.
run_id: str
If specified, MLflow/Default Logger will log metrics and params under this ID. If it exists, the run status will change to running. This ID is also used for creating this run’s output directory. (Run ID must be a 32-character hex string.)
output_dir: Path
Output several files for this run under this directory. If not specified, a directory under “runs” is created or reused based on the Run UUID.
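The constraints on run_id and the output_dir fallback can be illustrated with a minimal sketch. The helper names below are hypothetical and not part of kiwi, and the exact directory layout under “runs” is an assumption made only for illustration.

```python
import re
import uuid
from pathlib import Path
from typing import Optional


def is_valid_run_id(run_id: str) -> bool:
    """Return True if run_id is exactly 32 hexadecimal characters."""
    return re.fullmatch(r"[0-9a-fA-F]{32}", run_id) is not None


def new_run_id() -> str:
    """Generate a fresh ID; uuid4().hex is always 32 lowercase hex characters."""
    return uuid.uuid4().hex


def default_output_dir(run_id: str, output_dir: Optional[Path] = None) -> Path:
    """Hypothetical helper: use output_dir if given, else runs/<run_id>.

    The runs/<run_id> layout is an assumption, not taken from the library.
    """
    if output_dir is not None:
        return Path(output_dir)
    return Path("runs") / run_id
```

A run ID produced by new_run_id() passes is_valid_run_id(), so it could plausibly be supplied as run_id when a reproducible output location is wanted.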
predict_on_data_partition: Literal['train', 'valid', 'test'] = 'test'
Name of the data partition to predict upon. File names are read from the corresponding data configuration field.
check_consistency(cls, v, values)
class kiwi.lib.predict.Configuration
Bases: kiwi.utils.io.BaseConfig
Base class for all pydantic configs. Used to configure base behaviour of configs.
run: RunConfig
data: WMTQEDataset.Config
system: QESystem.Config
use_gpu: bool = False
If true and only if available, use the CUDA device specified in gpu_id or the first CUDA device. Otherwise, use the CPU.
gpu_id: Optional[int]
Use CUDA on the listed device, only if use_gpu is true.
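The interaction between use_gpu and gpu_id described above can be sketched as a small pure function. This is an illustrative reading of the two field descriptions, not the library's validator code; resolve_gpu_id is a hypothetical name, and treating “the first CUDA device” as index 0 is an assumption.

```python
from typing import Optional


def resolve_gpu_id(
    use_gpu: bool, gpu_id: Optional[int], cuda_available: bool
) -> Optional[int]:
    """Return the CUDA device index to use, or None to run on the CPU.

    cuda_available would typically come from torch.cuda.is_available();
    it is passed in here so the sketch has no third-party dependency.
    """
    if not use_gpu or not cuda_available:
        # gpu_id is ignored unless use_gpu is true and CUDA is available.
        return None
    # Assumption: "the first CUDA device" means device index 0.
    return 0 if gpu_id is None else gpu_id
```

Passing the availability flag as an argument keeps the decision logic testable on machines without a GPU.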
verbose: bool = False
quiet: bool = False
enforce_loading(cls, v)
setup_gpu(cls, v)
setup_gpu_id(cls, v, values)
kiwi.lib.predict.load_system(system_path: Union[str, Path], gpu_id: Optional[int] = None)
Load a pretrained system (model) into a Runner object.
- Parameters
system_path – A path to the saved checkpoint file produced by a training run.
gpu_id – ID of the GPU to load the model into (-1 or None to use the CPU).
- Throws:
Exception: If the path does not exist, or is not a valid system file.
kiwi.lib.predict.predict_from_configuration(configuration_dict: Dict[str, Any])
Run the entire prediction pipeline using the configuration options received.
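As a rough sketch, the dictionary passed to predict_from_configuration would mirror the fields of Configuration and RunConfig listed on this page. The builder below is hypothetical and only assembles a plain dict; the contents of data and system depend on WMTQEDataset.Config and QESystem.Config, which are not documented in this section, so they are taken as opaque arguments.

```python
from typing import Any, Dict


def build_predict_configuration(
    data: Dict[str, Any], system: Dict[str, Any], use_gpu: bool = False
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a dict shaped like Configuration.

    Top-level keys follow the Configuration fields documented here; the
    nested "run" keys follow RunConfig. The values are illustrative.
    """
    return {
        "run": {"seed": 42, "predict_on_data_partition": "test"},
        "data": data,      # options for WMTQEDataset.Config (not shown here)
        "system": system,  # options for QESystem.Config (not shown here)
        "use_gpu": use_gpu,
        "verbose": False,
        "quiet": False,
    }


# With kiwi installed, the dict would then be handed to the pipeline:
#   from kiwi.lib.predict import predict_from_configuration
#   predict_from_configuration(build_predict_configuration(data, system))
```

The call itself is left commented out because valid data and system options are needed before the configuration will pass validation.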
kiwi.lib.predict.run(config: Configuration, output_dir: Path) → Tuple[Dict[str, List], Optional[MetricsReport]]
Run the prediction pipeline.
Load the model and necessary files and create the model’s predictions for the configured data partition.
- Parameters
config – validated configuration values for the (predict) pipeline.
output_dir – directory where to save predictions.
- Returns
A tuple whose first element is a dictionary with format {'target': predictions} and whose second element is an optional MetricsReport.
- Return type
Tuple[Dict[str, List], Optional[MetricsReport]]
kiwi.lib.predict.make_predictions(output_dir: Path, best_model_path: Path, data_partition: Literal['train', 'valid', 'test'], data_config: WMTQEDataset.Config, outputs_config: QEOutputs.Config = None, batch_size: Union[int, BatchSizeConfig] = None, num_workers: int = 0, gpu_id: int = None)
Make predictions over the validation set using the best model created during training.
- Parameters
output_dir – output directory where predictions should be saved.
best_model_path – path pointing to the checkpoint with best performance.
data_partition – on which dataset to predict (one of 'train', 'valid', 'test').
data_config – configuration containing options for the data_partition set.
outputs_config – configuration specifying which outputs to activate.
batch_size – batch size to use for predicting.
num_workers – number of parallel data loaders.
gpu_id – GPU to use for predicting; 0 for CPU.
- Returns
dictionary with predictions in the format {'target': predictions}.
kiwi.lib.predict.setup_run(config: RunConfig, quiet=False, debug=False, anchor_dir: Path = None) → Path
Prepare for running the prediction pipeline.
This includes setting up the output directory, random seeds, and loggers.
- Parameters
config – configuration options.
quiet – whether to suppress info log messages.
debug – whether to additionally log debug messages (quiet has precedence).
anchor_dir – directory to use as root for paths.
- Returns
the resolved path to the output directory.
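The precedence between quiet and debug can be made concrete with a short sketch using the standard logging module. The function name is hypothetical, and mapping quiet to WARNING is an assumption based on “suppress info log messages”; the library may configure its loggers differently.

```python
import logging


def resolve_log_level(quiet: bool = False, debug: bool = False) -> int:
    """Pick a logging level from the quiet and debug flags.

    quiet wins over debug, matching the parameter description above.
    """
    if quiet:
        # Assumption: suppressing info messages means logging at WARNING.
        return logging.WARNING
    if debug:
        return logging.DEBUG
    return logging.INFO
```

For example, logging.basicConfig(level=resolve_log_level(quiet=True, debug=True)) would configure logging at WARNING, since quiet takes precedence.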