torch_geometric.graphgym

Workflow and Register Modules 

`load_ckpt`	Loads the model checkpoint at a given epoch.
`save_ckpt`	Saves the model checkpoint at a given epoch.
`remove_ckpt`	Removes the model checkpoint at a given epoch.
`clean_ckpt`	Removes all but the last model checkpoint.
`parse_args`	Parses the command line arguments.
`cfg`
`set_cfg`	This function sets the default config value.
`load_cfg`	Load configurations from file system and command line.
`dump_cfg`	Dumps the config to the output directory specified in `cfg.out_dir`.
`set_run_dir`	Create the directory for each random seed experiment run.
`set_out_dir`	Create the directory for full experiment run.
`get_fname`	Extract filename from file name path.
`init_weights`	Performs weight initialization.
`create_loader`	Create data loader object.
`set_printing`	Set up printing options.
`create_logger`	Create logger for the experiment.
`compute_loss`	Compute loss and prediction score.
`create_model`	Create model for graph machine learning.
`create_optimizer`	Creates a config-driven optimizer.
`create_scheduler`	Creates a config-driven learning rate scheduler.
`train`	Trains a GraphGym model using PyTorch Lightning.
`register_base`	Base function for registering a module in GraphGym.
`register_act`	Registers an activation function in GraphGym.
`register_node_encoder`	Registers a node feature encoder in GraphGym.
`register_edge_encoder`	Registers an edge feature encoder in GraphGym.
`register_stage`	Registers a customized GNN stage in GraphGym.
`register_head`	Registers a GNN prediction head in GraphGym.
`register_layer`	Registers a GNN layer in GraphGym.
`register_pooling`	Registers a GNN global pooling/readout layer in GraphGym.
`register_network`	Registers a GNN model in GraphGym.
`register_config`	Registers a configuration group in GraphGym.
`register_dataset`	Registers a dataset in GraphGym.
`register_loader`	Registers a data loader in GraphGym.
`register_optimizer`	Registers an optimizer in GraphGym.
`register_scheduler`	Registers a learning rate scheduler in GraphGym.
`register_loss`	Registers a loss function in GraphGym.
`register_train`	Registers a training function in GraphGym.
`register_metric`	Registers a metric function in GraphGym.

load_ckpt(model: Module, optimizer: Optional[Optimizer] = None, scheduler: Optional[Any] = None, epoch: int = -1) → int[source]

Loads the model checkpoint at a given epoch.

Return type: int

save_ckpt(model: Module, optimizer: Optional[Optimizer] = None, scheduler: Optional[Any] = None, epoch: int = 0)[source]: Saves the model checkpoint at a given epoch.

remove_ckpt(epoch: int = -1)[source]: Removes the model checkpoint at a given epoch.

clean_ckpt()[source]: Removes all but the last model checkpoint.
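The effect of clean_ckpt can be pictured with a small standalone sketch. It assumes checkpoints are files named `<epoch>.ckpt` in one directory; that naming scheme and the helper names `list_epochs` and `clean_ckpt_sketch` are assumptions made for illustration, not the library's implementation.

```python
import os
import tempfile


def list_epochs(ckpt_dir):
    """Return the sorted epochs that have a checkpoint file in ckpt_dir."""
    suffix = ".ckpt"  # assumed file extension
    names = [n for n in os.listdir(ckpt_dir) if n.endswith(suffix)]
    return sorted(int(n[:-len(suffix)]) for n in names)


def clean_ckpt_sketch(ckpt_dir):
    """Remove all but the last (highest-epoch) checkpoint."""
    for epoch in list_epochs(ckpt_dir)[:-1]:
        os.remove(os.path.join(ckpt_dir, f"{epoch}.ckpt"))


# Create three empty stand-in checkpoint files, then keep only the last one.
ckpt_dir = tempfile.mkdtemp()
for epoch in (0, 10, 20):
    open(os.path.join(ckpt_dir, f"{epoch}.ckpt"), "w").close()
clean_ckpt_sketch(ckpt_dir)
remaining = list_epochs(ckpt_dir)
```

After the call, only the checkpoint for epoch 20 is left in the temporary directory.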

parse_args() → Namespace[source]

Parses the command line arguments.

Return type: Namespace

set_cfg(cfg)[source]

This function sets the default config value.

Note that for an experiment, only part of the arguments will be used. The remaining unused arguments won’t affect anything, so feel free to register any argument in graphgym.contrib.config.
We support at most two levels of configs, e.g., cfg.dataset.name.

Returns: Configuration used by the experiment.
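The two-level limit means every option is addressed either as a top-level entry or as an option inside one group. A minimal sketch of that layout, using `types.SimpleNamespace` as a stand-in for the real configuration node; apart from `cfg.dataset.name` and `cfg.out_dir`, the option names and all values are hypothetical.

```python
from types import SimpleNamespace

# Stand-in for the config object: top-level options plus one level of groups.
cfg = SimpleNamespace(
    out_dir="results",                       # top-level option
    dataset=SimpleNamespace(name="Cora"),    # group "dataset", option "name"
    my_group=SimpleNamespace(my_flag=True),  # hypothetical custom group
)

# Options an experiment never reads (here: my_group.my_flag) stay untouched.
dataset_name = cfg.dataset.name
```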

load_cfg(cfg, args)[source]

Load configurations from file system and command line.

Parameters:

cfg (CfgNode) – Configuration node
args (ArgumentParser) – Command argument parser

dump_cfg(cfg)[source]

Dumps the config to the output directory specified in cfg.out_dir.

Parameters: cfg (CfgNode) – Configuration node

set_run_dir(out_dir)[source]

Create the directory for each random seed experiment run.

Parameters: out_dir (str) – Directory for output, specified in cfg.out_dir

set_out_dir(out_dir, fname)[source]

Create the directory for full experiment run.

Parameters:

out_dir (str) – Directory for output, specified in cfg.out_dir
fname (str) – Filename for the yaml format configuration file

get_fname(fname)[source]

Extract filename from file name path.

Parameters: fname (str) – Filename for the yaml format configuration file
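As an illustration, the extraction can be sketched with `os.path`. This is a hedged approximation that assumes both the directory part and a trailing `.yaml` or `.yml` extension are dropped; `get_fname_sketch` is a hypothetical name, not the library function.

```python
import os.path as osp


def get_fname_sketch(fname):
    """Strip the directory and a YAML extension from a config file path."""
    base = osp.basename(fname)
    stem, ext = osp.splitext(base)
    # Only strip the extension when it is a YAML one (an assumption).
    return stem if ext in (".yaml", ".yml") else base


name = get_fname_sketch("configs/example_node.yaml")
```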

init_weights(m)[source]

Performs weight initialization.

Parameters: m (nn.Module) – PyTorch module

create_loader()[source]

Create data loader object.

Returns: List of PyTorch data loaders

set_printing()[source]: Set up printing options.

create_logger()[source]: Create logger for the experiment.

compute_loss(pred, true)[source]

Compute loss and prediction score.

Parameters:

pred (torch.tensor) – Unnormalized prediction
true (torch.tensor) – Ground truth labels

Returns: Loss, normalized prediction score
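The difference between the unnormalized prediction passed in and the normalized score returned can be illustrated for binary classification. The sketch below is plain Python and assumes a sigmoid normalization with a binary cross-entropy loss; which loss is actually applied depends on the configuration, and `binary_loss_sketch` is a hypothetical helper.

```python
import math


def binary_loss_sketch(pred, true):
    """Return (mean BCE loss, sigmoid scores) for raw logits and 0/1 labels."""
    scores = [1.0 / (1.0 + math.exp(-z)) for z in pred]
    losses = [
        -(t * math.log(s) + (1 - t) * math.log(1 - s))
        for s, t in zip(scores, true)
    ]
    return sum(losses) / len(losses), scores


# Two raw logits, both with positive ground-truth labels.
loss, scores = binary_loss_sketch(pred=[0.0, 2.0], true=[1, 1])
```

A logit of 0 maps to a score of 0.5, and larger logits move the score toward 1 while lowering the loss for a positive label.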

create_model(to_device=True, dim_in=None, dim_out=None) → GraphGymModule[source]

Create model for graph machine learning.

Parameters:

to_device (bool, optional) – Whether to transfer the model to the specified device. (default: True)
dim_in (int, optional) – Input dimension to the model
dim_out (int, optional) – Output dimension to the model

Return type:

GraphGymModule

create_optimizer(params: Iterator[Parameter], cfg: Any) → Any[source]

Creates a config-driven optimizer.

Return type: Any

create_scheduler(optimizer: Optimizer, cfg: Any) → Any[source]

Creates a config-driven learning rate scheduler.

Return type: Any

train(model: GraphGymModule, datamodule: GraphGymDataModule, logger: bool = True, trainer_config: Optional[Dict[str, Any]] = None)[source]

Trains a GraphGym model using PyTorch Lightning.

Parameters:

model (GraphGymModule) – The GraphGym model.
datamodule (GraphGymDataModule) – The GraphGym data module.
logger (bool, optional) – Whether to enable logging during training. (default: True)
trainer_config (dict, optional) – Additional trainer configuration.

register_base(mapping: Dict[str, Any], key: str, module: Optional[Any] = None) → Union[None, Callable][source]

Base function for registering a module in GraphGym.

Parameters:

mapping (dict) – Python dictionary hosting all the registered modules, in which the module is registered.
key (str) – The name of the module.
module (any, optional) – The module. If set to None, will return a decorator to register a module.

Return type:

Optional[Callable]
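The two call styles (direct registration versus decorator) can be sketched in a few lines. This is a standalone approximation of the pattern described above, not the library source; raising `KeyError` on a duplicate key is an assumption, and `register_base_sketch`, `act_dict` and the registered functions are hypothetical.

```python
from typing import Any, Callable, Dict, Optional


def register_base_sketch(mapping: Dict[str, Any], key: str,
                         module: Optional[Any] = None) -> Optional[Callable]:
    """Register `module` under `key`, or return a decorator if it is None."""
    if module is not None:
        if key in mapping:  # assumed behavior: refuse to overwrite
            raise KeyError(f"Module with '{key}' already defined")
        mapping[key] = module
        return None

    def decorator(module: Any) -> Any:
        register_base_sketch(mapping, key, module)
        return module

    return decorator


act_dict: Dict[str, Any] = {}

# Style 1: pass the module directly.
register_base_sketch(act_dict, "identity", lambda x: x)


# Style 2: omit the module and use the returned decorator.
@register_base_sketch(act_dict, "double")
def double(x):
    return 2 * x
```

The specialized helpers such as register_act follow the same two styles, each with its own mapping.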

register_act(key: str, module: Optional[Any] = None)[source]: Registers an activation function in GraphGym.

register_node_encoder(key: str, module: Optional[Any] = None)[source]: Registers a node feature encoder in GraphGym.

register_edge_encoder(key: str, module: Optional[Any] = None)[source]: Registers an edge feature encoder in GraphGym.

register_stage(key: str, module: Optional[Any] = None)[source]: Registers a customized GNN stage in GraphGym.

register_head(key: str, module: Optional[Any] = None)[source]: Registers a GNN prediction head in GraphGym.

register_layer(key: str, module: Optional[Any] = None)[source]: Registers a GNN layer in GraphGym.

register_pooling(key: str, module: Optional[Any] = None)[source]: Registers a GNN global pooling/readout layer in GraphGym.

register_network(key: str, module: Optional[Any] = None)[source]: Registers a GNN model in GraphGym.

register_config(key: str, module: Optional[Any] = None)[source]: Registers a configuration group in GraphGym.

register_dataset(key: str, module: Optional[Any] = None)[source]: Registers a dataset in GraphGym.

register_loader(key: str, module: Optional[Any] = None)[source]: Registers a data loader in GraphGym.

register_optimizer(key: str, module: Optional[Any] = None)[source]: Registers an optimizer in GraphGym.

register_scheduler(key: str, module: Optional[Any] = None)[source]: Registers a learning rate scheduler in GraphGym.

register_loss(key: str, module: Optional[Any] = None)[source]: Registers a loss function in GraphGym.

register_train(key: str, module: Optional[Any] = None)[source]: Registers a training function in GraphGym.

register_metric(key: str, module: Optional[Any] = None)[source]: Registers a metric function in GraphGym.

Model Modules 

`IntegerFeatureEncoder`	Provides an encoder for integer node features.
`AtomEncoder`	The atom encoder used in OGB molecule dataset.
`BondEncoder`	The bond encoder used in OGB molecule dataset.
`GNNLayer`	Creates a GNN layer, given the specified input and output dimensions and the underlying configuration in `cfg`.
`GNNPreMP`	Creates a NN layer used before message passing, given the specified input and output dimensions and the underlying configuration in `cfg`.
`GNNStackStage`	Stacks a number of GNN layers.
`FeatureEncoder`	Encodes node and edge features, given the specified input dimension and the underlying configuration in `cfg`.
`GNN`	A general Graph Neural Network (GNN) model.
`GNNNodeHead`	A GNN prediction head for node-level prediction tasks.
`GNNEdgeHead`	A GNN prediction head for edge-level/link-level prediction tasks.
`GNNGraphHead`	A GNN prediction head for graph-level prediction tasks.
`GeneralLayer`	A general wrapper for layers.
`GeneralMultiLayer`	A general wrapper class for stacking multiple NN layers.
`Linear`	A basic Linear layer.
`BatchNorm1dNode`	A batch normalization layer for node-level features.
`BatchNorm1dEdge`	A batch normalization layer for edge-level features.
`MLP`	A basic MLP model.
`GCNConv`	A Graph Convolutional Network (GCN) layer.
`SAGEConv`	A GraphSAGE layer.
`GATConv`	A Graph Attention Network (GAT) layer.
`GINConv`	A Graph Isomorphism Network (GIN) layer.
`SplineConv`	A SplineCNN layer.
`GeneralConv`	A general GNN layer.
`GeneralEdgeConv`	A general GNN layer with edge feature support.
`GeneralSampleEdgeConv`	A general GNN layer that supports edge features and edge sampling.
`global_add_pool`	Returns batch-wise graph-level-outputs by adding node features across the node dimension.
`global_mean_pool`	Returns batch-wise graph-level-outputs by averaging node features across the node dimension.
`global_max_pool`	Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension.

class IntegerFeatureEncoder(emb_dim: int, num_classes: int)[source]

Provides an encoder for integer node features.

Parameters:

emb_dim (int) – The output embedding dimension.
num_classes (int) – The number of classes/integers.

Example

>>> encoder = IntegerFeatureEncoder(emb_dim=16, num_classes=10)
>>> batch = torch.randint(0, 10, (10, 2))
>>> encoder(batch).size()
torch.Size([10, 16])

class AtomEncoder(emb_dim, *args, **kwargs)[source]

The atom encoder used in OGB molecule dataset.

Parameters: emb_dim (int) – The output embedding dimension.

Example

>>> encoder = AtomEncoder(emb_dim=16)
>>> batch = torch.randint(0, 10, (10, 3))
>>> encoder(batch).size()
torch.Size([10, 16])

class BondEncoder(emb_dim: int)[source]

The bond encoder used in OGB molecule dataset.

Parameters: emb_dim (int) – The output embedding dimension.

Example

>>> encoder = BondEncoder(emb_dim=16)
>>> batch = torch.randint(0, 10, (10, 3))
>>> encoder(batch).size()
torch.Size([10, 16])

GNNLayer(dim_in: int, dim_out: int, has_act: bool = True) → GeneralLayer[source]

Creates a GNN layer, given the specified input and output dimensions and the underlying configuration in cfg.

Parameters:

dim_in (int) – The input dimension.
dim_out (int) – The output dimension.
has_act (bool, optional) – Whether to apply an activation function after the layer. (default: True)

Return type:

GeneralLayer

GNNPreMP(dim_in: int, dim_out: int, num_layers: int) → GeneralMultiLayer[source]

Creates a NN layer used before message passing, given the specified input and output dimensions and the underlying configuration in cfg.

Parameters:

dim_in (int) – The input dimension.
dim_out (int) – The output dimension.
num_layers (int) – The number of layers.

Return type:

GeneralMultiLayer

class GNNStackStage(dim_in, dim_out, num_layers)[source]

Stacks a number of GNN layers.

Parameters:

dim_in (int) – The input dimension.
dim_out (int) – The output dimension.
num_layers (int) – The number of layers.

class FeatureEncoder(dim_in: int)[source]

Encodes node and edge features, given the specified input dimension and the underlying configuration in cfg.

Parameters: dim_in (int) – The input feature dimension.

class GNN(dim_in: int, dim_out: int, **kwargs)[source]

A general Graph Neural Network (GNN) model.

The GNN model consists of three main components:

An encoder to transform input features into a fixed-size embedding space.
A processing or message passing stage for information exchange between nodes.
A head to produce the final output features/predictions.

The configuration of each component is determined by the underlying configuration in cfg.

Parameters:

dim_in (int) – The input feature dimension.
dim_out (int) – The output feature dimension.
**kwargs (optional) – Additional keyword arguments.
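To make the three components concrete, here is a deliberately tiny, dependency-free sketch of the encoder, message passing and head pipeline on a three-node path graph. Every function in it is a hypothetical stand-in (scaling as the encoder, neighbor averaging as message passing, a sum as the head); the real components are built from cfg.

```python
def encoder(x):
    """Map raw scalar node features to an 'embedding' (here: scale by 2)."""
    return [2.0 * v for v in x]


def message_passing(h, neighbors):
    """Replace each node value by the mean over itself and its neighbors."""
    out = []
    for i, nbrs in enumerate(neighbors):
        vals = [h[i]] + [h[j] for j in nbrs]
        out.append(sum(vals) / len(vals))
    return out


def head(h):
    """Produce one graph-level output (here: the sum of node values)."""
    return sum(h)


x = [1.0, 2.0, 3.0]             # one feature per node
neighbors = [[1], [0, 2], [1]]  # path graph 0 - 1 - 2
h = message_passing(encoder(x), neighbors)
y = head(h)
```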

class GNNNodeHead(dim_in: int, dim_out: int)[source]

A GNN prediction head for node-level prediction tasks.

Parameters:

dim_in (int) – The input feature dimension.
dim_out (int) – The output feature dimension.

class GNNEdgeHead(dim_in: int, dim_out: int)[source]

A GNN prediction head for edge-level/link-level prediction tasks.

Parameters:

dim_in (int) – The input feature dimension.
dim_out (int) – The output feature dimension.

class GNNGraphHead(dim_in: int, dim_out: int)[source]

A GNN prediction head for graph-level prediction tasks. A post message passing layer (as specified by cfg.gnn.post_mp) is used to transform the pooled graph-level embeddings using an MLP.

Parameters:

dim_in (int) – The input feature dimension.
dim_out (int) – The output feature dimension.

class GeneralLayer(name, layer_config: LayerConfig, **kwargs)[source]

A general wrapper for layers.

Parameters:

name (str) – The registered name of the layer.
layer_config (LayerConfig) – The configuration of the layer.
**kwargs (optional) – Additional keyword arguments.

class GeneralMultiLayer(name, layer_config: LayerConfig, **kwargs)[source]

A general wrapper class for stacking multiple NN layers.

Parameters:

name (str) – The registered name of the layer.
layer_config (LayerConfig) – The configuration of the layer.
**kwargs (optional) – Additional keyword arguments.

class Linear(layer_config: LayerConfig, **kwargs)[source]

A basic Linear layer.

Parameters:

layer_config (LayerConfig) – The configuration of the layer.
**kwargs (optional) – Additional keyword arguments.

class BatchNorm1dNode(layer_config: LayerConfig)[source]

A batch normalization layer for node-level features.

Parameters: layer_config (LayerConfig) – The configuration of the layer.

class BatchNorm1dEdge(layer_config: LayerConfig)[source]

A batch normalization layer for edge-level features.

Parameters: layer_config (LayerConfig) – The configuration of the layer.

class MLP(layer_config: LayerConfig, **kwargs)[source]

A basic MLP model.

Parameters:

layer_config (LayerConfig) – The configuration of the layer.
**kwargs (optional) – Additional keyword arguments.

class GCNConv(layer_config: LayerConfig, **kwargs)[source]: A Graph Convolutional Network (GCN) layer.

class SAGEConv(layer_config: LayerConfig, **kwargs)[source]: A GraphSAGE layer.

class GATConv(layer_config: LayerConfig, **kwargs)[source]: A Graph Attention Network (GAT) layer.

class GINConv(layer_config: LayerConfig, **kwargs)[source]: A Graph Isomorphism Network (GIN) layer.

class SplineConv(layer_config: LayerConfig, **kwargs)[source]: A SplineCNN layer.

class GeneralConv(layer_config: LayerConfig, **kwargs)[source]: A general GNN layer.

class GeneralEdgeConv(layer_config: LayerConfig, **kwargs)[source]: A general GNN layer with edge feature support.

class GeneralSampleEdgeConv(layer_config: LayerConfig, **kwargs)[source]: A general GNN layer that supports edge features and edge sampling.

global_add_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) → Tensor[source]

Returns batch-wise graph-level-outputs by adding node features across the node dimension.

For a single graph \(\mathcal{G}_i\), its output is computed by

\[\mathbf{r}_i = \sum_{n=1}^{N_i} \mathbf{x}_n.\]

Functional method of the SumAggregation module.

Parameters:

x (torch.Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).
batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.
size (int, optional) – The number of examples \(B\). Automatically calculated if not given. (default: None)

Return type:

Tensor

global_mean_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) → Tensor[source]

Returns batch-wise graph-level-outputs by averaging node features across the node dimension.

For a single graph \(\mathcal{G}_i\), its output is computed by

\[\mathbf{r}_i = \frac{1}{N_i} \sum_{n=1}^{N_i} \mathbf{x}_n.\]

Functional method of the MeanAggregation module.

Parameters:

x (torch.Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).
batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.
size (int, optional) – The number of examples \(B\). Automatically calculated if not given. (default: None)

Return type:

Tensor

global_max_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) → Tensor[source]

Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension.

For a single graph \(\mathcal{G}_i\), its output is computed by

\[\mathbf{r}_i = \mathrm{max}_{n=1}^{N_i} \, \mathbf{x}_n.\]

Functional method of the MaxAggregation module.

Parameters:

x (torch.Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).
batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each element to a specific example.
size (int, optional) – The number of examples \(B\). Automatically calculated if not given. (default: None)

Return type:

Tensor
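The three readout formulas can be checked by hand on a small batch. The following pure-Python sketch mirrors the definitions for scalar node features (F = 1); it is not the library implementation, and `pool_sketch` is a hypothetical helper.

```python
def pool_sketch(x, batch, reduce):
    """Group node values by graph index and reduce each group."""
    size = max(batch) + 1  # number of graphs B, inferred from the batch vector
    groups = [[] for _ in range(size)]
    for value, graph_idx in zip(x, batch):
        groups[graph_idx].append(value)
    return [reduce(g) for g in groups]


x = [1.0, 2.0, 3.0, 10.0, 20.0]  # five nodes in total
batch = [0, 0, 0, 1, 1]          # first three nodes belong to graph 0

added = pool_sketch(x, batch, sum)
averaged = pool_sketch(x, batch, lambda g: sum(g) / len(g))
maxed = pool_sketch(x, batch, max)
```

Each result has one entry per graph, which corresponds to the leading dimension B of the pooled output.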

Utility Modules 

`agg_runs`	Aggregate over different random seeds of a single experiment.
`agg_batch`	Aggregate across results from multiple experiments via grid search.
`params_count`	Computes the number of parameters.
`match_baseline_cfg`	Match the computational budget of a given baseline model.
`get_current_gpu_usage`	Get the current GPU memory usage.
`auto_select_device`	Auto select device for the current experiment.
`is_eval_epoch`	Determines if the model should be evaluated at the current epoch.
`is_ckpt_epoch`	Determines if the model should be checkpointed at the current epoch.
`dict_to_json`	Dump a Python dictionary to a JSON file.
`dict_list_to_json`	Dump a list of Python dictionaries to a JSON file.
`dict_to_tb`	Add a dictionary of statistics to a Tensorboard writer.
`makedirs_rm_exist`	Make a directory, remove any existing data.
`dummy_context`	Default context manager that does nothing.

agg_runs(dir, metric_best='auto')[source]

Aggregate over different random seeds of a single experiment.

Parameters:

dir (str) – Directory of the results, containing 1 experiment
metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.
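Conceptually, aggregating over seeds reduces per-run metrics to summary statistics. A minimal sketch, assuming the aggregate is a mean and standard deviation per metric; the input layout, the numbers and `agg_runs_sketch` are hypothetical and unrelated to the files the real function reads.

```python
import statistics


def agg_runs_sketch(runs):
    """Return {metric: (mean, population std)} across per-seed result dicts."""
    summary = {}
    for key in runs[0]:
        values = [r[key] for r in runs]
        summary[key] = (statistics.mean(values), statistics.pstdev(values))
    return summary


# Made-up results for three random seeds of the same experiment.
runs = [{"accuracy": 0.80}, {"accuracy": 0.82}, {"accuracy": 0.84}]
summary = agg_runs_sketch(runs)
```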

agg_batch(dir, metric_best='auto')[source]

Aggregate across results from multiple experiments via grid search.

Parameters:

dir (str) – Directory of the results, containing multiple experiments
metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

params_count(model)[source]

Computes the number of parameters.

Parameters: model (nn.Module) – PyTorch model

match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]

Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.

Parameters:

cfg_dict (dict) – Current experiment’s configuration
cfg_dict_baseline (dict) – Baseline configuration
verbose (bool, optional) – Whether to print the matched parameter counts. (default: True)

get_current_gpu_usage()[source]: Get the current GPU memory usage.

auto_select_device()[source]: Auto select device for the current experiment.

is_eval_epoch(cur_epoch)[source]: Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch(cur_epoch)[source]: Determines if the model should be checkpointed at the current epoch.
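A periodic epoch check of this kind can be sketched as follows. The sketch assumes an epoch qualifies when it completes a period or is the final epoch; `period` and `max_epoch` are hypothetical arguments standing in for values that would come from cfg.

```python
def is_period_epoch_sketch(cur_epoch, period, max_epoch):
    """True if the (0-indexed) epoch ends a period or is the last epoch."""
    return (cur_epoch + 1) % period == 0 or (cur_epoch + 1) == max_epoch


# With a period of 5 over 12 epochs, epochs 4, 9 and 11 qualify.
selected = [e for e in range(12) if is_period_epoch_sketch(e, 5, 12)]
```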

dict_to_json(dict, fname)[source]

Dump a Python dictionary to a JSON file.

Parameters:

dict (dict) – The Python dictionary.
fname (str) – The output file name.

dict_list_to_json(dict_list, fname)[source]

Dump a list of Python dictionaries to a JSON file.

Parameters:

dict_list (list of dict) – List of Python dictionaries.
fname (str) – The output file name.
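A minimal sketch of both JSON helpers using the standard `json` module. It assumes one JSON object is appended per line; the exact file layout used by the library is not stated here, and the `_sketch` names are hypothetical.

```python
import json
import os
import tempfile


def dict_to_json_sketch(d, fname):
    """Append one dictionary to fname as a single JSON line."""
    with open(fname, "a") as f:
        json.dump(d, f)
        f.write("\n")


def dict_list_to_json_sketch(dict_list, fname):
    """Append each dictionary in dict_list to fname, one JSON line each."""
    for d in dict_list:
        dict_to_json_sketch(d, fname)


fname = os.path.join(tempfile.mkdtemp(), "stats.json")
stats = [{"epoch": 0, "loss": 1.5}, {"epoch": 1, "loss": 1.2}]
dict_list_to_json_sketch(stats, fname)

# Read the file back line by line to recover the original dictionaries.
with open(fname) as f:
    loaded = [json.loads(line) for line in f]
```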

dict_to_tb(dict, writer, epoch)[source]

Add a dictionary of statistics to a Tensorboard writer.

Parameters:

dict (dict) – Statistics of experiments; the keys are attribute names and the values are the attribute values.
writer – Tensorboard writer object
epoch (int) – The current epoch

makedirs_rm_exist(dir)[source]

Make a directory, remove any existing data.

Parameters: dir (str) – The directory to be created.
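The "remove any existing data" behavior can be sketched with `shutil` and `os`. This is an illustrative approximation under the assumption that an existing directory is deleted recursively before being recreated; `makedirs_rm_exist_sketch` is a hypothetical name.

```python
import os
import shutil
import tempfile


def makedirs_rm_exist_sketch(path):
    """Create path as an empty directory, deleting any previous contents."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


# Populate a directory, then recreate it and confirm it is empty again.
path = os.path.join(tempfile.mkdtemp(), "run")
os.makedirs(path)
open(os.path.join(path, "stale.txt"), "w").close()
makedirs_rm_exist_sketch(path)
contents = os.listdir(path)
```

Because the old contents are deleted, a helper like this should only be pointed at directories that hold disposable experiment output.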

class dummy_context[source]: Default context manager that does nothing.
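A do-nothing context manager is handy when a code path only sometimes needs a real one. The sketch below shows the idea with a hypothetical `DummyContextSketch` class and a hypothetical `run` helper; it is not the library class.

```python
class DummyContextSketch:
    """Context manager whose enter and exit steps do nothing."""

    def __enter__(self):
        return None

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # never suppress exceptions


def run(step, context=None):
    """Run step() inside `context`, falling back to the no-op manager."""
    with (context or DummyContextSketch()):
        return step()


result = run(lambda: 1 + 1)
```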
