pyagc.encoders.TunedGNN

class TunedGNN(in_channels: int, hidden_channels: int, num_layers: int, out_channels: Optional[int] = None, dropout: float = 0.0, act: Optional[Union[str, Callable]] = 'relu', act_first: bool = False, act_last: bool = False, act_kwargs: Optional[Dict[str, Any]] = None, norm: Optional[Union[str, Callable]] = None, norm_kwargs: Optional[Dict[str, Any]] = None, residual: bool = False, pre_linear: bool = False, jk: Optional[str] = None, **kwargs)

Bases: Module

An enhanced GNN model with tuned hyperparameters, based on the paper “Classic GNNs are Strong Baselines: Reassessing GNNs for Node Classification” (Luo et al., NeurIPS 2024).

This implementation incorporates critical improvements identified in the paper:

  • Residual connections for deeper networks and heterophilous graphs

  • Pre-linear transformation option

  • Flexible normalization (LayerNorm/BatchNorm)

  • Optimized dropout strategies

  • Support for deeper architectures (up to 10-15 layers)
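The interplay of residual connections, normalization, and activation order can be illustrated with a toy sketch in plain Python. This is not the TunedGNN implementation: the helper `post_conv`, the placement of the residual sum before normalization, and the hand-written `layer_norm` are assumptions made for illustration, and dropout is omitted (as it would be in evaluation mode).

```python
import math
from typing import List

Vector = List[float]


def relu(h: Vector) -> Vector:
    """Element-wise ReLU on a plain list of floats."""
    return [max(0.0, v) for v in h]


def layer_norm(h: Vector, eps: float = 1e-5) -> Vector:
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(h) / len(h)
    var = sum((v - mean) ** 2 for v in h) / len(h)
    return [(v - mean) / math.sqrt(var + eps) for v in h]


def post_conv(conv_out: Vector, layer_in: Vector, *,
              residual: bool = False, act_first: bool = False) -> Vector:
    """Hypothetical per-layer post-processing applied after a conv step.

    Assumption: the residual is added before normalization; the real
    encoder may order these steps differently.
    """
    h = list(conv_out)
    if residual:
        # Skip connection: add the layer input back onto the conv output.
        h = [a + b for a, b in zip(h, layer_in)]
    if act_first:
        return layer_norm(relu(h))  # activation, then normalization
    return relu(layer_norm(h))      # normalization, then activation


norm_then_act = post_conv([-1.0, 1.0], [0.0, 0.0])
act_then_norm = post_conv([-1.0, 1.0], [0.0, 0.0], act_first=True)
with_residual = post_conv([-1.0, 1.0], [3.0, 0.0], residual=True)
```

With normalization first, the negative component is clipped to zero by the final ReLU; with activation first, the final normalization can produce negative values again. The residual example shows how the layer input can change which component survives.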

Parameters:
  • in_channels (int or tuple) – Size of each input sample, or -1 to derive the size from the first input(s) to the forward method. A tuple corresponds to the sizes of source and target dimensionalities.

  • hidden_channels (int) – Size of each hidden sample.

  • num_layers (int) – Number of message passing layers.

  • out_channels (int, optional) – If not set to None, will apply a final linear transformation to convert hidden node embeddings to output size out_channels. (default: None)

  • dropout (float, optional) – Dropout probability. (default: 0.0)

  • act (str or Callable, optional) – The non-linear activation function to use. (default: "relu")

  • act_first (bool, optional) – If set to True, activation is applied before normalization. (default: False)

  • act_last (bool, optional) – If set to True, applies activation function to the final output. (default: False)

  • act_kwargs (Dict[str, Any], optional) – Arguments passed to the respective activation function defined by act. (default: None)

  • norm (str or Callable, optional) – The normalization function. Recommended: "batch_norm" for large graphs, "layer_norm" for smaller graphs. (default: None)

  • norm_kwargs (Dict[str, Any], optional) – Arguments passed to the respective normalization function defined by norm. (default: None)

  • residual (bool, optional) – If set to True, applies residual connections. Especially beneficial for heterophilous graphs. (default: False)

  • pre_linear (bool, optional) – If set to True, applies a linear transformation before the first GNN layer. (default: False)

  • jk (str, optional) – The Jumping Knowledge mode, one of None, "last", "cat", "max", or "lstm". If specified, the model additionally applies a final linear transformation that maps node embeddings to the expected output feature dimensionality. (default: None)

  • **kwargs (optional) – Additional arguments of the underlying torch_geometric.nn.conv.MessagePassing layers.
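As a rough illustration of what the jk modes combine, the following plain-Python sketch aggregates the per-layer embeddings of a single node. The function `jumping_knowledge` is hypothetical and only mirrors the commonly described semantics of "last", "cat", and "max"; the "lstm" mode relies on learned attention weights and is deliberately left out.

```python
from typing import List, Optional

Vector = List[float]


def jumping_knowledge(layer_outputs: List[Vector], mode: Optional[str]) -> Vector:
    """Combine the per-layer embeddings of one node (toy version)."""
    if mode is None or mode == "last":
        # Use only the final layer's embedding.
        return list(layer_outputs[-1])
    if mode == "cat":
        # Concatenate all layers along the feature dimension.
        return [v for h in layer_outputs for v in h]
    if mode == "max":
        # Element-wise maximum across layers.
        return [max(vals) for vals in zip(*layer_outputs)]
    raise ValueError(f"mode {mode!r} is not covered by this sketch")


per_layer = [[1.0, 5.0], [3.0, 2.0]]  # embeddings after layer 1 and layer 2
last_out = jumping_knowledge(per_layer, "last")
cat_out = jumping_knowledge(per_layer, "cat")
max_out = jumping_knowledge(per_layer, "max")
```

Note that "cat" grows the feature dimension with the number of layers, which is consistent with the final linear transformation mentioned for jk mapping the result back to the expected output size.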

__init__(in_channels: int, hidden_channels: int, num_layers: int, out_channels: Optional[int] = None, dropout: float = 0.0, act: Optional[Union[str, Callable]] = 'relu', act_first: bool = False, act_last: bool = False, act_kwargs: Optional[Dict[str, Any]] = None, norm: Optional[Union[str, Callable]] = None, norm_kwargs: Optional[Dict[str, Any]] = None, residual: bool = False, pre_linear: bool = False, jk: Optional[str] = None, **kwargs)

Initialize internal Module state, shared by both nn.Module and ScriptModule.
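A minimal sketch of assembling the constructor arguments documented above. The helper `tuned_gnn_kwargs` is hypothetical and not part of pyagc; its validation rules (the jk values from the parameter list, a dropout probability within [0, 1], at least one layer) are assumptions for illustration, and the example sizes are arbitrary.

```python
from typing import Any, Dict, Optional

# Allowed Jumping Knowledge modes, as listed for the jk parameter.
JK_MODES = (None, "last", "cat", "max", "lstm")


def tuned_gnn_kwargs(in_channels: int, hidden_channels: int, num_layers: int, *,
                     out_channels: Optional[int] = None, dropout: float = 0.0,
                     norm: Optional[str] = None, residual: bool = False,
                     pre_linear: bool = False,
                     jk: Optional[str] = None) -> Dict[str, Any]:
    """Collect and sanity-check keyword arguments (hypothetical helper)."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if not 0.0 <= dropout <= 1.0:
        raise ValueError("dropout must lie in [0, 1]")
    if jk not in JK_MODES:
        raise ValueError(f"jk must be one of {JK_MODES}, got {jk!r}")
    return {
        "in_channels": in_channels,
        "hidden_channels": hidden_channels,
        "num_layers": num_layers,
        "out_channels": out_channels,
        "dropout": dropout,
        "norm": norm,
        "residual": residual,
        "pre_linear": pre_linear,
        "jk": jk,
    }


# Example: a 4-layer encoder configuration with the paper-inspired options on.
config = tuned_gnn_kwargs(1433, 256, 4, out_channels=7, dropout=0.5,
                          norm="layer_norm", residual=True, pre_linear=True)
```

The resulting dictionary could then be unpacked into the constructor with `**config`. Whether TunedGNN can be instantiated directly or needs a subclass that provides init_conv is not stated on this page, so treat that step as an assumption to verify.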

Methods

__init__(in_channels, hidden_channels, ...)

Initialize internal Module state, shared by both nn.Module and ScriptModule.

add_module(name, module)

Add a child module to the current module.

apply(fn)

Apply fn recursively to every submodule (as returned by .children()) as well as self.

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

buffers([recurse])

Return an iterator over module buffers.

children()

Return an iterator over immediate children modules.

compile(*args, **kwargs)

Compile this Module's forward using torch.compile().

cpu()

Move all model parameters and buffers to the CPU.

cuda([device])

Move all model parameters and buffers to the GPU.

double()

Casts all floating point parameters and buffers to double datatype.

eval()

Set the module in evaluation mode.

extra_repr()

Return the extra representation of the module.

float()

Casts all floating point parameters and buffers to float datatype.

forward(x, edge_index[, edge_weight, ...])

Forward pass.

get_buffer(target)

Return the buffer given by target if it exists, otherwise throw an error.

get_extra_state()

Return any extra state to include in the module's state_dict.

get_parameter(target)

Return the parameter given by target if it exists, otherwise throw an error.

get_submodule(target)

Return the submodule given by target if it exists, otherwise throw an error.

half()

Casts all floating point parameters and buffers to half datatype.

inference(loader[, device, ...])

Performs layer-wise inference on large graphs using a NeighborLoader that samples the full neighborhood for only one layer.

inference_per_layer(layer, x, edge_index, ...)

Inference for a single layer.

init_conv(in_channels, out_channels, **kwargs)

Return type: MessagePassing

ipu([device])

Move all model parameters and buffers to the IPU.

load_state_dict(state_dict[, strict, assign])

Copy parameters and buffers from state_dict into this module and its descendants.

modules()

Return an iterator over all modules in the network.

mtia([device])

Move all model parameters and buffers to the MTIA.

named_buffers([prefix, recurse, ...])

Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

named_children()

Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

named_modules([memo, prefix, remove_duplicate])

Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

named_parameters([prefix, recurse, ...])

Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

parameters([recurse])

Return an iterator over module parameters.

register_backward_hook(hook)

Register a backward hook on the module.

register_buffer(name, tensor[, persistent])

Add a buffer to the module.

register_forward_hook(hook, *[, prepend, ...])

Register a forward hook on the module.

register_forward_pre_hook(hook, *[, ...])

Register a forward pre-hook on the module.

register_full_backward_hook(hook[, prepend])

Register a backward hook on the module.

register_full_backward_pre_hook(hook[, prepend])

Register a backward pre-hook on the module.

register_load_state_dict_post_hook(hook)

Register a post-hook to be run after module's load_state_dict() is called.

register_load_state_dict_pre_hook(hook)

Register a pre-hook to be run before module's load_state_dict() is called.

register_module(name, module)

Alias for add_module().

register_parameter(name, param)

Add a parameter to the module.

register_state_dict_post_hook(hook)

Register a post-hook for the state_dict() method.

register_state_dict_pre_hook(hook)

Register a pre-hook for the state_dict() method.

requires_grad_([requires_grad])

Change if autograd should record operations on parameters in this module.

reset_parameters()

Resets all learnable parameters of the module.

set_extra_state(state)

Set extra state contained in the loaded state_dict.

set_submodule(target, module[, strict])

Set the submodule given by target if it exists, otherwise throw an error.

share_memory()

See torch.Tensor.share_memory_().

state_dict(*args[, destination, prefix, ...])

Return a dictionary containing references to the whole state of the module.

to(*args, **kwargs)

Move and/or cast the parameters and buffers.

to_empty(*, device[, recurse])

Move the parameters and buffers to the specified device without copying storage.

train([mode])

Set the module in training mode.

type(dst_type)

Casts all parameters and buffers to dst_type.

xpu([device])

Move all model parameters and buffers to the XPU.

zero_grad([set_to_none])

Reset gradients of all model parameters.

Attributes

T_destination

alias of TypeVar('T_destination', bound=dict[str, Any])

call_super_init

dump_patches

supports_edge_weight

supports_edge_attr

supports_norm_batch

supports_edge_weight: Final[bool]

supports_edge_attr: Final[bool]

supports_norm_batch: Final[bool]

init_conv(in_channels: Union[int, Tuple[int, int]], out_channels: int, **kwargs) → MessagePassing

Return type:

MessagePassing

reset_parameters()

Resets all learnable parameters of the module.

forward(x: Tensor, edge_index: Union[Tensor, SparseTensor], edge_weight: Optional[Tensor] = None, edge_attr: Optional[Tensor] = None, batch: Optional[Tensor] = None, batch_size: Optional[int] = None, num_sampled_nodes_per_hop: Optional[List[int]] = None, num_sampled_edges_per_hop: Optional[List[int]] = None) → Tensor

Forward pass.

Parameters:
  • x (torch.Tensor) – The input node features.

  • edge_index (torch.Tensor or SparseTensor) – The edge indices.

  • edge_weight (torch.Tensor, optional) – The edge weights (if supported by the underlying GNN layer). (default: None)

  • edge_attr (torch.Tensor, optional) – The edge features (if supported by the underlying GNN layer). (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each element to a specific example. Only needs to be passed in case the underlying normalization layers require the batch information. (default: None)

  • batch_size (int, optional) – The number of examples \(B\). Automatically calculated if not given. Only needs to be passed in case the underlying normalization layers require the batch information. (default: None)

  • num_sampled_nodes_per_hop (List[int], optional) – The number of sampled nodes per hop. Useful in NeighborLoader scenarios to only operate on minimal-sized representations. (default: None)

  • num_sampled_edges_per_hop (List[int], optional) – The number of sampled edges per hop. Useful in NeighborLoader scenarios to only operate on minimal-sized representations. (default: None)

Return type:

Tensor
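Because edge_weight and edge_attr are only meaningful when the underlying GNN layer supports them, a caller may want to filter these arguments using the supports_edge_weight and supports_edge_attr attributes. The sketch below shows one way to do that; `optional_forward_kwargs` is a hypothetical helper, and a `SimpleNamespace` stands in for a real model so the snippet runs without any dependencies.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional


def optional_forward_kwargs(model: Any,
                            edge_weight: Optional[Any] = None,
                            edge_attr: Optional[Any] = None) -> Dict[str, Any]:
    """Keep only the optional edge inputs that the model declares support for."""
    kwargs: Dict[str, Any] = {}
    if edge_weight is not None and getattr(model, "supports_edge_weight", False):
        kwargs["edge_weight"] = edge_weight
    if edge_attr is not None and getattr(model, "supports_edge_attr", False):
        kwargs["edge_attr"] = edge_attr
    return kwargs


# Stand-in object exposing the documented capability flags (not a real model).
stand_in = SimpleNamespace(supports_edge_weight=True, supports_edge_attr=False)

kwargs = optional_forward_kwargs(
    stand_in,
    edge_weight=[0.5, 1.0],      # placeholder for a weight tensor
    edge_attr=[[1.0], [2.0]],    # placeholder for an edge feature tensor
)
# With a real model one would then call: out = model(x, edge_index, **kwargs)
```

How the encoder reacts to an unsupported argument (error or silent ignore) is not documented here, so filtering up front avoids relying on either behavior.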

inference_per_layer(layer: int, x: Tensor, edge_index: Union[Tensor, SparseTensor], batch_size: int) → Tensor

Inference for a single layer.

Return type:

Tensor

inference(loader: NeighborLoader, device: Optional[Union[device, str]] = None, embedding_device: Union[str, device] = 'cpu', progress_bar: bool = False, cache: bool = False) → Tensor

Performs layer-wise inference on large graphs using a NeighborLoader that samples the full neighborhood for only one layer. This is an efficient way to compute the output embeddings for all nodes in the graph. Only applicable when jk=None or jk="last".

Parameters:
  • loader (torch_geometric.loader.NeighborLoader) – A neighbor loader object that generates full 1-hop subgraphs, i.e., loader.num_neighbors = [-1].

  • device (torch.device, optional) – The device to run the GNN on. (default: None)

  • embedding_device (torch.device, optional) – The device to store intermediate embeddings on. If intermediate embeddings fit on GPU, this option helps to avoid unnecessary device transfers. (default: "cpu")

  • progress_bar (bool, optional) – If set to True, will print a progress bar during computation. (default: False)

  • cache (bool, optional) – If set to True, caches intermediate sampler outputs for usage in later epochs. This will avoid repeated sampling to accelerate inference. (default: False)

Return type:

Tensor
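The idea behind layer-wise inference can be sketched with a toy example in plain Python: each layer is applied to every node, in batches, using the full 1-hop neighborhood, and only then does the computation move on to the next layer. The names `mean_layer` and `layerwise_inference` are hypothetical, scalar features replace tensors, and a simple "self plus mean of neighbors" rule stands in for a real message passing layer.

```python
from typing import Callable, Dict, List

Graph = Dict[int, List[int]]                 # node -> list of neighbor nodes
Layer = Callable[[float, List[float]], float]


def mean_layer(scale: float) -> Layer:
    """Toy layer: scale * (own feature + mean of neighbor features)."""
    def layer(h_self: float, h_neighbors: List[float]) -> float:
        pooled = sum(h_neighbors) / len(h_neighbors) if h_neighbors else 0.0
        return scale * (h_self + pooled)
    return layer


def layerwise_inference(graph: Graph, x: Dict[int, float],
                        layers: List[Layer], batch_size: int = 2) -> Dict[int, float]:
    """Compute embeddings for all nodes one layer at a time."""
    h = dict(x)
    nodes = sorted(graph)
    for layer in layers:
        h_next: Dict[int, float] = {}
        # Process nodes in mini-batches; each needs only its full 1-hop
        # neighborhood from the previous layer's embeddings.
        for start in range(0, len(nodes), batch_size):
            for v in nodes[start:start + batch_size]:
                h_next[v] = layer(h[v], [h[u] for u in graph[v]])
        h = h_next  # all nodes are done before the next layer starts
    return h


path_graph = {0: [1], 1: [0, 2], 2: [1]}     # a path 0 - 1 - 2
features = {0: 1.0, 1: 2.0, 2: 3.0}
embeddings = layerwise_inference(path_graph, features,
                                 [mean_layer(1.0), mean_layer(0.5)])
```

Since every layer reads only the previous layer's stored embeddings, memory per batch stays bounded by the 1-hop neighborhood rather than growing with depth. This also suggests why the method is limited to jk=None or jk="last": only the most recent layer's embeddings are carried forward.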