pyagc.clusters
The pyagc.clusters package provides modular Cluster Head implementations for attributed graph clustering.
Following the Encode-Cluster-Optimize framework, a cluster head transforms latent node embeddings \(\mathbf{Z} \in \mathbb{R}^{N \times H}\) into a cluster assignment matrix \(\mathbf{P} \in \mathbb{R}^{N \times K}\):
All cluster heads inherit from BaseClusterHead, ensuring a consistent interface for clustering, loss computation, and assignment retrieval.
This modular design allows any encoder backbone to be easily paired with different clustering mechanisms — simply swap the cluster head to isolate the effect of the clustering objective:
from pyagc.clusters import DECClusterHead, DMoNClusterHead, KMeansClusterHead
# Differentiable prototypical clustering:
dec_head = DECClusterHead(n_clusters=7, n_features=256)
# Differentiable discriminative clustering:
dmon_head = DMoNClusterHead(n_clusters=7, n_features=256)
# Post-hoc discrete clustering:
kmeans_head = KMeansClusterHead(n_clusters=7, backend="torch")
Based on the differentiability of the cluster projection, which dictates the feasibility of end-to-end optimization, the cluster heads are organized into two categories:
Differentiable Cluster Heads treat clustering as a differentiable layer within a neural network, allowing gradients to flow from the clustering loss back to the encoder. This includes discriminative approaches (e.g.,
DMoNClusterHead,MinCutClusterHead,NeuromapClusterHead) and prototypical approaches (e.g.,DECClusterHead,DinkClusterHead,INCClusterHead).Discrete Cluster Heads apply a non-differentiable clustering algorithm to fixed representations, decoupling clustering from representation learning (e.g.,
KMeansClusterHead).
Base Class
Base class for clustering heads in neural clustering models. |
- class BaseClusterHead[source]
-
Base class for clustering heads in neural clustering models.
- abstract cluster(*args: Any, **kwargs: Any) Tensor[source]
Predicts cluster assignments.
- Returns:
Tensor–:If soft=False,
(n_samples,)tensor of cluster indices.If soft=True,
(n_samples, n_clusters)tensor of probabilities.
Differentiable Cluster Heads
Neural Clustering Layer proposed in the "Unsupervised Deep Embedding for Clustering Analysis" paper (Xie et al., ICML 2016). |
|
Neural Clustering Layer proposed in the "Dink-Net: Neural Clustering on Large Graphs" paper (Liu et al., ICML 2023). |
|
Deep Modularity Network (DMoN) Clustering Head proposed in the "Graph Clustering with Graph Neural Networks" paper (Tsitsulin et al., JMLR 2023). |
|
Neural Clustering Layer proposed in the "XAI Beyond Classification: Interpretable Neural Clustering" paper (Peng et al., JMLR 2022). |
|
MinCut Clustering Head proposed in "Spectral Clustering in Graph Neural Networks for Graph Pooling" (Bianchi et al., ICML 2019). |
|
Neuromap Clustering Head from the paper "The Map Equation Goes Neural: Mapping Network Flows with Graph Neural Networks" paper (Blöcker et al., NeurIPS 2024). |
|
Stochastic Block Model (SBM) Clustering Head from the paper "Differentiable Community Detection with Graph Neural Networks and Stochastic Block Models" (Arliss & Mueller, LoG 2025). |
|
Graph Matching SBM Clustering Head from the paper "Differentiable Community Detection with Graph Neural Networks and Stochastic Block Models" (Arliss & Mueller, LoG 2025). |
- class DECClusterHead(n_clusters: int, n_features: int, alpha: float = 1.0)[source]
Bases:
BaseClusterHeadNeural Clustering Layer proposed in the “Unsupervised Deep Embedding for Clustering Analysis” paper (Xie et al., ICML 2016).
This layer learns cluster centers and computes soft assignment of input samples to clusters using Student’s t-distribution.
Specifically, the probability \(q_{ij}\) that a sample \(i\) belongs to cluster \(j\) is given by:
\[q_{ij} = \frac{(1 + \|z_i - \mu_j\|^2 / \alpha)^{-\frac{\alpha+1}{2}}} {\sum_{j'} (1 + \|z_i - \mu_{j'}\|^2 / \alpha)^{-\frac{\alpha+1}{2}}}\]where \(z_i\) is the embedded point and \(\mu_j\) is the j-th cluster center, and \(\alpha\) is the degrees of freedom of the Student’s t-distribution (default is 1).
The target distribution \(p_{ij}\) is computed as:
\[p_{ij} = \frac{q_{ij}^2 / \sum_i q_{ij}}{\sum_j (q_{ij}^2 / \sum_i q_{ij})}\]The loss is the KL divergence between the soft assignments \(q\) and the target distribution \(p\):
\[L = \text{KL}(P \| Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}\]- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- forward(z: Tensor, update_target: bool = True) Tensor[source]
Computes the KL divergence loss between the soft assignments and the target distribution.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).update_target (bool, optional) – Whether to recompute the target distribution P. If
False, uses the cached distribution. This is useful for maintaining training stability by updating the target less frequently. (default:True)
- Returns:
Tensor– Scalar loss tensor.
- cluster(z: Tensor, soft: bool = False) Tensor[source]
Predicts cluster assignments.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
- class DinkClusterHead(n_clusters: int, n_features: int)[source]
Bases:
BaseClusterHeadNeural Clustering Layer proposed in the “Dink-Net: Neural Clustering on Large Graphs” paper (Liu et al., ICML 2023).
This layer models cluster centers as learnable parameters and optimizes clustering assignments via cluster dilation and shrink losses:
\[\mathcal{L}_\text{dilation} = \frac{-1}{(K-1)K} \sum_{i=1}^{K} \sum_{j=1, j\neq i}^{K} \| \mathbf{C}_i - \mathbf{C}_j \|_2^2\]\[\mathcal{L}_\text{shrink} = \frac{1}{BK} \sum_{i=1}^{B} \sum_{j=1}^{K} \| \mathbf{Z}_i - \mathbf{C}_j \|_2^2\]where \(\mathbf{C}_i\) is the i-th cluster center, \(\mathbf{Z}_i\) is the i-th embedding in a batch, \(B\) is the batch size, and \(K\) is the number of clusters. The cluster dilation loss pushes cluster centers away from each other to expand the clustering space, while the cluster shrink loss pulls node embeddings toward all cluster centers to avoid confirmation bias.
- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- forward(z: Tensor) Tuple[Tensor, Tensor][source]
Compute the clustering loss.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).- Returns:
dilation_loss and shrink_loss
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- cluster(z: Tensor, soft: bool = False) Tensor[source]
Predicts cluster assignments.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
- class DMoNClusterHead(n_clusters: int, n_features: int)[source]
Bases:
BaseClusterHeadDeep Modularity Network (DMoN) Clustering Head proposed in the “Graph Clustering with Graph Neural Networks” paper (Tsitsulin et al., JMLR 2023).
This layer learns a soft cluster assignment matrix \(\mathbf{S}\) by projecting node embeddings \(\mathbf{Z}\) into \(K\) clusters using a linear transformation followed by a softmax. It optimizes the clustering structure with two objectives:
(1) Spectral modularity loss:
\[\mathcal{L}_s = - \frac{1}{2m} \mathrm{Tr}(\mathbf{S}^\top \mathbf{B} \mathbf{S})\]where \(\mathbf{B} = \mathbf{A} - \frac{\mathbf{d}\mathbf{d}^\top}{2m}\) is the modularity matrix, and \(m\) is the total number of edges.
(2) Collapse regularization loss:
\[\mathcal{L}_c = \frac{\sqrt{K}}{N} \left\| \sum_i \mathbf{S}_i^\top \right\|_F - 1\]which prevents unbalanced cluster sizes.
- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- forward(z: Tensor, edge_index: Tensor) Tuple[Tensor, Tensor][source]
Computes DMoN clustering objectives using node embeddings and graph structure.
- Parameters:
z (torch.Tensor) – Node embeddings of shape
(N, F).edge_index (torch.Tensor) – Edge indices of shape
(2, E).
- Returns:
modularity_loss and collapse_loss
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- cluster(z: Tensor, soft: bool = False) Tensor[source]
Predicts cluster assignments.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
- class INCClusterHead(n_clusters: int, n_features: int, alpha: float = 0.001)[source]
Bases:
BaseClusterHeadNeural Clustering Layer proposed in the “XAI Beyond Classification: Interpretable Neural Clustering” paper (Peng et al., JMLR 2022).
This layer implements a differentiable k-means reformulation, where cluster centers are learnable parameters.
The cluster centers \(\mathbf{\Omega}_j\) and weights \(\mathbf{W}_j\) are related via:
\[\mathbf{W}_j = 2 \alpha \mathbf{\Omega}_j\]The loss is defined as:
\[L = \frac{1}{\alpha} \sum_{i} (2\alpha - \mathbf{W}_{c(i)}^T \mathbf{z}_i)\]where \(c(i)\) is the nearest cluster to sample \(i\).
- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape (n_clusters, n_features) to initialize the cluster centers. If None, use Xavier uniform initialization.
- Return type:
- normalize_cluster_centers() None[source]
Normalize cluster centers and apply alpha scaling.
- Return type:
- compute_cluster_center() Tensor[source]
Compute the actual cluster centers (scaled by alpha).
- Returns:
Tensor– Scaled cluster centers.
- forward(z: Tensor) Tensor[source]
Compute the clustering loss.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).- Returns:
Tensor– Scalar loss tensor.
- cluster(z: Tensor, soft: bool = False) Tensor[source]
Predicts cluster assignments.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
- class MinCutClusterHead(n_clusters: int, n_features: int, temperature: float = 1.0)[source]
Bases:
BaseClusterHeadMinCut Clustering Head proposed in “Spectral Clustering in Graph Neural Networks for Graph Pooling” (Bianchi et al., ICML 2019).
This layer learns a soft cluster assignment matrix \(\mathbf{S}\) by projecting node embeddings \(\mathbf{Z}\) into \(K\) clusters. It jointly optimizes two objectives:
(1) MinCut loss:
\[\mathcal{L}_{\text{mincut}} = - \frac{\mathrm{Tr}(\mathbf{S}^\top \mathbf{A} \mathbf{S})} {\mathrm{Tr}(\mathbf{S}^\top \mathbf{D} \mathbf{S})}\]where \(\mathbf{D}\) is the degree matrix.
(2) Orthogonality loss:
\[\mathcal{L}_{\text{ortho}} = \left\| \frac{\mathbf{S}^\top \mathbf{S}}{\|\mathbf{S}^\top \mathbf{S}\|_F} - \frac{\mathbf{I}_K}{\sqrt{K}} \right\|_F\]which encourages near-orthogonal cluster assignments.
- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- forward(z: Tensor, edge_index: Tensor) Tuple[Tensor, Tensor][source]
Compute MinCut and Orthogonality losses given node embeddings and graph structure.
- Parameters:
z (torch.Tensor) – Node embeddings \((N, F)\).
edge_index (torch.Tensor) – Edge indices \((2, E)\).
- Returns:
mincut_loss and ortho_loss
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- class NeuromapClusterHead(n_clusters: int, n_features: int, alpha: float = 0.15, n_iters: int = 100)[source]
Bases:
BaseClusterHeadNeuromap Clustering Head from the paper “The Map Equation Goes Neural: Mapping Network Flows with Graph Neural Networks” paper (Blöcker et al., NeurIPS 2024).
This module implements a differentiable version of the map equation for end-to-end optimization with (graph) neural networks.
It learns soft cluster assignments \(\mathbf{S}\) via a linear projection from node embeddings \(\mathbf{Z}\), and computes the Neuromap loss (expected per-step description length) following the Minimum Description Length principle:
\[\mathcal{L}(A, S) = q \log q - \sum_m q_m \log q_m - \sum_m m_{\text{exit}} \log m_{\text{exit}} - \sum_u p_u \log p_u + \sum_m p_m \log p_m\]where all quantities are computed from the soft cluster assignment matrix.
- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- build_flow(edge_index: Tensor, N: int)[source]
Construct sparse flow matrix F and stationary distribution p.
- forward0(z: Tensor, edge_index: Tensor) Tensor[source]
Compute the Neuromap (Map Equation) loss given node embeddings and adjacency.
- Parameters:
z (torch.Tensor) – Node embeddings of shape
(N, F).edge_index (torch.Tensor) – Edge indices of shape
(2, E).
- Returns:
Map equation loss (codelength)
- Return type:
- forward(z: Tensor, edge_index: Tensor) Tuple[Tensor, Tensor][source]
Compute the Neuromap (Map Equation) loss given node embeddings and adjacency.
- Parameters:
z (torch.Tensor) – Node embeddings of shape
(N, F).edge_index (torch.Tensor) – Edge indices of shape
(2, E).
- Returns:
Map equation loss (codelength) and collapse_loss
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- cluster(z: Tensor, soft: bool = False) Tensor[source]
Predicts cluster assignments.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
- class SBMClusterHead(n_clusters: int, n_features: int, variant: str = 'bernoulli', eta: float = 3.0)[source]
Bases:
BaseClusterHeadStochastic Block Model (SBM) Clustering Head from the paper “Differentiable Community Detection with Graph Neural Networks and Stochastic Block Models” (Arliss & Mueller, LoG 2025).
This head learns cluster assignments by maximizing the likelihood of an SBM-based generative model. It supports both Bernoulli and Poisson variants, with optional degree correction.
The cluster assignment matrix \(\mathbf{P} \in [0,1]^{N \times K}\) is obtained via softmax transformation of the similarity between node embeddings and learnable cluster centers, and the structure matrix \(\mathbf{\Theta} \in \mathbb{R}^{K \times K}\) is estimated via MLE as:
\[\hat{\Theta}_{ij} = \frac{M_{ij}}{n_i n_j}\]where \(M_{ij}\) is the number of edges between communities \(i\) and \(j\), and \(n_i\) is the number of nodes in community \(i\).
Loss Functions:
(1) Bernoulli SBM:
\[\mathcal{L}_B = -\sum_{(u,v) \in E} \ln(\pi_{uv}) - \eta^{-1} \sum_{(u,v) \notin E} \ln(1 - \pi_{uv})\]where \(\pi_{uv} = \mathbf{P}_u^T \hat{\Theta} \mathbf{P}_v\).
(2) Poisson SBM:
\[\mathcal{L}_P = -\sum_{(u,v) \in E} [\ln(\pi_{uv}) - \pi_{uv}] + \eta^{-1} \sum_{(u,v) \notin E} \pi_{uv}\](3) Degree-Corrected variants:
For degree correction, the expected value becomes \(\phi_u \phi_v \mathbf{P}_u^T \hat{\Theta} \mathbf{P}_v\), where:
\[\hat{\phi}_u = (\mathbf{P}_u^T \mathbf{n}) \frac{d_u}{\mathbf{P}_u^T \boldsymbol{\delta}}\]with \(\boldsymbol{\delta}\) being the sum of degrees in each community.
- Parameters:
n_clusters (int) – Number of clusters.
n_features (int) – Feature dimension of input node embeddings.
variant (str, optional) – SBM variant to use. Options:
'bernoulli','poisson','bernoulli-dc','poisson-dc'. (default:'bernoulli')eta (float, optional) – Negative sampling ratio (number of negative samples per positive edge). (default:
3.0)
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- forward(z: Tensor, edge_index: Tensor, num_neg_samples: Optional[int] = None) Tuple[Tensor, Tensor][source]
Computes the SBM loss.
- Parameters:
z (torch.Tensor) – Node embeddings of shape
(N, F).edge_index (torch.Tensor) – Edge indices of shape
(2, E).num_neg_samples (int, optional) – Number of negative samples. If None, uses
eta * num_edges. (default:None)
- Returns:
Tuple[Tensor,Tensor] – Tuple of (likelihood_loss, regularization_loss).
- class SBMMatchClusterHead(n_clusters: int, n_features: int)[source]
Bases:
BaseClusterHeadGraph Matching SBM Clustering Head from the paper “Differentiable Community Detection with Graph Neural Networks and Stochastic Block Models” (Arliss & Mueller, LoG 2025).
This variant uses the Graph Matching objective, which aligns the graph with its community representation by minimizing:
\[\mathcal{L}_{Match} = -\mathrm{tr}(\mathbf{A}^T \mathbf{P} \hat{\Theta} \mathbf{P}^T)\]This approach exploits matrix sparsity and is significantly faster than edge sampling methods.
- Parameters:
- reset_cluster_centers(cluster_centers: Optional[Tensor] = None) None[source]
Manually sets the cluster centers.
- Parameters:
cluster_centers (torch.Tensor, optional) – Tensor of shape
(n_clusters, n_features)to initialize the cluster centers. If None, use Xavier uniform initialization.- Return type:
- forward(z: Tensor, edge_index: Tensor) Tuple[Tensor, Tensor][source]
Computes the Graph Matching loss.
- Parameters:
z (torch.Tensor) – Node embeddings of shape
(N, F).edge_index (torch.Tensor) – Edge indices of shape
(2, E).
- Returns:
Tuple[Tensor,Tensor] – Tuple of (likelihood_loss, regularization_loss).
Discrete Cluster Heads
The K-Means clustering head with fixed cluster centers. |
- class KMeansClusterHead(n_clusters: int, backend: str = 'torch', n_init: int = 10, max_iter: int = 300, random_state: Optional[int] = None)[source]
Bases:
BaseClusterHeadThe K-Means clustering head with fixed cluster centers.
This module performs clustering using the
TorchKMeansorsklearn.cluster.KMeansalgorithm, and stores the resulting cluster centers for inference. Once fitted, thecluster()method can be used to assign new points based on the stored centers.Note
This class does not learn trainable parameters and does not define a clustering loss. It is typically used for post-hoc or plug-in clustering.
- Parameters:
n_clusters (int) – Number of clusters.
backend (str, optional) – The backend to use for K-Means, either
"torch"or"triton"or"sklearn". (default:"torch")n_init (int, optional) – Number of K-Means initializations to run. (default:
10)max_iter (int, optional) – Maximum number of iterations per K-Means run. (default:
300)
- fit_predict(z: Tensor) Tensor[source]
Performs k-means clustering on the input data and returns cluster labels.
- Parameters:
z (torch.Tensor) – The input data of shape
(n_samples, n_features).- Returns:
Tensor– Cluster assignments of shape(n_samples,).
- cluster(z: Tensor, soft: bool = False) Tensor[source]
Assigns samples to clusters based on fixed cluster centers.
This function computes the squared Euclidean distance to each center and returns either hard assignments or soft probabilities.
- Parameters:
z (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
GPU-Accelerated KMeans
To overcome the computational bottleneck of CPU-based KMeans when \(N > 10^6\), pyagc.clusters provides GPU-accelerated KMeans implementations using PyTorch and OpenAI Triton, achieving multi-fold speedups over CPU baselines:
from pyagc.clusters import TorchKMeans, TritonKMeans
# PyTorch-based GPU KMeans:
kmeans = TorchKMeans(n_clusters=7, metric="euclidean", max_iter=300)
kmeans.fit(embeddings)
labels = kmeans.labels_
# Triton-accelerated GPU KMeans (requires CUDA):
kmeans = TritonKMeans(n_clusters=7, metric="euclidean", max_iter=300)
kmeans.fit(embeddings)
labels = kmeans.labels_
A PyTorch-based KMeans clustering implementation supporting both Euclidean and Cosine distance metrics, with optional distributed training. |
|
Fast K-Means clustering implemented with Triton GPU kernels. |
- class TorchKMeans(metric: str = 'euclidean', init: Union[str, Tensor] = 'k-means++', random_state: Optional[int] = None, n_clusters: int = 8, n_init: int = 10, max_iter: int = 300, tol: float = 0.0001, distributed: bool = False, verbose: bool = False)[source]
Bases:
objectA PyTorch-based KMeans clustering implementation supporting both Euclidean and Cosine distance metrics, with optional distributed training. This implementation is adapted from: Hzzone/torch_clustering.
- Parameters:
metric (str, optional) – Distance metric to use:
'euclidean'or'cosine'. (default:'euclidean')init (str or torch.Tensor, optional) – Method for initialization:
'k-means++','random'or user-specified tensor of shape(n_clusters, n_features). (default:'k-means++')random_state (int, optional) – Random seed for initialization. (default:
None)n_clusters (int, optional) – Number of clusters. (default:
8)n_init (int, optional) – Number of times the algorithm will be run with different centroid seeds. (default:
10)max_iter (int, optional) – Maximum number of iterations of the k-means algorithm for a single run. (default:
300)tol (float, optional) – Relative tolerance with regards to inertia to declare convergence. (default:
1e-4)distributed (bool, optional) – Whether to use distributed training. (default:
False)verbose (bool, optional) – Whether to print progress information. (default:
False)
- initialize(X: Tensor, random_state: int) Tensor[source]
Initializes the cluster centers.
- Parameters:
X (torch.Tensor) – The input data of shape
(n_samples, n_features).random_state (int) – The random seed.
- Returns:
Tensor– Initialized cluster centers of shape(n_clusters, n_features).
- fit_predict(X: Tensor) Tensor[source]
Performs k-means clustering on the input data and returns cluster labels.
- Parameters:
X (torch.Tensor) – The input data of shape
(n_samples, n_features).- Returns:
Tensor– Cluster assignments of shape(n_samples,).
- predict(X: Tensor, soft: bool = False) Tensor[source]
Assigns samples to clusters based on fixed cluster centers.
This function computes the squared Euclidean distance to each center and returns either hard assignments or soft probabilities.
- Parameters:
X (torch.Tensor) – Input tensor of shape
(n_samples, n_features).soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default:
False)
- Returns:
Tensor–:If
softis False,(n_samples,)tensor of cluster indices.If
softis True,(n_samples, n_clusters)tensor of probabilities.
- class TritonKMeans(*args, **kwargs)[source]
Bases:
objectFast K-Means clustering implemented with Triton GPU kernels.
This implementation provides an interface compatible with TorchKMeans while leveraging Triton kernels for improved performance.
- Parameters:
metric (str, default='euclidean') – Distance metric to use. Options: ‘euclidean’, ‘cosine’, ‘dot’
init (str or torch.Tensor, default='k-means++') – Method for initialization: ‘k-means++’, ‘random’ or user-specified tensor of shape (n_clusters, n_features).
random_state (int, optional) – Random seed for centroid initialization.
n_clusters (int, default=8) – Number of clusters (k).
n_init (int, default=10) – Number of times the algorithm will be run with different centroid seeds. The final result will be the best output of n_init consecutive runs.
max_iter (int, default=300) – Maximum number of iterations for a single run.
tol (float, default=1e-4) – Relative tolerance with regards to inertia to declare convergence.
verbose (bool, default=False) – Whether to print per-iteration info.
dtype (torch.dtype, optional) – Compute data type for algorithm.
device (torch.device, optional) – Target device. Defaults to “cuda:0” when available. Currently, only CUDA devices are supported.
distributed (bool, default=False) – Reserved for future distributed training support (currently not implemented).