pyagc.clusters.TorchKMeans
- class TorchKMeans(metric: str = 'euclidean', init: Union[str, Tensor] = 'k-means++', random_state: Optional[int] = None, n_clusters: int = 8, n_init: int = 10, max_iter: int = 300, tol: float = 0.0001, distributed: bool = False, verbose: bool = False)
Bases: object
A PyTorch-based KMeans clustering implementation supporting both Euclidean and cosine distance metrics, with optional distributed training. This implementation is adapted from: Hzzone/torch_clustering.
- Parameters:
metric (str, optional) – Distance metric to use: 'euclidean' or 'cosine'. (default: 'euclidean')
init (str or torch.Tensor, optional) – Method for initialization: 'k-means++', 'random', or a user-specified tensor of shape (n_clusters, n_features). (default: 'k-means++')
random_state (int, optional) – Random seed for initialization. (default: None)
n_clusters (int, optional) – Number of clusters. (default: 8)
n_init (int, optional) – Number of times the algorithm will be run with different centroid seeds. (default: 10)
max_iter (int, optional) – Maximum number of iterations of the k-means algorithm for a single run. (default: 300)
tol (float, optional) – Relative tolerance with regard to inertia to declare convergence. (default: 1e-4)
distributed (bool, optional) – Whether to use distributed training. (default: False)
verbose (bool, optional) – Whether to print progress information. (default: False)
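The two values accepted by metric behave quite differently, which a small example may make clearer. The sketch below is plain Python rather than the library's PyTorch code, and the helper names squared_euclidean and cosine_distance are hypothetical. It assumes the usual definitions: squared Euclidean distance, and cosine distance as one minus cosine similarity.

```python
import math

def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """Cosine distance, 1 - cos(angle); assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

point = [1.0, 0.0]
scaled = [3.0, 0.0]    # same direction as point, different length
rotated = [0.0, 1.0]   # orthogonal to point, same length

d_euc_scaled = squared_euclidean(point, scaled)
d_cos_scaled = cosine_distance(point, scaled)
d_euc_rotated = squared_euclidean(point, rotated)
d_cos_rotated = cosine_distance(point, rotated)
```

In this sketch the scaled vector is far from the original under the Euclidean metric but at distance zero under the cosine metric, so 'cosine' is the natural choice when only the direction of a feature vector matters.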
- __init__(metric: str = 'euclidean', init: Union[str, Tensor] = 'k-means++', random_state: Optional[int] = None, n_clusters: int = 8, n_init: int = 10, max_iter: int = 300, tol: float = 0.0001, distributed: bool = False, verbose: bool = False)
Methods
__init__([metric, init, random_state, ...])
fit_predict(X) – Performs k-means clustering on the input data and returns cluster labels.
initialize(X, random_state) – Initializes the cluster centers.
predict(X[, soft]) – Assigns samples to clusters based on fixed cluster centers.
- initialize(X: Tensor, random_state: int) → Tensor
Initializes the cluster centers.
- Parameters:
X (torch.Tensor) – The input data of shape (n_samples, n_features).
random_state (int) – The random seed.
- Returns:
Tensor – Initialized cluster centers of shape (n_clusters, n_features).
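For readers unfamiliar with the 'k-means++' option, the following is a minimal sketch of the standard k-means++ seeding rule. It is plain Python, not the library's implementation, and the function name kmeans_plus_plus is hypothetical. It assumes the data contains at least n_clusters distinct points.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_plus_plus(points, n_clusters, random_state):
    """Pick n_clusters initial centers from points using k-means++ seeding."""
    rng = random.Random(random_state)
    # The first center is drawn uniformly at random.
    centers = [list(rng.choice(points))]
    while len(centers) < n_clusters:
        # Squared distance from each point to its nearest chosen center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        # Draw the next center with probability proportional to that distance,
        # so already-chosen points (weight 0) are never picked again.
        next_center = rng.choices(points, weights=weights, k=1)[0]
        centers.append(list(next_center))
    return centers

points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centers = kmeans_plus_plus(points, n_clusters=2, random_state=0)
centers_again = kmeans_plus_plus(points, n_clusters=2, random_state=0)
```

Passing the same seed twice yields the same centers in this sketch, which is the reproducibility a random_state argument is normally meant to provide.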
- fit_predict(X: Tensor) → Tensor
Performs k-means clustering on the input data and returns cluster labels.
- Parameters:
X (torch.Tensor) – The input data of shape (n_samples, n_features).
- Returns:
Tensor – Cluster assignments of shape (n_samples,).
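To show how n_init, max_iter and tol usually interact in k-means, here is a rough sketch of Lloyd's algorithm with restarts. It is plain Python with 'random' initialization, and fit_predict_sketch, assign and update are hypothetical names. The stopping rule (relative improvement in inertia at most tol) is an assumption based on the parameter description, not a statement about the library's internals.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Return the nearest-center index for each point and the total inertia."""
    labels, inertia = [], 0.0
    for p in points:
        dists = [squared_distance(p, c) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        labels.append(best)
        inertia += dists[best]
    return labels, inertia

def update(points, labels, centers):
    """Move each center to the mean of its points; empty clusters stay put."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))
    return new_centers

def fit_predict_sketch(points, n_clusters=2, n_init=3, max_iter=300,
                       tol=1e-4, random_state=0):
    rng = random.Random(random_state)
    best_labels, best_inertia = None, float("inf")
    for _ in range(n_init):  # one full run per restart
        centers = [list(p) for p in rng.sample(points, n_clusters)]
        labels, inertia = assign(points, centers)
        for _ in range(max_iter):
            centers = update(points, labels, centers)
            labels, new_inertia = assign(points, centers)
            # Assumed rule: stop once the relative gain in inertia is <= tol.
            converged = inertia - new_inertia <= tol * inertia
            inertia = new_inertia
            if converged:
                break
        if inertia < best_inertia:  # keep the best run across restarts
            best_labels, best_inertia = labels, inertia
    return best_labels, best_inertia

points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, inertia = fit_predict_sketch(points)
```

On this toy data the two nearby pairs end up in separate clusters. Only the run with the lowest inertia is returned, which is why a larger n_init trades extra compute for a lower risk of a poor local optimum.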
- predict(X: Tensor, soft: bool = False) → Tensor
Assigns samples to clusters based on fixed cluster centers.
This function computes the squared Euclidean distance to each center and returns either hard assignments or soft probabilities.
- Parameters:
X (torch.Tensor) – Input tensor of shape (n_samples, n_features).
soft (bool, optional) – If True, returns the soft assignment matrix; if False, returns hard cluster assignments. (default: False)
- Returns:
Tensor –
If soft is False, a (n_samples,) tensor of cluster indices.
If soft is True, a (n_samples, n_clusters) tensor of probabilities.
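The difference between the two return shapes can be illustrated with a small plain-Python sketch. The name predict_sketch is hypothetical, and the soft branch uses a softmax over negative squared distances, which is one common choice. The formula the library actually uses for its probabilities is not stated here, so treat that part as an assumption.

```python
import math

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_sketch(points, centers, soft=False):
    """Return one cluster index per point, or one probability row per point."""
    results = []
    for p in points:
        dists = [squared_distance(p, c) for c in centers]
        if not soft:
            # Hard assignment: index of the nearest center.
            results.append(min(range(len(centers)), key=dists.__getitem__))
            continue
        # Assumed soft rule: softmax over negative squared distances.
        # Subtracting the minimum distance keeps exp() numerically stable.
        shift = min(dists)
        weights = [math.exp(-(d - shift)) for d in dists]
        total = sum(weights)
        results.append([w / total for w in weights])
    return results

centers = [[0.0, 0.0], [2.0, 0.0]]
points = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]

hard = predict_sketch(points, centers)             # one index per sample
soft = predict_sketch(points, centers, soft=True)  # one row per sample
```

Each soft row sums to one, and the middle point, which is equidistant from both centers, receives equal probability for each cluster.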