Gaussianmixture
- class geoanalytics.clustering.Gaussianmixture.Gaussianmixture(dataframe)[source]
Bases:
object
About this algorithm
- Description:
Gaussian Mixture Model (GMM) is a probabilistic clustering algorithm that models the data as a mixture of Gaussian distributions. This implementation clusters on the feature columns, tracks runtime and memory usage, and can export the resulting labels, component weights, and component means.
- Parameters:
The dataset (a pandas DataFrame) must be provided when the object is created.
No other parameters are required at instantiation.
- Attributes:
df (pd.DataFrame) – The input data with ‘x’, ‘y’ coordinates and features.
labelsDF (pd.DataFrame) – DataFrame containing ‘x’, ‘y’, and assigned cluster labels.
model (GaussianMixture) – The trained scikit-learn GaussianMixture model instance for reuse or further analysis.
Execution methods
Calling from a Python program
    import pandas as pd
    from geoanalytics.clustering import Gaussianmixture

    df = pd.read_csv("input.csv")
    gm = Gaussianmixture(df)

    # run() returns a tuple: (labels DataFrame, component weights, component means)
    output = gm.run(n_components=3)
    labels_df = output[0]
    weights = output[1]
    centers = output[2]

    gm.save('GaussianMixtureLabels.csv')
Credits
This implementation was created by Raashika and revised by M.Charan Teja under the guidance of Professor Rage Uday Kiran.
- getMemoryRSS()[source]
Prints the memory usage (RSS) of the process in kilobytes.
- getMemoryUSS()[source]
Prints the memory usage (USS) of the process in kilobytes.
- getRuntime()[source]
Prints the total runtime of the clustering algorithm.
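The three reporting methods above can be approximated outside the class. The following is a minimal sketch, assuming the measurements come from `psutil` and a wall-clock timer; the source does not confirm how the class collects them, and the `measure` helper is hypothetical.

```python
import os
import time

import psutil


def measure(func, *args, **kwargs):
    """Hypothetical helper: run func, then report runtime (s), RSS and USS (KB)."""
    process = psutil.Process(os.getpid())
    start = time.perf_counter()
    result = func(*args, **kwargs)
    runtime = time.perf_counter() - start
    # psutil reports memory in bytes; convert to kilobytes.
    rss_kb = process.memory_info().rss / 1024
    uss_kb = process.memory_full_info().uss / 1024
    return result, runtime, rss_kb, uss_kb


# Stand-in workload; in practice this would be the clustering call.
result, runtime, rss_kb, uss_kb = measure(sum, range(1_000_000))
print(f"Runtime: {runtime:.4f} s, RSS: {rss_kb:.0f} KB, USS: {uss_kb:.0f} KB")
```

RSS counts all resident pages of the process, including shared ones, while USS counts only the pages unique to it, so USS is usually the smaller figure.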
- run(n_components=4, max_iters=100, covariance_type='full', init_params='kmeans', random_state=0)[source]
Performs Gaussian Mixture Model clustering on the feature columns.
- Parameters:
n_components – Number of Gaussian components (clusters) to use.
max_iters – Maximum number of iterations allowed for the EM (expectation-maximization) algorithm.
covariance_type – Type of covariance parameters (‘full’, ‘tied’, ‘diag’, ‘spherical’).
init_params – Initialization method (‘kmeans’ or ‘random’).
random_state – Random seed to ensure reproducibility.
- Returns:
Tuple of (DataFrame with labels, array of component weights, array of component means).
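To illustrate what the parameters and the returned tuple correspond to, here is a minimal sketch that calls scikit-learn's `GaussianMixture` directly on synthetic data. It assumes that clustering uses every column except 'x' and 'y' and that the label column is named `labels`; neither detail is confirmed by the source, and the feature columns `f1` and `f2` are made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the input DataFrame: coordinates plus two feature columns.
rng = np.random.default_rng(0)
n = 150
features = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(n // 3, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(n // 3, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(n // 3, 2)),
])
df = pd.DataFrame({
    "x": rng.uniform(0, 100, size=n),
    "y": rng.uniform(0, 100, size=n),
    "f1": features[:, 0],
    "f2": features[:, 1],
})

# Assumption: only the non-coordinate columns are clustered.
X = df.drop(columns=["x", "y"]).to_numpy()

model = GaussianMixture(
    n_components=3,
    max_iter=100,            # scikit-learn's name for the iteration cap
    covariance_type="full",
    init_params="kmeans",
    random_state=0,
)
labels = model.fit_predict(X)

# Mirror the documented return tuple: labels DataFrame, weights, means.
labels_df = df[["x", "y"]].copy()
labels_df["labels"] = labels   # hypothetical column name
weights = model.weights_       # shape (n_components,), sums to 1
means = model.means_           # shape (n_components, n_features)
```

Note that scikit-learn spells the iteration limit `max_iter`, whereas `run()` exposes it as `max_iters`.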
- save(outputFileLabels='GaussianMixtureLabels.csv', outputFileWeights='GaussianMixtureWeights.csv', outputFileMeans='GaussianMixtureMeans.csv')[source]
Saves labels, weights, and means to separate CSV files.
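A rough equivalent of the three output files can be produced with pandas. This is only a sketch: the column names and the exact CSV layout written by `save()` are assumptions, and the sample values are invented for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

# Invented sample results standing in for the output of run().
labels_df = pd.DataFrame({"x": [0.0, 1.0, 2.0], "y": [0.0, 1.0, 2.0], "labels": [0, 1, 1]})
weights = np.array([0.4, 0.6])
means = np.array([[0.1, 0.2], [5.0, 5.1]])

# Write each result to its own CSV file, using the default file names from save().
out_dir = Path(tempfile.mkdtemp())
labels_df.to_csv(out_dir / "GaussianMixtureLabels.csv", index=False)
pd.DataFrame({"weight": weights}).to_csv(out_dir / "GaussianMixtureWeights.csv", index=False)
pd.DataFrame(means).to_csv(out_dir / "GaussianMixtureMeans.csv", index=False)

# Read one file back to confirm the round trip.
saved_labels = pd.read_csv(out_dir / "GaussianMixtureLabels.csv")
```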