AffinityPropagationWrapper

class geoanalytics.clustering.AffinityPropagationWrapper.AffinityPropagationWrapper(dataframe)

Bases: object

About this algorithm

Description:

Affinity Propagation is a message-passing-based clustering algorithm that identifies exemplars (cluster centers) among the data points and forms clusters around these exemplars. This wrapper automatically tracks runtime and memory usage, and supports saving clustering outputs to CSV.
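The exemplar search described above can be illustrated with a short NumPy sketch of the standard responsibility and availability updates. This is not the wrapper's own code: the function name `affinity_propagation_sketch`, the use of negative squared Euclidean distance as the similarity, and the median default for `preference` are assumptions made for illustration.

```python
import numpy as np


def affinity_propagation_sketch(points, damping=0.5, max_iter=300,
                                convergence_iter=15, preference=None):
    """Cluster an (n, d) array; return (labels, exemplar_indices)."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    rows = np.arange(n)

    # Similarity s(i, k): negative squared Euclidean distance (assumed here).
    diff = points[:, None, :] - points[None, :, :]
    S = -np.sum(diff ** 2, axis=2)
    if preference is None:
        preference = np.median(S)  # assumed default
    np.fill_diagonal(S, preference)  # s(k, k) holds the preference

    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    exemplars = np.array([], dtype=int)
    stable = 0

    for _ in range(max_iter):
        # r(i, k) = s(i, k) - max over k' != k of [a(i, k') + s(i, k')]
        AS = A + S
        best = np.argmax(AS, axis=1)
        best_val = AS[rows, best]
        AS[rows, best] = -np.inf
        second_val = AS.max(axis=1)
        R_new = S - best_val[:, None]
        R_new[rows, best] = S[rows, best] - second_val
        R = damping * R + (1 - damping) * R_new

        # a(i, k) = min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
        # a(k, k) = sum over i' != k of max(0, r(i', k))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, np.diag(R))
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = np.diag(A_new).copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, self_avail)
        A = damping * A + (1 - damping) * A_new

        # Point k is an exemplar when a(k, k) + r(k, k) > 0.
        current = np.flatnonzero(np.diag(A) + np.diag(R) > 0)
        if current.size > 0 and np.array_equal(current, exemplars):
            stable += 1
            if stable >= convergence_iter:
                break
        else:
            stable = 0
        exemplars = current

    if exemplars.size == 0:
        return np.full(n, -1), exemplars  # no exemplar emerged
    labels = np.argmax(S[:, exemplars], axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return labels, exemplars


# Two well-separated groups of three points each.
pts = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, exemplars = affinity_propagation_sketch(pts)
print(labels, exemplars)
```

The sketch stops once the set of exemplars has stayed the same for `convergence_iter` consecutive iterations, which is one common reading of that parameter. Production implementations usually also add a small amount of noise to the similarities to break ties, which is omitted here.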

Parameters:
  • dataframe (pd.DataFrame) – The input dataset. It must be provided when the object is created.

  • No other parameters are required at instantiation.
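As an illustration of a suitable input, the DataFrame below has ‘x’ and ‘y’ coordinate columns plus feature columns, matching the description of the df attribute. The feature column names (`band1`, `band2`) and all values are made up for this sketch; the documentation does not prescribe them.

```python
import pandas as pd

# Hypothetical input: 'x' and 'y' are coordinates, the rest are features.
df = pd.DataFrame({
    "x": [0.0, 0.0, 1.0, 10.0],
    "y": [0.0, 1.0, 0.0, 10.0],
    "band1": [0.12, 0.15, 0.11, 0.80],
    "band2": [0.30, 0.28, 0.33, 0.05],
})

# Check that the coordinate columns are present before clustering.
missing = {"x", "y"} - set(df.columns)
if missing:
    raise ValueError(f"missing coordinate columns: {sorted(missing)}")

print(df.head())
```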

Attributes:
  • df (pd.DataFrame) – The input data, with ‘x’ and ‘y’ coordinate columns and feature columns.

  • labelsDF (pd.DataFrame) – DataFrame containing ‘x’, ‘y’, and assigned cluster labels.

  • centers (np.ndarray) – Coordinates of the identified cluster centers.

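  • startTime, endTime (float) – Timestamps taken at the start and end of clustering; their difference is the runtime.

  • memoryUSS, memoryRSS (float) – Unique set size (USS) and resident set size (RSS) of the process, in kilobytes.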
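The documentation does not say how the wrapper collects these figures. One common approach, sketched here under that caveat, takes timestamps around the call and reads the process memory with the third-party `psutil` package. The helper name `measure` and the keys of the returned dictionary are hypothetical, and the memory values are left as None when `psutil` is not installed.

```python
import os
import time

try:
    import psutil  # third-party; its use by the wrapper is an assumption
except ImportError:
    psutil = None


def measure(func, *args, **kwargs):
    """Run func and return (result, stats): runtime in seconds, memory in KB."""
    start_time = time.time()
    result = func(*args, **kwargs)
    end_time = time.time()

    stats = {"runtime_s": end_time - start_time, "uss_kb": None, "rss_kb": None}
    if psutil is not None:
        process = psutil.Process(os.getpid())
        # psutil reports bytes; divide by 1024 to get kilobytes.
        stats["uss_kb"] = process.memory_full_info().uss / 1024
        stats["rss_kb"] = process.memory_info().rss / 1024
    return result, stats


result, stats = measure(sum, range(1_000_000))
print(stats)
```

USS counts only the memory unique to the process, so it is usually the better estimate of what the clustering itself costs, while RSS also includes pages shared with other processes.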

Execution methods

Calling from a Python program

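import pandas as pd

# Import the class from its module (the path given in the class signature)
from geoanalytics.clustering.AffinityPropagationWrapper import AffinityPropagationWrapper

# The input needs 'x' and 'y' columns plus the feature columns
df = pd.read_csv("input.csv")

ap = AffinityPropagationWrapper(df)

# run() returns the labelled DataFrame and the cluster centers
labels_df, centers = ap.run()

# Each of these prints its measurement
ap.getRuntime()
ap.getMemoryUSS()
ap.getMemoryRSS()

ap.save('AffinityLabels.csv', 'AffinityCenters.csv')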

Credits

This implementation was created by Raashika and revised by M.Charan Teja under the guidance of Professor Rage Uday Kiran.

getMemoryRSS()

Prints the resident set size (RSS) of the process, in kilobytes.

getMemoryUSS()

Prints the unique set size (USS) of the process, in kilobytes. USS is the memory that would be freed if the process ended.

getRuntime()

Prints the total runtime of the clustering algorithm.

run(damping=0.5, max_iter=300, convergence_iter=15, affinity='euclidean', random_state=None, preference=None)

Executes the affinity propagation clustering algorithm.

Parameters:
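  • damping (float) – Damping factor: the fraction of each message's previous value that is kept when the message is updated, which helps avoid numerical oscillations.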

  • max_iter (int) – Maximum number of iterations.

  • convergence_iter (int) – Number of consecutive iterations without change after which the algorithm is considered to have converged.

  • affinity (str) – Metric used to compute the affinity matrix (‘euclidean’ or other).

  • random_state (int or None) – Random seed for reproducibility.

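  • preference (array_like or None) – Preference of each point for being chosen as an exemplar. Points with higher values are more likely to become exemplars, so higher preferences tend to produce more clusters.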

Returns:

A tuple: the DataFrame with ‘x’, ‘y’, and the assigned cluster labels, followed by the array of cluster centers.

Return type:

tuple (pd.DataFrame, np.ndarray)

save(outputFileLabels='AffinityLabels.csv', outputFileCenters='AffinityCenters.csv')

Saves the clustering results to CSV files.

Parameters:
  • outputFileLabels (str) – Filename for saving the labels (default: ‘AffinityLabels.csv’).

  • outputFileCenters (str) – Filename for saving the centers (default: ‘AffinityCenters.csv’).
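For readers who want to reproduce this step outside the wrapper, the following pandas sketch writes a labels table and a centers array to two CSV files. It is an illustration only: the ‘labels’ column name, the sample values, and the temporary output directory are assumptions, and the wrapper's actual file layout may differ.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical results shaped like the documented return values of run().
labels_df = pd.DataFrame({
    "x": [0.0, 1.0, 10.0],
    "y": [0.0, 0.0, 10.0],
    "labels": [0, 0, 1],
})
centers = np.array([[0.0, 0.0], [10.0, 10.0]])

out_dir = tempfile.mkdtemp()
labels_path = os.path.join(out_dir, "AffinityLabels.csv")
centers_path = os.path.join(out_dir, "AffinityCenters.csv")

# One CSV per result; index=False keeps the row index out of the files.
labels_df.to_csv(labels_path, index=False)
pd.DataFrame(centers).to_csv(centers_path, index=False)

print(pd.read_csv(labels_path))
```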