Agglomerative

class geoanalytics.clustering.Agglomerative.Agglomerative(dataframe)[source]

Bases: object

About this algorithm

Description:

Agglomerative Clustering is a bottom-up hierarchical clustering technique: each point starts as its own cluster, and the closest pair of clusters is merged repeatedly until the requested number of clusters remains. This wrapper applies agglomerative clustering to multidimensional feature data, tracks runtime and memory usage, and exports the resulting labels.
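To make the merge step concrete, here is a minimal pure-Python sketch of agglomerative clustering. It is an illustration only, not this wrapper's implementation: the source does not state which linkage criterion or backend the wrapper uses, so single linkage is assumed here for simplicity, and the function name `agglomerative` is hypothetical.

```python
import math


def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering; returns one label per point."""
    # Start with every point in its own cluster (lists of point indices).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage (an assumption): distance between the two closest members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    # Merge the closest pair of clusters until only n_clusters remain.
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p].extend(clusters[q])
        del clusters[q]

    # Convert the cluster membership lists into one label per point.
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


points = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
labels = agglomerative(points, n_clusters=3)
print(labels)
```

The sketch rescans every pair of clusters on each merge, which is fine for a handful of points but far too slow for real datasets; library implementations avoid recomputing these distances.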

Parameters:

dataframe (pd.DataFrame) – The input dataset; must be provided when the object is created.
Clustering parameters (such as n_clusters) are passed to the run method.

Attributes:

df (pd.DataFrame) – The input data with ‘x’, ‘y’ coordinates and features.
labelsDF (pd.DataFrame) – DataFrame containing ‘x’, ‘y’, and assigned cluster labels.
startTime, endTime (float) – Variables to track clustering execution time.
memoryUSS, memoryRSS (float) – Memory usage of the clustering process in kilobytes, measured as Unique Set Size (USS) and Resident Set Size (RSS).

Execution methods

Calling from a Python program

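import pandas as pd

# Import the class from its module (geoanalytics.clustering.Agglomerative)
from geoanalytics.clustering.Agglomerative import Agglomerative

# Load the input data with 'x', 'y' coordinates and feature columns
df = pd.read_csv("input.csv")

ag = Agglomerative(df)

# Cluster into 4 groups; returns a DataFrame with 'x', 'y', and 'labels'
labels_df = ag.run(n_clusters=4)

# Print runtime and memory usage (USS and RSS)
ag.getRuntime()
ag.getMemoryUSS()
ag.getMemoryRSS()

# Write the labels to a CSV file
ag.save('AgglomerativeLabels.csv')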
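The example reads input.csv, which according to the Attributes section should hold 'x' and 'y' coordinates plus feature columns. The following sketch writes a small file in that layout using only the standard library. The feature column names `band1` and `band2` and all values are hypothetical, and the documentation does not say whether the wrapper expects a particular column order.

```python
import csv

# Hypothetical feature columns; only 'x' and 'y' are named in the documentation.
header = ["x", "y", "band1", "band2"]
rows = [
    [0, 0, 0.12, 0.80],
    [0, 1, 0.15, 0.78],
    [1, 0, 0.90, 0.10],
    [1, 1, 0.88, 0.14],
]

# newline="" lets the csv module control line endings on every platform.
with open("input.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(rows)
```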

Credits

This implementation was created by Raashika and revised by M. Charan Teja under the guidance of Professor Rage Uday Kiran.

getMemoryRSS()[source]: Prints the memory usage (RSS) of the process in kilobytes.

getMemoryUSS()[source]: Prints the memory usage (USS) of the process in kilobytes.

getRuntime()[source]: Prints the total runtime of the clustering algorithm.
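For readers who want to see how such measurements are commonly taken, here is a rough sketch of recording runtime and USS/RSS around a function call. It assumes the third-party psutil package and a bytes-to-kilobytes conversion by dividing by 1024; the source does not confirm either choice for this wrapper, and the helper name `track` is hypothetical.

```python
import time

try:
    import psutil  # third-party; assumed here, not confirmed by the source
except ImportError:
    psutil = None


def track(func, *args, **kwargs):
    """Run func and return (result, runtime_seconds, uss_kb, rss_kb)."""
    start = time.time()
    result = func(*args, **kwargs)
    end = time.time()

    # Memory figures stay None when psutil is not installed.
    uss_kb = rss_kb = None
    if psutil is not None:
        info = psutil.Process().memory_full_info()
        # psutil reports bytes; convert to kilobytes.
        uss_kb = info.uss / 1024
        rss_kb = info.rss / 1024
    return result, end - start, uss_kb, rss_kb


result, runtime, uss_kb, rss_kb = track(sum, range(1_000_000))
print(f"Runtime: {runtime:.4f} s")
print(f"USS: {uss_kb} KB, RSS: {rss_kb} KB")
```

RSS counts all memory resident in RAM for the process, including pages shared with other processes, while USS counts only the pages unique to it, so USS is usually the smaller figure.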

run(n_clusters=4)[source]

Executes the Agglomerative Clustering algorithm.

Parameters: n_clusters – int, number of clusters to form (default: 4)
Returns: labelsDF (pd.DataFrame) with columns [‘x’, ‘y’, ‘labels’]

save(outputFileLabels='AgglomerativeLabels.csv')[source]

Saves the clustering result with labels to a CSV file.

Parameters: outputFileLabels – str, filename for saving labels (default: ‘AgglomerativeLabels.csv’)
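Once the labels are saved, a quick check is to count how many points landed in each cluster. The sketch below does this with the standard library. It assumes the saved file has a header row with the documented columns 'x', 'y', and 'labels'; the exact layout written by save is not specified in the source, so a small stand-in file with a hypothetical name is created first.

```python
import csv
from collections import Counter

# Stand-in for a file produced by save(); the header row is an assumption
# based on the documented labelsDF columns.
path = "AgglomerativeLabels_example.csv"
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["x", "y", "labels"])
    writer.writerows([[0, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 2]])

# Count the points per cluster label (labels are read back as strings).
with open(path, newline="") as f:
    sizes = Counter(row["labels"] for row in csv.DictReader(f))

for label, count in sorted(sizes.items()):
    print(f"cluster {label}: {count} points")
```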