OpticsClustering

class geoanalytics.clustering.OpticsClustering.OpticsClustering(dataframe)[source]

Bases: object

About this algorithm

Description:

OPTICS (Ordering Points To Identify the Clustering Structure) is a density-based clustering algorithm. It builds a reachability graph, which gives every point a position in a cluster ordering and a reachability distance, and then extracts clusters from that ordering using reachability and density thresholds. It generalizes DBSCAN by considering a range of neighborhood radii rather than a single one, so it handles clusters of varying density better. This wrapper adds memory and runtime tracking and supports exporting the cluster labels to a CSV file.
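To make the reachability idea concrete, here is a minimal pure-Python sketch of the ordering step. It is a simplified illustration and not this wrapper's implementation: it assumes Euclidean distance, an unbounded neighborhood radius, a brute-force search, and at least min_samples input points. The function names `core_distance` and `optics_order` are hypothetical.

```python
import math


def core_distance(points, i, min_samples):
    # Distance from point i to its min_samples-th nearest neighbour,
    # counting the point itself (the convention assumed in this sketch).
    dists = sorted(math.dist(points[i], q) for q in points)
    return dists[min_samples - 1]


def optics_order(points, min_samples):
    # Returns (order, reachability) for a small list of coordinate tuples.
    n = len(points)
    reach = [math.inf] * n      # reachability distance per point
    processed = [False] * n
    order = []
    current = 0                 # start from the first point
    while current is not None:
        processed[current] = True
        order.append(current)
        cd = core_distance(points, current, min_samples)
        # Update the reachability of every point not yet visited.
        for o in range(n):
            if not processed[o]:
                new_reach = max(cd, math.dist(points[current], points[o]))
                reach[o] = min(reach[o], new_reach)
        # Visit next the unprocessed point that is cheapest to reach.
        remaining = [o for o in range(n) if not processed[o]]
        current = min(remaining, key=lambda o: reach[o]) if remaining else None
    return order, reach


points = [(0, 0), (0, 1), (1, 0), (10, 10)]
order, reach = optics_order(points, min_samples=3)
```

Points in a dense region end up adjacent in `order` with small reachability values, while an isolated point such as (10, 10) receives a large one. Clusters correspond to the valleys in that sequence.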

Parameters:
  • dataframe (pandas DataFrame) – the dataset; must be provided when the object is created.

  • Clustering hyperparameters (min_samples, eps) are passed to the run method.

Attributes:
  • df (pd.DataFrame) – The input data with ‘x’, ‘y’ coordinates and features.

  • labelsDF (pd.DataFrame) – DataFrame containing ‘x’, ‘y’, and assigned cluster labels.

  • startTime, endTime (float) – Variables to track clustering execution time.

  • memoryUSS, memoryRSS (float) – Memory usage of the clustering process in kilobytes.
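The source does not say how these values are collected. As a rough sketch of one way to track them, the snippet below times a call and reads USS and RSS through the third-party `psutil` package, which is an assumption here; the helper name `measure` is hypothetical. If `psutil` is not installed, the memory figures are returned as None.

```python
import os
import time

try:
    import psutil  # third-party; assumed for this sketch, not confirmed by the source
except ImportError:
    psutil = None


def measure(func, *args, **kwargs):
    """Run func and return (result, runtime_seconds, uss_kb, rss_kb)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    runtime = time.perf_counter() - start

    uss_kb = rss_kb = None
    if psutil is not None:
        proc = psutil.Process(os.getpid())
        # psutil reports bytes; divide by 1024 to get kilobytes.
        uss_kb = proc.memory_full_info().uss / 1024
        rss_kb = proc.memory_info().rss / 1024
    return result, runtime, uss_kb, rss_kb


result, runtime, uss_kb, rss_kb = measure(sum, [1, 2, 3])
```

Here `sum` stands in for the clustering call; any callable can be passed in its place.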

Execution methods

Calling from a Python program

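import pandas as pd

# Import the class from its module, matching the documented class path
from geoanalytics.clustering.OpticsClustering import OpticsClustering

# The input should contain 'x' and 'y' coordinate columns plus feature columns
df = pd.read_csv("input.csv")

optics = OpticsClustering(df)

# Returns a DataFrame with columns ['x', 'y', 'labels']
labels_df = optics.run(min_samples=5, eps=None)

# Each of these prints its measurement
optics.getRuntime()
optics.getMemoryUSS()
optics.getMemoryRSS()

# Write the labelled result to a CSV file
optics.save('OpticsLabels.csv')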
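The example reads its data from input.csv. For a quick trial, a file of the expected shape could be generated with the standard library as sketched below. The sample values and the column name `feature1` are made up for illustration. The source states only that the data carries 'x', 'y' and feature columns, not which columns the clustering actually uses.

```python
import csv

# Hypothetical sample rows: 'x' and 'y' are coordinates, 'feature1' is an
# illustrative feature column (its name is an assumption).
rows = [
    {"x": 0.0, "y": 0.0, "feature1": 1.2},
    {"x": 0.0, "y": 1.0, "feature1": 1.1},
    {"x": 1.0, "y": 0.0, "feature1": 1.3},
    {"x": 10.0, "y": 10.0, "feature1": 7.9},
]

# newline="" lets the csv module control line endings itself.
with open("input.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["x", "y", "feature1"])
    writer.writeheader()
    writer.writerows(rows)
```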

Credits

This implementation was created by Raashika and revised by M.Charan Teja under the guidance of Professor Rage Uday Kiran.

getMemoryRSS()[source]

Prints the resident set size (RSS) of the process in kilobytes, that is, the physical memory it currently holds, including pages shared with other processes.

getMemoryUSS()[source]

Prints the unique set size (USS) of the process in kilobytes, that is, the memory private to the process, which would be freed if it terminated.

getRuntime()[source]

Prints the total runtime of the clustering algorithm.

run(min_samples=5, eps=None)[source]

Executes the OPTICS clustering algorithm.

Parameters:
  • min_samples – int, minimum number of samples in a neighborhood for a point to count as a core point (default: 5)

  • eps – float or None, maximum distance between two samples for them to be treated as neighbors (default: None)

Returns:

labelsDF (pd.DataFrame) with columns [‘x’, ‘y’, ‘labels’]
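How this wrapper applies eps internally is not described here. One common way a distance threshold is applied to OPTICS output is DBSCAN-style extraction from the cluster ordering, sketched below under that assumption. The function name `extract_dbscan_labels` and the input values are hypothetical, and the label -1 for noise is a convention assumed for this sketch.

```python
import math


def extract_dbscan_labels(ordering, reachability, core_distances, eps):
    # Walk the points in cluster order. A point that cannot be reached
    # within eps either starts a new cluster (if it is a core point at
    # eps) or is noise; a reachable point joins the current cluster.
    labels = [-1] * len(ordering)
    cluster_id = -1
    for idx in ordering:
        if reachability[idx] > eps:
            if core_distances[idx] <= eps:
                cluster_id += 1
                labels[idx] = cluster_id
            else:
                labels[idx] = -1  # noise
        else:
            labels[idx] = cluster_id
    return labels


# Illustrative inputs for four points, three close together and one far away.
ordering = [0, 1, 2, 3]
reachability = [math.inf, 1.0, 1.0, 13.45]
core_distances = [1.0, 1.41, 1.41, 13.45]

labels = extract_dbscan_labels(ordering, reachability, core_distances, eps=2.0)
```

With eps=2.0 the three nearby points form one cluster and the distant point is marked as noise. A larger eps would merge more of the ordering into a single cluster.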

save(outputFileLabels='OpticsLabels.csv')[source]

Saves the clustering result with labels to a CSV file.

Parameters:
  • outputFileLabels – str, output filename (default: ‘OpticsLabels.csv’)
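Once the labels are saved, the file can be summarized without rerunning the clustering. The sketch below counts points per cluster using only the standard library. It assumes the file has a header row with a 'labels' column holding integer labels, and that -1 marks noise; neither is confirmed by the source. It writes a small stand-in file first so that it runs on its own, under a hypothetical filename.

```python
import csv
from collections import Counter

path = "OpticsLabels_sample.csv"  # hypothetical stand-in for the saved file

# Made-up contents for illustration only.
with open(path, "w", newline="") as f:
    f.write("x,y,labels\n0,0,0\n0,1,0\n1,0,0\n10,10,-1\n")

# Count how many points were assigned to each label.
with open(path, newline="") as f:
    counts = Counter(int(row["labels"]) for row in csv.DictReader(f))

# Separate the noise points (assumed label -1) from the real clusters.
noise = counts.pop(-1, 0)
```

After this runs, `counts` maps each cluster label to its size and `noise` holds the number of unclustered points.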