OverlapScore
- class geoanalytics.scoreCalculator.OverlapScore.OverlapScore(TrainDF, TopkDF, startBandTrainDF=2, startBandTopkDF=2)
Bases: object
About this algorithm
- Description:
OverlapScore quantifies the cluster overlap between two datasets using KMeans clustering. It helps evaluate how well a top-k retrieved set aligns with the training dataset in the embedding space, by checking the agreement of cluster assignments.
- Parameters:
TrainDF (pd.DataFrame): The original training dataset.
TopkDF (pd.DataFrame): The retrieved top-k dataset.
startBandTrainDF (int): Column index from which to start using features in TrainDF (default: 2).
startBandTopkDF (int): Column index from which to start using features in TopkDF (default: 2).
- Attributes:
TrainDF (np.ndarray) – Sliced feature matrix from the training dataset.
TopkDF (np.ndarray) – Sliced feature matrix from the top-k dataset.
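The column slicing described by the startBand parameters can be sketched as follows. This is only an illustration, assuming the class keeps every column from the given index onward and converts the result to a NumPy array. The helper name slice_features and the column names (x, y, band1, band2) are hypothetical and do not come from the library.

```python
import pandas as pd


def slice_features(df, start_band=2):
    """Hypothetical helper: keep the columns from `start_band` onward as an array."""
    # iloc[:, start_band:] selects all rows and every column at or after start_band.
    return df.iloc[:, start_band:].to_numpy()


# Toy frame: two leading non-feature columns followed by two feature columns.
train_df = pd.DataFrame({
    "x": [78.1, 78.2],
    "y": [17.3, 17.4],
    "band1": [0.12, 0.15],
    "band2": [0.33, 0.31],
})

features = slice_features(train_df, start_band=2)
print(features.shape)  # (2, 2): two rows, two feature columns
```

With the default of 2, the first two columns are skipped, which suits files whose leading columns hold identifiers or coordinates rather than features.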
Execution methods
Calling from a Python program
import pandas as pd
from geoanalytics.scoreCalculator import OverlapScore

train_df = pd.read_csv("train.csv")
topk_df = pd.read_csv("topk.csv")

overlap = OverlapScore(train_df, topk_df, startBandTrainDF=2, startBandTopkDF=2)
score = overlap.run(n_clusters=3)
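The exact formula behind run() is not given on this page. As a rough illustration of what a cluster-overlap score could look like, the sketch below fits a small k-means on the training points, assigns the top-k points to the nearest training centroid, and compares the two cluster-membership distributions by histogram intersection. The metric, the deterministic centroid initialisation, and all function names here are assumptions made for the example, not the library's implementation. It is written in plain Python to stay dependency-free.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, n_clusters, n_iter=100):
    """Minimal Lloyd's algorithm; returns the list of centroids."""
    # Assumption: deterministic init from evenly spaced rows, to keep the
    # example reproducible. Production KMeans uses better initialisation.
    centroids = [list(points[i * len(points) // n_clusters])
                 for i in range(n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in range(n_clusters)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Mean of each group; an empty group keeps its previous centroid.
        new_centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def overlap_score(train, topk, n_clusters=3):
    """Histogram intersection of cluster proportions, in [0, 1] (assumed metric)."""
    centroids = kmeans(train, n_clusters)
    train_counts = [0] * n_clusters
    topk_counts = [0] * n_clusters
    for p in train:
        train_counts[nearest(p, centroids)] += 1
    for p in topk:
        topk_counts[nearest(p, centroids)] += 1
    n, m = len(train), len(topk)
    # sum_c min(train_c / n, topk_c / m), computed with one final division.
    shared = sum(min(t * m, k * n) for t, k in zip(train_counts, topk_counts))
    return shared / (n * m)


# Three well-separated groups of three points each.
train = [[0, 0], [0, 1], [1, 0],
         [10, 10], [10, 11], [11, 10],
         [20, 0], [20, 1], [21, 0]]

balanced_topk = [[0, 1], [10, 11], [21, 0]]   # one point per group
skewed_topk = [[0, 0], [1, 0]]                # all from the first group

print(overlap_score(train, balanced_topk))    # 1.0
print(overlap_score(train, skewed_topk))      # about 0.333
```

Under this reading, a score near 1 means the retrieved set is spread across the training clusters in the same proportions as the training data, while a low score means it concentrates in only a few of them.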
Credits
This implementation was created by Raashika and revised by M. Charan Teja under the guidance of Professor Rage Uday Kiran.