Bug#1053490: ITP: python-hdbscan -- Clustering based on density with variable density clusters
Package: wnpp
Severity: wishlist
Owner: Edward Betts <edward@4angle.com>
X-Debbugs-Cc: debian-devel@lists.debian.org, debian-python@lists.debian.org
* Package name    : python-hdbscan
  Version         : 0.8.33
  Upstream Author : Leland McInnes <leland.mcinnes@gmail.com>
* URL             : https://github.com/scikit-learn-contrib/hdbscan
* License         : BSD-3-clause
  Programming Lang: Python
  Description     : Clustering based on density with variable density clusters
HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with
Noise) is a density-based clustering algorithm. Unlike methods that apply a
single global density threshold, such as DBSCAN, HDBSCAN can identify clusters
of varying densities, which makes it suitable for datasets where those methods
struggle.
.
HDBSCAN operates by performing DBSCAN clustering over a range of epsilon
values and then integrating the results to find the clustering that is most
stable across that range. It can determine clusters with little or no
parameter tuning. The primary parameter, minimum cluster size, is intuitive
and straightforward to select, which makes HDBSCAN well suited to exploratory
data analysis.
.
Key Features:
- Robust to parameter selection: HDBSCAN returns meaningful clusters with
  minimal parameter tuning.
- Support for varying densities: It can find clusters of varying densities,
  unlike DBSCAN.
- High performance: HDBSCAN is significantly faster than many clustering
  algorithms, making it suitable for large datasets.
- Comprehensive documentation: Tutorials and documentation are available on
  ReadTheDocs, making it easy to get started.

I plan to maintain this package as part of the Python team.
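
For readers unfamiliar with the idea, here is a minimal sketch of the "DBSCAN
over a range of epsilon values" description above. It is a toy pure-Python
DBSCAN using only the standard library, not the hdbscan package's
implementation or API. The function name dbscan, the sample points and the
epsilon values are all made up for illustration. It only shows how the
labelling changes as epsilon grows; the stability-based selection that HDBSCAN
performs over such a sweep is not implemented here.

```python
from math import dist  # Python 3.8+


def dbscan(points, eps, min_samples):
    """Toy DBSCAN: return one label per point, with -1 meaning noise."""
    n = len(points)
    # Neighbourhood of each point, including the point itself.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_samples for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        # Grow a new cluster outwards from an unlabelled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not core[p]:
                continue  # border points join a cluster but do not extend it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


# Hypothetical data: a dense group, a sparser group, and one outlier.
POINTS = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),   # dense
    (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0),   # sparse
    (20.0, 20.0),                                     # outlier
]

# Sweep epsilon: clusters appear, persist, and eventually merge.
for eps in (0.2, 1.5, 8.0):
    print(eps, dbscan(POINTS, eps, min_samples=3))
```

With these made-up points, the smallest epsilon only recovers the dense group,
the middle value recovers both groups, and the largest merges them into one.
No single epsilon is obviously "right", which is the motivation for looking at
the whole range.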