DADApy is a Python package for the characterization of manifolds in high-dimensional spaces.

Homepage

For more details and tutorials, visit the homepage at: https://dadapy.readthedocs.io/

Quick Example

import numpy as np
from dadapy.data import Data

# generate a simple 3D Gaussian dataset
X = np.random.normal(0, 1, (1000, 3))

# initialize the "Data" class with the set of coordinates
data = Data(X)

# compute distances up to the 100th nearest neighbor
data.compute_distances(maxk=100)

# compute the intrinsic dimension using the 2NN estimator
id_twonn, id_error, id_distance = data.compute_id_2NN()

# compute the intrinsic dimension up to the 64th nearest neighbor using Gride
id_gride_list, id_error_list, id_distance_list = data.return_id_scaling_gride(range_max=64)

# compute the density using PAk, a point adaptive kNN estimator
log_den, log_den_error = data.compute_density_PAk()

# find the peaks of the density profile through the ADP algorithm
cluster_assignment = data.compute_clustering_ADP()

# compute the neighborhood overlap with another dataset
X2 = np.random.normal(0, 1, (1000, 5))
overlap_x2 = data.return_data_overlap(X2)

# compute the information imbalance with another dataset
ii_x2 = data.return_information_imbalance(X2)

# compute the neighborhood overlap with a set of labels
labels = np.repeat(np.arange(10), 100)
overlap_labels = data.return_label_overlap(labels, k=10)
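
To make the information imbalance used near the end of the example above more concrete, here is a minimal NumPy sketch of the quantity as defined in Glielmo et al. (see the algorithm list): for each point, take its nearest neighbor in space A, look up the rank of that neighbor in space B, and average, scaled by 2/N. This is an illustration under that definition, not DADApy's implementation; the helper names `neighbour_ranks` and `information_imbalance` are hypothetical and the brute-force distance matrix only suits small datasets.

```python
import numpy as np


def neighbour_ranks(X):
    """rank[i, j] = position of point j in the distance-sorted neighbour list of i."""
    # Brute-force pairwise squared distances: O(N^2) memory, fine for small N.
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, -1.0)  # force each point to rank 0 with respect to itself
    # Applying argsort twice turns each row of distances into ranks.
    return dist.argsort(axis=1).argsort(axis=1)


def information_imbalance(X_a, X_b):
    """Delta(A -> B): 2/N times the mean rank in B of each point's nearest neighbour in A."""
    n = X_a.shape[0]
    ranks_a = neighbour_ranks(X_a)
    ranks_b = neighbour_ranks(X_b)
    nn_in_a = np.argmax(ranks_a == 1, axis=1)  # nearest neighbour of each point in A
    return 2.0 / n * ranks_b[np.arange(n), nn_in_a].mean()


rng = np.random.default_rng(0)
X_full = rng.normal(size=(500, 3))
X_part = X_full[:, :1]  # keep only the first coordinate

# The full space predicts neighbours in the reduced space better than the reverse,
# so the first value is expected to be the smaller of the two.
print(information_imbalance(X_full, X_part), information_imbalance(X_part, X_full))
```

Values close to 0 indicate that space A predicts the neighborhoods of space B well, while values close to 1 indicate that it carries little information about them.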

The Data class is just a container of classes. If you only need a specific module,
you can equivalently import it directly.

import numpy as np
from dadapy import IdEstimation

# generate a simple 3D Gaussian dataset
X = np.random.normal(0, 1, (1000, 3))

# initialize the "IdEstimation" class with the set of coordinates
ie = IdEstimation(X)

# compute the intrinsic dimension up to the 64th nearest neighbor using Gride
id_list, id_error_list, id_distance_list = ie.return_id_scaling_gride(range_max=64)

Importing a module directly also makes it more natural to work with data comparison methods.

import numpy as np
from dadapy import NeighborhoodOverlap

# generate two Gaussian datasets (3D and 5D) and a set of labels
X = np.random.normal(0, 1, (1000, 3))
X2 = np.random.normal(0, 1, (1000, 5))
labels = np.repeat(np.arange(10), 100)

# compute the neighborhood overlap with another dataset
no = NeighborhoodOverlap(X, X2)
overlap_x2 = no.return_data_overlap()

# compute the neighborhood overlap with a set of labels
no = NeighborhoodOverlap(X, labels=labels)
overlap_labels = no.return_label_overlap(k=10)
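
For readers unfamiliar with the neighborhood overlap, the sketch below shows what the two quantities above measure, following the definition in Doimo et al. (see the algorithm list): the average fraction of k nearest neighbors shared by two representations, and the average fraction of k nearest neighbors that carry the same label as the point. It is a brute-force NumPy illustration, not DADApy's implementation; the function names `knn_indices`, `data_overlap`, and `label_overlap` are hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbours of each row of X (the point itself excluded)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def data_overlap(X_a, X_b, k=10):
    """Mean fraction of k nearest neighbours shared between two representations."""
    nn_a = knn_indices(X_a, k)
    nn_b = knn_indices(X_b, k)
    shared = [len(np.intersect1d(a, b)) for a, b in zip(nn_a, nn_b)]
    return float(np.mean(shared)) / k


def label_overlap(X, labels, k=10):
    """Mean fraction of k nearest neighbours that carry the same label as the point."""
    nn = knn_indices(X, k)
    same = labels[nn] == labels[:, None]
    return float(same.mean())


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
X2 = rng.normal(size=(300, 5))
labels = np.repeat(np.arange(10), 30)

print(data_overlap(X, X2, k=10))       # independent datasets share few neighbours
print(label_overlap(X, labels, k=10))  # labels unrelated to X give a low overlap
```

Both quantities lie between 0 and 1, and comparing a dataset with itself gives an overlap of exactly 1.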

Currently implemented algorithms

Intrinsic dimension estimators
- Two-NN estimator
  Facco et al., Scientific Reports (2017)
- Gride estimator
  Denti et al., Scientific Reports (2022)
- I3D estimator (for both continuous and discrete spaces)
  Macocco et al., Physical Review Letters (2023)
- BID estimator
  Acevedo et al., Nature Communications Physics (2025)

Density estimators
- kNN estimator
- k*NN estimator (kNN with an adaptive choice of k)
- PAk estimator
  Rodriguez et al., JCTC (2018)
- point-adaptive mean-shift gradient estimator
  Carli et al., arXiv (2024)
- BMTI estimator
  Carli et al., arXiv (2024)

Density peaks clustering methods
- Density peaks clustering
  Rodriguez and Laio, Science (2014)
- Advanced density peaks clustering
  d’Errico et al., Information Sciences (2021)
- k-peak clustering
  Sormani, Rodriguez and Laio, JCTC (2020)

Manifold comparison tools
- Neighbourhood overlap
  Doimo et al., NeurIPS (2020)
- Information imbalance
  Glielmo et al., PNAS Nexus (2022)

Feature selection and weighting tool
- Differentiable Information Imbalance
  Wild et al., Nature Communications (2025)

Causal analysis tools
- Imbalance Gain
  Del Tatto et al., PNAS (2024)
- Community causal graph
  Allione et al., arXiv (2025)
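
As an illustration of the first estimator in the list above, the Two-NN idea can be sketched in a few lines of NumPy. The sketch uses the maximum-likelihood form of the estimator, d = N / sum(log(mu_i)), where mu_i is the ratio between the distances from point i to its second and first nearest neighbors. This is an assumption made for brevity and not DADApy's implementation, which may fit the distribution of mu differently; the function name `two_nn_mle` is hypothetical.

```python
import numpy as np


def two_nn_mle(X):
    """Maximum-likelihood Two-NN estimate of the intrinsic dimension of X."""
    # Brute-force pairwise Euclidean distances: O(N^2) memory, fine for small N.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore the zero distance of a point to itself
    # After partitioning at index 1, columns 0 and 1 hold the two smallest distances.
    nearest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = nearest[:, 0], nearest[:, 1]
    mu = r2 / r1
    # Under the Two-NN model mu is Pareto distributed with exponent d,
    # whose maximum-likelihood estimate is N / sum(log(mu)).
    return len(mu) / np.log(mu).sum()


rng = np.random.default_rng(0)

# A 3D Gaussian cloud: the estimate is expected to be close to 3.
X = rng.normal(size=(1000, 3))
print(two_nn_mle(X))

# A 2D Gaussian sheet padded with zeros into 5D: the estimate is expected to be close to 2.
X_embedded = np.zeros((1000, 5))
X_embedded[:, :2] = rng.normal(size=(1000, 2))
print(two_nn_mle(X_embedded))
```

Because only the two nearest neighbors of each point enter the estimate, the result reflects the dimension at the smallest length scale, which is why noisy data can benefit from scale-dependent estimators such as Gride.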

Installation

The package is compatible with Python 3.10, 3.11, 3.12, 3.13, and 3.14. Only Unix-based systems, including Linux and macOS, are currently supported. On Windows machines, we suggest using the Windows Subsystem for Linux (WSL).
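
If you want to check an environment against these constraints before installing, a small standard-library sketch such as the following may help. The function name `environment_supported` is hypothetical, and treating every non-Windows system as Unix-based is a simplifying assumption.

```python
import platform
import sys

SUPPORTED_MINORS = range(10, 15)  # Python 3.10 through 3.14, as listed above


def environment_supported(version_info=None, system=None):
    """Return True if the Python version and operating system match the stated support."""
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system
    python_ok = version_info[0] == 3 and version_info[1] in SUPPORTED_MINORS
    # Assumption: anything that is not native Windows counts as Unix-based.
    # Inside WSL, platform.system() reports "Linux".
    os_ok = system != "Windows"
    return python_ok and os_ok


print(environment_supported())
```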

The package requires numpy, scipy, scikit-learn, jax, and jaxlib, plus matplotlib for the visualizations.
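
To see which of these dependencies are already present in an environment, a short check along the following lines can be used. It relies only on the standard library; the function name `missing_dependencies` is hypothetical, and note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above (scikit-learn is imported as "sklearn").
REQUIRED = ["numpy", "scipy", "sklearn", "jax", "jaxlib", "matplotlib"]


def missing_dependencies(modules=REQUIRED):
    """Return the import names that cannot be found in the current environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in modules if find_spec(name) is None]


print(missing_dependencies())
```

pip normally resolves these dependencies automatically, so the check is mainly useful for diagnosing a broken or partially installed environment.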

The package contains Cython-generated C extensions that are automatically compiled during installation.

The latest release is available through pip:

pip install dadapy

To install the latest development version, install it directly from the GitHub source with pip:

pip install git+https://github.com/sissa-data-science/DADApy

Alternatively, if you'd like to modify the implementation of some functions locally, you can clone the repository and install the package with:

git clone https://github.com/sissa-data-science/DADApy.git
cd DADApy
python setup.py build_ext --inplace
pip install .

The methods of the classes DiffImbalance and CausalGraph can be run on a GPU, using a suitable installation of JAX on a GPU platform. The code has been tested using JAX v0.4.30 with CUDA 12, which can be installed with:

pip install --upgrade "jax[cuda12_pip]==0.4.30" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

For more information on installing the JAX library on GPUs, see the official JAX repository.
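
After installing JAX, you can check which devices it detects with a snippet like the one below. The function name `available_jax_devices` is hypothetical; the sketch returns an empty list when JAX is not installed, and a GPU-enabled installation is expected to report a platform other than "cpu".

```python
def available_jax_devices():
    """Return the platform name of each device JAX can see, or [] if JAX is not installed."""
    try:
        import jax
    except ImportError:
        return []
    # Each device object exposes the name of the backend it belongs to.
    return [device.platform for device in jax.devices()]


print(available_jax_devices())
```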

Citing DADApy

A description of the package is available in the Patterns article cited below (https://doi.org/10.1016/j.patter.2022.100589).

Please consider citing it if you find this package useful for your research:

@article{dadapy,
    title = {DADApy: Distance-based analysis of data-manifolds in Python},
    journal = {Patterns},
    pages = {100589},
    year = {2022},
    issn = {2666-3899},
    doi = {https://doi.org/10.1016/j.patter.2022.100589},
    url = {https://www.sciencedirect.com/science/article/pii/S2666389922002070},
    author = {Aldo Glielmo and Iuri Macocco and Diego Doimo and Matteo Carli and Claudio Zeni and Romina Wild and Maria d’Errico and Alex Rodriguez and Alessandro Laio},
    }
